// PERSON

Jianke Zhang

ROLE RESEARCHER·MENTIONS 1·LAST SEEN APRIL 21, 2026

// BIO

Jianke Zhang is a researcher at the Institute for Interdisciplinary Information Sciences (IIIS) at Tsinghua University, where he works on robot learning, vision-language models, and multimodal learning. He is best known for his work on Vision-Language-Action models, including UP-VLA, a unified understanding and prediction model for embodied agents, and VLM4VLA, an empirical framework for benchmarking vision-language models as backbones for robotic policies. His research has been published at major machine learning and robotics conferences including ICML, NeurIPS, CoRL, and ICLR.

// RECENT MENTIONS

// SIGNALS

1 SIGNAL

mention·arXiv Physical AI·APRIL 21, 2026

“we initialize a VLA from the resulting VLMs and fine-tune it following the VLA training pipeline of VLM4VLA (Section 5.1). Understanding EmbodiedMidtrain requires understanding VLM4VLA — they are tightly coupled.”

AI-extracted from podcast / newsletter / paper summaries. May contain errors.