Wenyao Zhang
Wenyao Zhang is a final-year PhD student in the joint program between Shanghai Jiao Tong University and Eastern Institute of Technology, Ningbo, supervised by Wenjun Zeng and Xiaokang Yang. His research focuses on robot learning, representation learning, and multimodal large language models, with recent work including the DeFI framework for disentangled robot learning and DreamVLA for vision-language-action modeling. He has published at top-tier venues including ICLR, NeurIPS, and ECCV, and is currently a research intern at GalBot.
“DeFI achieves an average task length of 4.51 on CALVIN ABC-D (the hardest generalization split, training on ABC, testing on unseen environment D), beating the prior SOTA VPP at 4.33 (+4.2%), Physical Intelligence's π0 at 3.84, and GR00T N1 at 4.01.”
“This decoupled pretraining paradigm unleashes the potential of massive action-free videos for policy learning, while retaining robot-specific action grounding.”
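The relative gain quoted for CALVIN ABC-D can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the average task lengths stated in the quote above, and the helper name `relative_gain_percent` and the `baselines` dictionary are hypothetical, not part of any DeFI code release.

```python
def relative_gain_percent(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Average task lengths on CALVIN ABC-D, as stated in the quote above.
defi = 4.51
baselines = {"VPP": 4.33, "pi0": 3.84, "GR00T N1": 4.01}

# Relative gain of DeFI over each baseline, rounded to one decimal place.
gains = {name: round(relative_gain_percent(defi, score), 1)
         for name, score in baselines.items()}

for name, gain in gains.items():
    print(f"DeFI vs {name}: +{gain}%")
```

The VPP comparison gives (4.51 - 4.33) / 4.33 ≈ 4.2%, which matches the figure in the quote. The gains over the other two baselines are derived here from the quoted scores and are not stated in the source.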
AI-extracted from podcast / newsletter / paper summaries. May contain errors.