Kim et al.
Moo Jin Kim is a final-year Ph.D. candidate in computer science at Stanford University, advised by Chelsea Finn and Percy Liang, with a research affiliation at NVIDIA. He works at the intersection of robotics and large-scale generative models, focusing on end-to-end vision-based robotic manipulation and vision-language-action models. He is best known as the lead author of Cosmos-Policy, a framework that fine-tunes video diffusion models for visuomotor control, achieving state-of-the-art results including 82.2% on LIBERO-Plus, and as a co-author of OpenVLA, an open-source vision-language-action model.
“Cosmos-Policy reaches 82.2% on LIBERO-Plus (single-arm manipulation).”
Source: AI-extracted from podcast / newsletter / paper summaries. May contain errors.