Kefan Gu
Kefan Gu is a researcher at Nanjing University who collaborated with Dexmal, a Shenzhen-based embodied AI startup. He is best known as a co-first author of Realtime-VLA FLASH, a speculative inference framework that speeds up diffusion-based Vision-Language-Action (VLA) models by roughly 3x at inference time without retraining the main model. The speedup comes from pairing a lightweight draft model with parallel verification by the main model's Action Expert. His broader research focuses on VLA models for robotic control, including work on human intention reasoning and masked diffusion approaches to robotic manipulation.
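As a rough illustration of the draft-then-verify pattern mentioned above, the following is a minimal toy sketch of generic speculative inference, not the FLASH implementation. The scalar actions, the tolerance-based acceptance rule, and every name (`speculative_rollout`, `toy_draft`, `toy_verify`) are assumptions made for illustration; the real system works on diffusion-based action chunks and may accept or reject drafts quite differently.

```python
from typing import Callable, List, Sequence, Tuple

Action = float  # assumption: scalar actions; real robot actions are vectors


def speculative_rollout(
    draft_fn: Callable[[Sequence[Action], int], List[Action]],
    verify_fn: Callable[[Sequence[Action], Sequence[Action]], List[Action]],
    horizon: int,
    chunk_len: int,
    tol: float,
) -> Tuple[List[Action], int]:
    """Generate `horizon` actions and count calls to the expensive model."""
    actions: List[Action] = []
    main_calls = 0
    while len(actions) < horizon:
        k = min(chunk_len, horizon - len(actions))
        proposed = draft_fn(actions, k)            # cheap guess for k steps
        reference = verify_fn(actions, proposed)   # one batched check of all k
        main_calls += 1
        for drafted, checked in zip(proposed, reference):
            if abs(drafted - checked) <= tol:
                actions.append(drafted)            # draft accepted as-is
            else:
                actions.append(checked)            # take the correction and
                break                              # discard the rest of the draft
    return actions, main_calls


def toy_verify(prefix: Sequence[Action], proposed: Sequence[Action]) -> List[Action]:
    """Hypothetical 'main model': the correct action at step t is 0.1 * t."""
    start = len(prefix)
    return [0.1 * (start + i) for i in range(len(proposed))]


def toy_draft(prefix: Sequence[Action], k: int) -> List[Action]:
    """Hypothetical draft model: correct except on every fourth step."""
    start = len(prefix)
    return [0.1 * t + (0.5 if t % 4 == 3 else 0.0) for t in range(start, start + k)]


if __name__ == "__main__":
    actions, main_calls = speculative_rollout(
        toy_draft, toy_verify, horizon=8, chunk_len=4, tol=0.05
    )
    print(len(actions), "actions from", main_calls, "main-model calls")
```

In this toy setup the expensive model is called twice to produce eight actions, because each call confirms three drafted actions and corrects a fourth. A draft that is always wrong degrades to one call per action, which is why the quality of the draft model matters for any speedup.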
“FLASH is a software-level framework that makes the same model run 3x faster at inference time, with almost no degradation in task success — no retraining the main model, no new hardware.”
Source→“Kefan Gu — Nanjing University / Dexmal; Co-first author. Contributed to the draft model architecture and verification design.”
Source→AI-extracted from podcast / newsletter / paper summaries. May contain errors.