Teahose.
// TOPIC

Vision-Language Models

90 ITEMS ACROSS PODCASTS · NEWSLETTERS · PAPERS
// TAGGED
90 ITEMS
01
PAPR · ARXIV PHYSICAL AI · JUN 17, 2026
Object-Centric Residual RL for Zero-Shot Sim-to-Real VLA Enhancement
KINAM KIM, YASUYUKI MATSUSHITA, ET AL. (ARXIV PHYSICAL AI)
02
POD · AI + A16Z · JUN 15, 2026
Ideogram’s Open-Weights Image Model and the Future of AI Design
JUSTINE MOORE, MOHAMMAD NOROUZI, YOKO LI
03
POD · THE A16Z SHOW · JUN 15, 2026
AI, Design, and the Power of Open Models
A16Z PODCAST HOST, JUSTINE MOORE, MOHAMMAD NOROUZI, YOKO LI
04
PAPR · ARXIV PHYSICAL AI · JUN 15, 2026
What Matters in Orchestrating Robot Policies: A Systematic Study of Hierarchical VLA Agents
JIAHENG HU, ANNIE XIE, ET AL. (ARXIV PHYSICAL AI)
05
PAPR · ARXIV PHYSICAL AI · JUN 12, 2026
Hy-Embodied-0.5-VLA: From Vision-Language-Action Models to a Real-World Robot Learning Stack
HE ZHANG, ZHENGYOU ZHANG, ET AL. (ARXIV PHYSICAL AI)
06
PAPR · ARXIV PHYSICAL AI · JUN 11, 2026
Improving Robotic Generalist Policies via Flow Reversal Steering
ANDY TANG, SERGEY LEVINE, ET AL. (ARXIV PHYSICAL AI)
07
PAPR · ARXIV PHYSICAL AI · JUN 10, 2026
CHORUS: Decentralized Multi-Embodiment Collaboration with One VLA Policy
RIA DOSHI, JEANNETTE BOHG, ET AL. (ARXIV PHYSICAL AI)
08
PAPR · ARXIV PHYSICAL AI · JUN 5, 2026
LARA: Latent Action Representation Alignment for Vision-Language-Action Models
MENGYA LIU, SIYUAN HUANG, ET AL. (ARXIV PHYSICAL AI)
09
NEWS · AXIOS AI+ · JUN 4, 2026
🌈 GLAAD's AI warning
AXIOS AI+
10
PAPR · ARXIV PHYSICAL AI · JUN 4, 2026
OneVLA: A Unified Framework for Embodied Tasks
LINGFENG ZHANG, WENBO DING, ET AL. (ARXIV PHYSICAL AI)
11
PAPR · ARXIV PHYSICAL AI · JUN 4, 2026
ProGAL-VLA: Grounded Alignment through Prospective Reasoning in Vision-Language-Action Models
NASTARAN DARABI, A. TRIVEDI
12
PAPR · ARXIV PHYSICAL AI · JUN 3, 2026
FlowPRO: Reward-Free Reinforced Fine-Tuning of Flow-Matching VLAs via Proximalized Preference Optimization
YIHAO WU, ZHENGYOU ZHANG, ET AL. (ARXIV PHYSICAL AI)
13
PAPR · ARXIV PHYSICAL AI · JUN 3, 2026
GRAIL: Generating Humanoid Loco-Manipulation from 3D Assets and Video Priors
TIANYI XIE, YE YUAN, ET AL. (ARXIV PHYSICAL AI)
14
POD · TRAINING DATA · JUN 2, 2026
Knowing what your customers want, all the time: Listen Labs' Alfred Wahlforss
ALFRED WAHLFORSS, KONSTANTIN
15
PAPR · ARXIV PHYSICAL AI · JUN 2, 2026
ElegantVLA: Learning When to Think for Efficient Vision-Language-Action Models
YE LI, ZHI WANG, ET AL. (ARXIV PHYSICAL AI)
16
PAPR · ARXIV PHYSICAL AI · JUN 1, 2026
Colosseum V2: Benchmarking Generalization for Vision Language Action Models
JEREMY MORGAN, ISHIKA SINGH, ET AL. (ARXIV PHYSICAL AI)
17
PAPR · ARXIV PHYSICAL AI · JUN 1, 2026
VLA-Pro: Cross-Task Procedural Memory Transfer for Vision-Language-Action Models
SHENGYUN SI, YU-GANG JIANG, ET AL. (ARXIV PHYSICAL AI)
18
PAPR · ARXIV PHYSICAL AI · MAY 28, 2026
EXPO-FT: Sample-Efficient Reinforcement Learning Finetuning for Vision-Language-Action Models
PERRY DONG, CHELSEA FINN, ET AL. (ARXIV PHYSICAL AI)
19
PAPR · ARXIV PHYSICAL AI · MAY 28, 2026
FineVLA: Fine-Grained Instruction Alignment for Steerable Vision-Language-Action Policies
XINTONG HU, TAO YU, ET AL. (ARXIV PHYSICAL AI)
20
PAPR · ARXIV PHYSICAL AI · MAY 28, 2026
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments
QIUYUE WANG, XIONGHUI CHEN, ET AL. (ARXIV PHYSICAL AI)
21
PAPR · ARXIV PHYSICAL AI · MAY 28, 2026
VisualThink-VLA: Visual Intermediate Reasoning for Effective and Low-Latency Vision-Language-Action Policies
MINGJIAN GAO, YUETING ZHUANG, ET AL. (ARXIV PHYSICAL AI)
22
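POD · 张小珺JÙN|商业访谈录 · MAY 28, 2026
143. Second Interview with He Xiaopeng: Bigger Bets, the Birth of the Humanoid Robot Iron, That Accident, Being CEO Amid Technological Upheaval, GX, and "Stitched-Together Monsters"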
何小鹏, 小俊
23
PAPR · ARXIV PHYSICAL AI · MAY 22, 2026
GuidedVLA: Specifying Task-Relevant Factors via Plug-and-Play Action Attention Specialization
XIAOSONG JIA, YU-GANG JIANG, ET AL. (ARXIV PHYSICAL AI)
24
PAPR · ARXIV PHYSICAL AI · MAY 22, 2026
Towards Long-horizon Embodied Agents with Tool-Aligned Vision-Language-Action Models
ZIXING LEI, SIHENG CHEN, ET AL. (ARXIV PHYSICAL AI)
25
PAPR · ARXIV PHYSICAL AI · MAY 21, 2026
Imagine2Real: Towards Zero-shot Humanoid-Object Interaction via Video Generative Priors
JIAHE CHEN, JINGBO WANG, ET AL. (ARXIV PHYSICAL AI)
26
PAPR · ARXIV PHYSICAL AI · MAY 21, 2026
Judge, Then Drive: A Critic-Centric Vision Language Action Framework for Autonomous Driving
LIJIN YANG, HAO YANG, ET AL. (ARXIV PHYSICAL AI)
27
PAPR · ARXIV PHYSICAL AI · MAY 21, 2026
PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance
YUPENG ZHENG, WENCHAO DING, ET AL. (ARXIV PHYSICAL AI)
28
PAPR · ARXIV PHYSICAL AI · MAY 20, 2026
PointACT: Vision-Language-Action Models with Multi-Scale Point-Action Interaction
SHIZHE CHEN, PAUL PACAUD, CORDELIA SCHMID
29
PAPR · ARXIV PHYSICAL AI · MAY 18, 2026
MolmoAct2: Action Reasoning Models for Real-world Deployment
HAOQUAN FANG, RANJAY KRISHNA, ET AL. (ARXIV PHYSICAL AI)
30
PAPR · ARXIV PHYSICAL AI · MAY 18, 2026
StableVLA: Towards Robust Vision-Language-Action Models without Extra Data
YIYANG FU, DAQUAN ZHOU, ET AL. (ARXIV PHYSICAL AI)
31
POD · 晚点聊 LATETALK · MAY 18, 2026
165: NVIDIA GEAR's Shenyuan Gao (高深远): World Models, Self-Evolving Loops, DreamDojo
MANQI, LATEPOST TEAM (晚点团队)
32
PAPR · ARXIV PHYSICAL AI · MAY 17, 2026
Do World Action Models Generalize Better than VLAs? A Robustness Study
ZHANGUANG ZHANG, YINGXUE ZHANG, ET AL. (ARXIV PHYSICAL AI)
33
PAPR · ARXIV PHYSICAL AI · MAY 17, 2026
VLA-ATTC: Adaptive Test-Time Compute for VLA Models with Relative Action Critic Model
WENHAO LI, CHANG XU, ET AL. (ARXIV PHYSICAL AI)
34
PAPR · ARXIV PHYSICAL AI · MAY 13, 2026
Realtime-VLA FLASH: Speculative Inference Framework for Diffusion-based VLAs
JIAHUI NIU, HUAWEI LI, ET AL. (ARXIV PHYSICAL AI)
35
PAPR · ARXIV PHYSICAL AI · MAY 5, 2026
TiPToP: A Modular Open-Vocabulary Planning System for Robotic Manipulation
WILLIAM SHEN, TOMÁS LOZANO-PÉREZ, ET AL. (ARXIV PHYSICAL AI)
36
PAPR · ARXIV PHYSICAL AI · MAY 4, 2026
ΔVLA: Prior-Guided Vision-Language-Action Models via World Knowledge Variation
YIJIE ZHU, ZITONG YU, ET AL. (ARXIV PHYSICAL AI)
37
PAPR · ARXIV PHYSICAL AI · APR 30, 2026
LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models
HAO CHEN, PHENG-ANN HENG, ET AL. (ARXIV PHYSICAL AI)
38
PAPR · ARXIV PHYSICAL AI · APR 29, 2026
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
JUN GUO, HUAPING LIU, ET AL. (ARXIV PHYSICAL AI)
39
PAPR · ARXIV PHYSICAL AI · APR 24, 2026
RedVLA: Physical Red Teaming for Vision-Language-Action Models
YUHAO ZHANG, JIAMING JI, ET AL. (ARXIV PHYSICAL AI)
40
PAPR · ARXIV PHYSICAL AI · APR 21, 2026
EmbodiedMidtrain: Bridging the Gap between Vision-Language Models and Vision-Language-Action Models via Mid-training
YIYANG DU, CHENYAN XIONG, ET AL. (ARXIV PHYSICAL AI)
41
PAPR · ARXIV PHYSICAL AI · APR 21, 2026
UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling
BOYU CHEN, YIXIAO GE, ET AL. (ARXIV PHYSICAL AI)
42
PAPR · ARXIV PHYSICAL AI · APR 17, 2026
Observing and Controlling Features in Vision-Language-Action Models
HUGO BUURMEIJER, MARCO PAVONE, ET AL. (ARXIV PHYSICAL AI)
43
PAPR · ARXIV PHYSICAL AI · APR 17, 2026
VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models
ZIXUAN WANG, JIAYA JIA, ET AL. (ARXIV PHYSICAL AI)
44
PAPR · ARXIV PHYSICAL AI · APR 16, 2026
R3D: Revisiting 3D Policy Learning
ZHENGDONG HONG, JIAYUAN GU, ET AL. (ARXIV PHYSICAL AI)
45
PAPR · ARXIV PHYSICAL AI · APR 16, 2026
Vision-Based Safe Human-Robot Collaboration with Uncertainty Guarantees
JAKOB THUMM, MARCO PAVONE, ET AL. (ARXIV PHYSICAL AI)
46
PAPR · ARXIV PHYSICAL AI · APR 13, 2026
ViserDex: Visual Sim-to-Real for Robust Dexterous In-hand Reorientation
ARJUN BHARDWAJ, MARCO HUTTER, ET AL. (ARXIV PHYSICAL AI)
47
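POD · 张小珺JÙN|商业访谈录 · APR 10, 2026
135. A Conversation with Tristan, Founder of Natural Selection (自然选择): Elys, Cyber Avatars, Souls, the Acquisition and Flow of Context, and AI Social Networks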
PRODUCER, TRISTAN (张筱帆), 张小珺
48
PAPR · ARXIV PHYSICAL AI · APR 9, 2026
HEX: Humanoid-Aligned Experts for Cross-Embodiment Whole-Body Manipulation
SHUANGHAO BAI, BADONG CHEN, ET AL. (ARXIV PHYSICAL AI)
49
PAPR · ARXIV PHYSICAL AI · APR 9, 2026
LAMP: Lift Image-Editing as General 3D Priors for Open-world Manipulation
JINGJING WANG, GUOFENG ZHANG, ET AL. (ARXIV PHYSICAL AI)
50
PAPR · ARXIV PHYSICAL AI · APR 7, 2026
Action Images: End-to-End Policy Learning via Multiview Video Generation
HAOYU ZHEN, CHUANG GAN, ET AL. (ARXIV PHYSICAL AI)
51
PAPR · ARXIV PHYSICAL AI · APR 7, 2026
SnapFlow: One-Step Action Generation for Flow-Matching VLAs via Progressive Self-Distillation
WUYANG LUAN, RUI MA, ET AL. (ARXIV PHYSICAL AI)
52
PAPR · ARXIV PHYSICAL AI · APR 6, 2026
DySL-VLA: Efficient Vision-Language-Action Model Inference via Dynamic-Static Layer-Skipping for Robot Manipulation
ZEBIN YANG, MENG LI, ET AL. (ARXIV PHYSICAL AI)
53
PAPR · ARXIV PHYSICAL AI · APR 6, 2026
Large Reward Models: Generalizable Online Robot Reward Generation with Vision-Language Models
YANRU WU, YUE WANG, ET AL. (ARXIV PHYSICAL AI)
54
NEWS · THE AI CORNER · APR 5, 2026
OpenAI’s Next Image Model Just Leaked. The Examples Are Insane.
THE AI CORNER
55
PAPR · ARXIV PHYSICAL AI · APR 5, 2026
Adaptive Action Chunking at Inference-time for Vision-Language-Action Models
YUANCHANG LIANG, PRAHLAD VADAKKEPAT, ET AL. (ARXIV PHYSICAL AI)
56
PAPR · ARXIV PHYSICAL AI · APR 5, 2026
MobileManiBench: Simplifying Model Verification for Mobile Manipulation
WENBO WANG, BAINING GUO, ET AL. (ARXIV PHYSICAL AI)
57
PAPR · ARXIV PHYSICAL AI · APR 5, 2026
Not All Features Are Created Equal: A Mechanistic Study of Vision-Language-Action Models
BRYCE GRANT, XIJIA ZHAO, PENG WANG
58
PAPR · ARXIV PHYSICAL AI · APR 4, 2026
ABot-M0: VLA Foundation Model for Robotic Manipulation with Action Manifold Learning
YANDAN YANG, MU XU, ET AL. (ARXIV PHYSICAL AI)
59
PAPR · ARXIV PHYSICAL AI · APR 2, 2026
ForeAct: Steering Your VLA with Efficient Visual Foresight Planning
ZHUOYANG ZHANG, SONG HAN, ET AL. (ARXIV PHYSICAL AI)
60
PAPR · ARXIV PHYSICAL AI · APR 2, 2026
Xiaomi-Robotics-0: An Open-Sourced Vision-Language-Action Model with Real-Time Execution
RUISI CAI, QUAN ZHOU, ET AL. (ARXIV PHYSICAL AI)
61
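POD · 晚点聊 LATETALK · APR 2, 2026
157: Embodied AI Quarterly Report, 26Q1: Rethinking Humanoids, NVIDIA's World Models, High-Degree-of-Freedom Dexterous Hands
MANQI, LATEPOST TEAM (晚点团队)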
62
PAPR · ARXIV PHYSICAL AI · APR 1, 2026
Learning Humanoid Navigation from Human Data
WEIZHUO WANG, MONROE KENNEDY, ET AL. (ARXIV PHYSICAL AI)
63
PAPR · ARXIV PHYSICAL AI · MAR 31, 2026
DIAL: Decoupling Intent and Action via Latent World Modeling for End-to-End VLA
YI CHEN, XIHUI LIU, ET AL. (ARXIV PHYSICAL AI)
64
PAPR · ARXIV PHYSICAL AI · MAR 30, 2026
FocusVLA: Focused Visual Utilization for Vision-Language-Action Models
YICHI ZHANG, JIA WAN, ET AL. (ARXIV PHYSICAL AI)
65
PAPR · ARXIV PHYSICAL AI · MAR 30, 2026
OmniGuide: Universal Guidance Fields for Enhancing Generalist Robot Policies
YUNZHOU SONG, KOSTAS DANIILIDIS, ET AL. (ARXIV PHYSICAL AI)
66
PAPR · ARXIV PHYSICAL AI · MAR 30, 2026
SOLE-R1: Video-Language Reasoning as the Sole Reward for On-Robot Reinforcement Learning
PHILIP SCHROEDER, ONDREJ BIZA, ET AL. (ARXIV PHYSICAL AI)
67
PAPR · ARXIV PHYSICAL AI · MAR 29, 2026
Rethinking Visual-Language-Action Model Scaling: Alignment, Mixture, and Regularization
YE WANG, QIN JIN, ET AL. (ARXIV PHYSICAL AI)
68
PAPR · ARXIV PHYSICAL AI · MAR 29, 2026
ST4VLA: Spatially Guided Training for Vision-Language-Action Models
JI-LU YE, JIANGMIAO PANG, ET AL. (ARXIV PHYSICAL AI)
69
PAPR · ARXIV PHYSICAL AI · MAR 28, 2026
VLAW: Iterative Co-Improvement of Vision-Language-Action Policy and World Model
YANJIANG GUO, CHELSEA FINN, ET AL. (ARXIV PHYSICAL AI)
70
PAPR · MIT RESEARCH · MAR 25, 2026
Open-World Task and Motion Planning via Vision-Language Model Generated Constraints
NISHANTH KUMAR, CAELAN REED GARRETT, ET AL. (MIT)
71
PAPR · ARXIV PHYSICAL AI · MAR 25, 2026
Steerable Vision-Language-Action Policies for Embodied Reasoning and Hierarchical Control
WILLIAM CHEN, SERGEY LEVINE, ET AL. (ARXIV PHYSICAL AI)
72
PAPR · BEIJING ACADEMY OF ARTIFICIAL INTELLIGENCE (BAAI) RESEARCH · MAR 23, 2026
Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills
HAOQI YUAN, ZONGQING LU, ET AL. (BEIJING ACADEMY OF ARTIFICIAL INTELLIGENCE (BAAI))
73
PAPR · ARXIV PHYSICAL AI · MAR 23, 2026
DualCoT-VLA: Visual-Linguistic Chain of Thought via Parallel Reasoning for Vision-Language-Action Models
ZHIDE ZHONG, HAOANG LI, ET AL. (ARXIV PHYSICAL AI)
74
PAPR · ARXIV PHYSICAL AI · MAR 23, 2026
WholeBodyVLA: Towards Unified Latent VLA for Whole-Body Loco-Manipulation Control
HAORAN JIANG, HONGYANG LI, ET AL. (ARXIV PHYSICAL AI)
75
PAPR · ARXIV PHYSICAL AI · MAR 22, 2026
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos
SHENYUAN GAO, LINXI "JIM" FAN, ET AL. (ARXIV PHYSICAL AI)
76
POD · DWARKESH · MAR 20, 2026
Terence Tao – Kepler, Newton, and the true nature of mathematical discovery
DWARKESH PATEL, TERENCE TAO
77
PAPR · MIT RESEARCH · MAR 19, 2026
Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds
ANDREW CHOI, WEI XU, ET AL. (MIT)
78
PAPR · MIT RESEARCH · MAR 19, 2026
Sparse Autoencoders Reveal Interpretable and Steerable Features in VLA Models
AIDEN SWANN, MAC SCHWAGER, ET AL. (MIT)
79
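POD · 张小珺JÙN|商业访谈录 · MAR 16, 2026
133. A 7-Hour Marathon Interview with Saining Xie: World Models, Escaping Silicon Valley, AMI Labs, Turning Down Ilya Twice, Yann LeCun, Fei-Fei Li, and 42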
ZHANG XIAOJUN, 张小珺
80
POD · STANFORD AI SPEAKER SERIES · NOV 23, 2025
Stanford AI Club: Jeff Dean on Important AI Trends
JEFF DEAN
81
POD · LENNY'S · NOV 16, 2025
The Godmother of AI on jobs, robots & why world models are next | Dr. Fei-Fei Li
LENNY (HOST), DR. FEI-FEI LI
82
POD · TRAINING DATA · NOV 11, 2025
How Google’s Nano Banana Achieved Breakthrough Character Consistency
NICOLE BRITOVA, HANSA SRINIVASAN
83
POD · TRAINING DATA · NOV 7, 2025
OpenAI Sora 2 Team: How Generative Video Will Unlock Creativity and World Models
ROHAN SAHAI, THOMAS DIMSON, BILL
84
POD · ZHANG XIAOYUN'S · OCT 31, 2025
Interview with Li Xiang (Part 2): CEO Large Model, MoE, Liang Wenfeng, VLA, Energy, Memory, Confronting Human Nature, Intimate Relationships, Human Wisdom
LI XIANG
85
POD · 张小珺JÙN|商业访谈录 (ZHANG XIAOJUN BUSINESS INTERVIEWS) · OCT 30, 2025
Interview with Li Xiang Part 2: CEO Large Model, MoE, Liang Wenfeng, VLA, Energy, Memory, Confronting Human Nature, Intimate Relationships, Human Wisdom
LI XIANG, TIAN ZHEN XIAO JUN
86
POD · 晚点聊 LATETALK (INVESTIGATIVE JOURNALISM) · OCT 23, 2025
138: From the moment you use your mobile phone to when it understands you better, OPPO's mobile phone AI practice | Conversation with Wan Yulong, head of Xiaobu
WAN YULONG, MANQI
87
POD · ONE-OFF EPISODES · OCT 21, 2025
#38 Karol Hausman & Kevin Black: Building A Brain For Any Robot | AI Eating The Physical World
UNKNOWN HOST, KEVIN BLACK, KAROL HAUSMAN
88
POD · ONE-OFF EPISODES · SEP 12, 2025
Fully autonomous robots are much closer than you think – Sergey Levine
SERGEY LEVINE, DWARKESH PATEL
89
POD · ONE-OFF EPISODES · JUN 7, 2025
【BAAI2025】 Building Physical Intelligence | Karol Hausman
KAROL HAUSMAN
90
POD · ONE-OFF EPISODES · APR 26, 2025
The Race to Create General-Purpose Robots | Karol Hausman & Lachy Groom on TBPN
HOST, KAROL HAUSMAN, LACHY GROOM