Johns Hopkins University, Est. 1876

America’s First Research University

LCSR Seminar: Toward Robotics Foundation Models that Can Reason

Sept 17, 2025
12 - 1pm EDT
This event is free

Who can attend?

  • General public
  • Faculty
  • Staff
  • Students

Contact

Laboratory for Computational Sensing and Robotics, Whiting School of Engineering
410-516-6841

Description

Jiafei Duan, a doctoral candidate in computer science and engineering at the University of Washington, will give a talk titled "Toward Robotics Foundation Models that Can Reason" for the Laboratory for Computational Sensing + Robotics.

Abstract:

Generative artificial intelligence has advanced rapidly in language and vision, driven by massive image–text datasets and scaled models. These gains are increasingly equipping robots with open-world perception and reasoning. Yet progress toward generalist robots is constrained by the scarcity of large-scale, high-quality interaction data, which limits real-world generalization and action-level reasoning. While systems based on multimodal large language models (MLLMs) show promise, especially for acquiring low-level skills in everyday settings, my research focuses on moving beyond data scaling alone by formalizing and leveraging reasoning as a core principle for building truly generalist robotic models. In this talk, I will present three recent works that aim to bridge the gap between the rich semantic world knowledge in MLLMs and actionable robot control. I will begin with AHA, a vision-language model that reasons about failures in robotic manipulation and improves the robustness of existing systems. Building on this, I will introduce SAM2Act, a 3D generalist robotic model with a memory-centric architecture capable of performing high-precision manipulation tasks while retaining and reasoning over past observations. Finally, I will present MolmoAct, AI2's flagship robotic foundation model for spatial reasoning, designed as a generalist system that can be post-trained for a wide range of downstream manipulation tasks.

Jiafei Duan's research focuses on foundation models for robotics, with an emphasis on developing scalable data collection and generation methods, grounding vision-language models in robotic reasoning, and advancing robust generalization in robot learning. His work has been featured in the MIT Technology Review, GeekWire, VentureBeat, and Business Wire. Duan's research has been published in top AI and robotics venues, including ICLR, ICML, RSS, CoRL, ECCV, IJCAI, COLM, and EMNLP, and has earned awards such as Best Paper at Ubiquitous Robots 2023 and a Spotlight at ICLR 2024. He is a recipient of both the A*STAR National Science PhD Scholarship and the A*STAR Undergraduate Scholarship.
