2nd Workshop on Semantic Reasoning and Goal Understanding in Robotics (SemRob)
Robotics Science and Systems Conference (RSS 2025)
June 21 - Los Angeles, USA
Time | Session
---|---
08:30 | Organizers' Introductory Remarks
08:40 | Keynote 1: Jesse Thomason, "Embracing Language as Grounded Communication"

Abstract: Language is not text data; it is a human medium for communication. The larger part of the natural language processing (NLP) community has doubled down on treating digital text as a sufficient approximation of language, scaling datasets and corresponding models to fit that text. I have argued that experience in the world grounds language, tying it to objects, actions, and concepts. In fact, I believe that language carries meaning only when considered alongside that world, and that the zeitgeist in NLP research currently misses the mark on truly interesting questions at the intersection of human language and machine computation. In this talk, I'll highlight some of the ways my lab enables agents and robots to better understand and respond to human communication by considering the grounded context in which that communication occurs, including neurosymbolic multimodal reasoning, natural language dialogue and interaction for lifelong learning, and utilizing NLP technologies on non-text communication.
09:00 | Keynote 2: Yonatan Bisk, "Semantics? Reasoning? Can we define either of those terms?"

Abstract: In this talk I'll discuss some recent work on language-conditioned robotics, but I might also choose to spend time questioning the basic assumptions of all of our work, and whether we're all misguided about the important questions in robotics.
09:20 | Keynote 3: Dorsa Sadigh, "Human-Aligned Robot Learning: Manipulation Policies via Preferences, RLHF, and VLM Feedback"

Abstract: TBD
09:40 | Spotlight Talks
10:00 | Coffee Break, Socializing, Posters
10:40 | Keynote 4: Ted Xiao, "Full-Stack Robotics Foundation Models: From Embodied Reasoning to Dexterity"

Abstract: TBD
11:00 | Keynote 5: Manolis Savva, "Towards Realistic & Interactive 3D Simulation for Embodied AI"

Abstract: 3D simulators are increasingly being used to develop and evaluate "embodied AI" (agents perceiving and acting in realistic environments). Much of the prior work in this space has treated simulators as "black boxes" within which learning algorithms are to be deployed. However, the system characteristics of the simulation platforms themselves and the datasets used with these platforms both greatly impact the feasibility and the outcomes of experiments involving simulation. In this talk, I will describe several recent projects that outline emerging challenges and opportunities in the development of 3D simulation for embodied AI.

Bio: Manolis Savva is an Associate Professor at Simon Fraser University and a Canada Research Chair in Computer Graphics. His research focuses on the analysis, organization, and generation of 3D content. Prior to his current position, he was a visiting researcher at Facebook AI Research and a postdoctoral researcher at Princeton University. He received his Ph.D. from Stanford University under the supervision of Pat Hanrahan. His work has been recognized through several awards, including an ACM UIST notable paper award (ReVision), an ICCV best paper nomination (Habitat), two SGP dataset awards (ShapeNet, SGP 2018; ScanNet, SGP 2020), the 2022 Graphics Interface early career researcher award, and an ICLR 2023 outstanding paper award (Emergence of Maps).
11:20 | Keynote 6: Lerrel Pinto, "On Building General-Purpose Home Robots"

Abstract: The concept of a "generalist machine" in homes (a domestic assistant that can adapt and learn from our needs, all while remaining cost-effective) has been a goal steadily pursued in robotics for decades. In this talk, I will present our recent efforts towards building such capable home robots. First, I will discuss how large, pretrained vision-language models can induce strong priors for mobile manipulation tasks like pick-and-drop. But pretrained models can only take us so far. To scale beyond basic picking, we will need systems and algorithms to rapidly learn new skills. This requires creating new tools to collect data, improving representations of the visual world, and enabling trial-and-error learning during deployment. While much of the work presented focuses on two-fingered hands, I will briefly introduce learning approaches for multi-fingered hands, which support more dexterous behaviors and rich touch sensing combined with vision. Finally, I will outline unsolved problems that were not obvious initially, which, when solved, will bring us closer to general-purpose home robots.
11:40 | Debate: Implicit/Data-Emergent Reasoning Capabilities versus Explicit Reasoning Mechanisms? Panelists: Jesse Thomason, Dorsa Sadigh, Ted Xiao, Manolis Savva, Lerrel Pinto, Yonatan Bisk
12:30 | Organizers' Closing Remarks
Acknowledgement: The Microsoft CMT service was used for managing the peer-reviewing process for this conference. This service was provided for free by Microsoft and they bore all expenses, including costs for Azure cloud services as well as for software development and support.