Workshop on Semantic Reasoning and Goal Understanding in Robotics

RSS 2024 - July 19

Semantic understanding of the world is essential for robots to make safe and informed decisions, to adapt to changing environmental conditions, and to interact efficiently with other agents. In pursuit of semantic understanding, agents must be able to (i) interpret and represent high-level goals, independently of their physical morphology and robustly to irrelevant aspects of their environments; (ii) reason, i.e., extract abstract concepts from real-world observations, manipulate these concepts logically, and leverage the results for inference on downstream tasks; and (iii) execute morphology-, environment-, and socially-appropriate behaviors toward those high-level goals.

Despite substantial recent advances in the use of pre-trained, large-capacity models (i.e., foundation models) for difficult robotics problems, methods still struggle with several practical challenges of real-world deployment, e.g., cross-domain generalization, adaptation to dynamic and human-shared environments, and lifelong operation in open-world contexts. This workshop intends to foster discussion of new hybrid methodologies—those that combine representations from foundation models with modeling mechanisms that may prove useful for semantic reasoning and abstract goal understanding, including neural memory mechanisms, procedural modules (e.g., cognitive architectures), neuro-symbolic representations (e.g., knowledge/scene graph embeddings), chain-of-thought reasoning mechanisms, robot skill primitives and their composition, 3D scene representations (e.g., NeRFs), and more.

Intended audience. We aim to bring together engineers, researchers, and practitioners from different communities to open avenues for interdisciplinary research on methods that could facilitate the deployment of semantics-aware and generalizable embodied agents in unstructured and dynamic real-world environments. In addition to the organizers, the presenters, panelists, and technical program committee are drawn from the following (sub-)communities: Robot Learning, Embodied AI, Planning + Controls, Cognitive Robotics, Neuro-Symbolism, Natural Language Processing, Computer Vision, and Multimodal Machine Learning. We likewise intend to attract an audience from these diverse sub-communities to contribute to compelling discussions.

Speakers and Panelists

Joyce Chai

University of Michigan

Niko Sünderhauf

QUT Centre for Robotics

Yonatan Bisk

Carnegie Mellon University

Alessandra Sciutti

Istituto Italiano di Tecnologia

Dongheui Lee

Technische Universität Wien

Coline Devin

Google DeepMind

Animesh Garg

Georgia Tech

Jesse Thomason

University of Southern California


Schedule

13:30 Introductory Remarks (Organizers)
13:40 Keynote 1: Niko Sünderhauf
14:00 Keynote 2: Coline Devin
14:20 Keynote 3: Animesh Garg
14:40 Spotlight Talks
15:10 Keynote 4: Alessandra Sciutti
15:30 Coffee Break, Socializing, Posters
16:00 Keynote 5: Joyce Chai
16:20 Keynote 6: Jesse Thomason
16:40 Debate: Implicit/Data-Emergent Reasoning Capabilities versus Explicit Reasoning Mechanisms?
Panelists: Animesh Garg, Dongheui Lee, Coline Devin, Yonatan Bisk, Niko Sünderhauf
17:50 Concluding Remarks (Organizers)
18:00 Workshop Conclusion (Organizers)

Call for Papers
Targeted Topics
In addition to the RSS 2024 subject areas, we especially invite paper submissions on topics including (but not limited to):

  • Learning semantically-rich and generalizable robot state representations
  • Learning general goal representations, e.g., in instruction-following
  • Reasoning mechanisms for generalization in open-vocabulary contexts
  • Leveraging foundation models for robotics tasks; efforts to create robotics-specific foundation models
  • Foundation model agent frameworks, e.g., for chain-of-thought reasoning, self-guidance, reasoning about failures, policy-refinement, etc.
  • Multimodal tokenization and prompting mechanisms with foundation models for robotics tasks
  • Grounding foundation models with other modalities (e.g., haptics, audio, IMU signals, joint torques, etc.)
  • Combining foundation models with AI reasoning structures (e.g., neuro-symbolic structures, memory, cognitive architectures, etc.), for robotics tasks
  • Data-efficient concept learning for robotics, e.g., few-shot demonstrations, interactive perception, co-simulation, etc.
Submission Guidelines
RSS SemRob 2024 suggests 4+N or 8+N paper length formats — i.e., 4 or 8 pages of main content with unlimited additional pages for references, appendices, etc. However, like RSS 2024, we impose no strict page-length requirements on submissions; we trust that authors will recognize that respecting reviewers' time aids the evaluation of their work.

Submissions are handled through CMT.

Submissions should use the official LaTeX or Word paper templates provided by RSS 2024.

Our review process will be double-blind, following the RSS 2024 paper submission policy for Science/Systems papers.

All accepted papers will be given oral presentations (lightning talks or spotlight talks) as well as poster presentations. Accepted papers will be made available online on the workshop website as non-archival reports, allowing authors to also submit their work to future conferences or journals. We will hold a Best Paper Award ceremony at the workshop.

Important Dates
  • Submission deadline: 3 June 2024, 23:59 AOE.
  • Author Notifications: 24 June 2024, 23:59 AOE.
  • Camera Ready: 1 July 2024, 23:59 AOE.
  • Workshop: 19 July 2024, 13:30-18:00 CEST

Organizers


Jonathan Francis*

Bosch Center for Artificial Intelligence

Andrew Melnik*

University of Bielefeld

Krishan Rana

QUT Centre for Robotics

Saumya Saxena

Carnegie Mellon University

Hyemin Ahn

Ulsan National Institute of Science and Technology

Jean Oh

Carnegie Mellon University

Contact and Information
Reach out to the organizers with any questions.

Feel free to subscribe to our mailing list to stay updated on all workshop news.