1st Workshop on Semantic Reasoning and Goal Understanding in Robotics (SemRob)

Robotics: Science and Systems Conference (RSS 2024) - July 19 - Delft, Netherlands


About
Semantic understanding of the world is essential for robots to make safe and informed decisions, to adapt to changing environmental conditions, and to interact efficiently with other agents. In pursuit of semantic understanding, agents must be able to (i) interpret and represent high-level goals, independent of their physical morphology and robust to irrelevant aspects of their environments; (ii) reason, i.e., extract abstract concepts from real-world observations, manipulate these concepts logically, and leverage the results for inference on downstream tasks; and (iii) execute morphology-, environment-, and socially-appropriate behaviors towards those high-level goals.

Despite substantial recent advances in the use of pre-trained, large-capacity models (i.e., foundation models) for difficult robotics problems, methods still struggle with several practical challenges of real-world deployment, e.g., cross-domain generalization, adaptation to dynamic and human-shared environments, and lifelong operation in open-world contexts. This workshop intends to foster discussion of new hybrid methodologies that combine representations from foundation models with modeling mechanisms that may prove useful for semantic reasoning and abstract goal understanding, including neural memory mechanisms, procedural modules (e.g., cognitive architectures), neuro-symbolic representations (e.g., knowledge/scene graph embeddings), chain-of-thought reasoning mechanisms, robot skill primitives and their composition, 3D scene representations (e.g., NeRFs), and more.

Intended audience. We aim to bring together engineers, researchers, and practitioners from different communities to open avenues for interdisciplinary research on methods that could facilitate the deployment of semantics-aware and generalizable embodied agents in unstructured, dynamic, real-world environments. In addition to the organizers, the presenters, panelists, and technical program committee are drawn from the following (sub-)communities: Robot Learning, Embodied AI, Planning + Controls, Cognitive Robotics, Neuro-Symbolism, Natural Language Processing, Computer Vision, and Multimodal Machine Learning. We likewise intend to attract an audience from these diverse sub-communities to contribute to compelling discussions.
Event Information
This is a primarily in-person workshop, held at the Robotics: Science and Systems (RSS) 2024 conference in Delft, Netherlands, on 19 July 2024, starting at 13:30 CEST.

The room location is Lecture Hall D, in the Aula Conference Centre, at TU Delft. You might find the travel information on the RSS 2024 website helpful.

Poster session location: outside the lower right-side doors of Lecture Hall D.

RSS SemRob 2024 is Workshop #28 on the official RSS schedule.


Schedule

  • 13:30  Introductory Remarks (Organizers)
  • 13:40  Keynote 1: Niko Sünderhauf, "Semantics and Understanding for Better Perception, Representation, and Actions"
  • 14:00  Keynote 2: Alessandra Sciutti, "Toward Artificial Cognition"
  • 14:20  Keynote 3: Coline Devin, "Free lunch? Revisiting tradeoffs in goal-conditioned policy learning in the foundation model era"
  • 14:40  Spotlight Talks (Paper IDs 4, 5, 26, 27)
  • 15:10  Keynote 4: Animesh Garg, "A Perspective on Prospection for Robot Autonomy"
  • 15:30  Coffee Break, Socializing, and Posters
  • 16:20  Keynote 5: Joyce Chai, "LLMs for Navigation and Grounding in Cognitive Robots"
  • 16:40  Debate: Implicit/Data-emergent Reasoning Capabilities versus Explicit Reasoning Mechanisms? (Panelists: Animesh Garg, Coline Devin, Yonatan Bisk, Niko Sünderhauf, Yilun Du)
  • 17:50  Closing Remarks (Organizers)


Call for Papers
Targeted Topics
In addition to the RSS 2024 subject areas, we especially invite paper submissions on topics including (but not limited to):

  • Learning semantically-rich and generalizable robot state representations
  • Learning general goal representations, e.g., in instruction-following
  • Reasoning mechanisms for generalization in open-vocabulary contexts
  • Leveraging foundation models for robotics tasks; efforts to create robotics-specific foundation models
  • Foundation model agent frameworks, e.g., for chain-of-thought reasoning, self-guidance, reasoning about failures, policy-refinement, etc.
  • Multimodal tokenization and prompting mechanisms with foundation models for robotics tasks
  • Grounding foundation models with other modalities (e.g., haptics, audio, IMU signals, joint torques, etc.)
  • Combining foundation models with AI reasoning structures (e.g., neuro-symbolic structures, memory, cognitive architectures, etc.), for robotics tasks
  • Data-efficient concept learning for robotics, e.g., few-shot demonstrations, interactive perception, co-simulation, etc.
Submission Guidelines
RSS SemRob 2024 suggests 4+N or 8+N paper length formats — i.e., 4 or 8 pages of main content with unlimited additional pages for references, appendices, etc. However, like RSS 2024, we impose no strict page length requirements on submissions; we trust that authors will recognize that respecting reviewers’ time is helpful to the evaluation of their work.

Submissions are handled through CMT: https://cmt3.research.microsoft.com/SEMROB2024

We accept submissions using the official LaTeX or Word paper templates provided by RSS 2024.

Our review process will be double-blind, following the RSS 2024 paper submission policy for Science/Systems papers.

All accepted papers will be invited for poster presentations; the highest-rated papers, according to the Technical Program Committee, will be given spotlight presentations. Accepted papers will be made available on this workshop website as non-archival reports, allowing authors to also submit their work to future conferences or journals. We will highlight the Best Reviewer and announce the Best Paper Award during the closing remarks at the workshop.

Important Dates
  • Submission deadline: 10 June 2024, 23:59 AoE
  • Author notifications: 24 June 2024, 23:59 AoE
  • Camera-ready deadline: 1 July 2024, 23:59 AoE
  • Workshop: 19 July 2024, 13:30-18:00 CEST

Accepted Papers

Congratulations to Haoran Geng (SAGE: Bridging Semantic and Actionable Parts for Generalizable Manipulation of Articulated Objects) for winning the Best Paper Award!

  • (Paper ID #2) la VIDA: Towards a Motivated Goal Reasoning Agent [paper]
    Ursula Addison
  • (Paper ID #3) Open6DOR: Benchmarking Open-instruction 6-DoF Object Rearrangement and A VLM-based Approach [paper]
    Yufei Ding, Haoran Geng, Chaoyi Xu, Xiaomeng Fang, Jiazhao Zhang, Songlin Wei, Qiyu Dai, Zhizheng Zhang, He Wang
  • (Paper ID #4) SAGE: Bridging Semantic and Actionable Parts for Generalizable Manipulation of Articulated Objects [paper] [video] (spotlight) (best paper award!)
    Haoran Geng, Songlin Wei, Congyue Deng, Bokui Shen, He Wang, Leonidas Guibas
  • (Paper ID #5) Behavior Generation with Latent Actions [paper] [video] (spotlight)
    Seungjae Lee, Yibin Wang, Haritheja Etukuru, Hyoun Jin Kim, Nur Muhammad (Mahi) Shafiullah, Lerrel Pinto
  • (Paper ID #6) Language models are robotic planners: reframing plans as goal refinement graphs [paper]
    Ateeq Sharfuddin, Travis Breaux
  • (Paper ID #7) OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics [paper]
    Peiqi Liu, Yaswanth Kumar Orru, Jay Vakil, Christopher Paxton, Nur Muhammad (Mahi) Shafiullah, Lerrel Pinto
  • (Paper ID #9) Clio: Real-time Task-Driven Open-Set 3D Scene Graphs [paper]
    Dominic Maggio, Yun Chang, Nathan Hughes, Matthew Trang, John Griffith, Eric Cristofalo, Carlyn Dougherty, Lukas Schmid, Luca Carlone
  • (Paper ID #10) Opening Cabinets and Drawers in the Real World using a Commodity Mobile Manipulator [paper]
    Arjun Gupta, Michelle Zhang, Rishik Sathua, Saurabh Gupta
  • (Paper ID #12) VLA-3D: A Dataset for 3D Semantic Scene Understanding and Navigation [paper]
    Haochen Zhang, Nader Zantout, Pujith Kachana, Zongyuan Wu, Ji Zhang, Wenshan Wang
  • (Paper ID #14) Enhancing Vision-Language Models with Scene Graphs for Traffic Accident Understanding [paper]
    Aaron Lohner, Francesco Compagno, Jonathan Francis, Alessandro Oltramari
  • (Paper ID #15) Which objects help me to act effectively? Reasoning about physically-grounded affordances [paper]
    Anne Kemmeren, Gertjan Burghouts, Michael van Bekkum, Wouter Meijer, Jelle van Mil
  • (Paper ID #17) RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation [paper]
    Yuxuan Kuang, Junjie Ye, Haoran Geng, Jiageng Mao, Congyue Deng, Leonidas Guibas, He Wang, Yue Wang
  • (Paper ID #19) Dialog-based Skill and Task Learning for Robot [paper]
    Weiwei Gu, N. Suresh K. Kondepudi, Lixiao Huang, Nakul Gopalan
  • (Paper ID #20) Embodied AI Robot Companion for Efficient Object Handling in Bimanual Teleoperation [paper]
    Haolin Fei, Songlin Ma, Bo Xiao, Ziwei Wang
  • (Paper ID #21) Grounding Language Plans in Demonstrations Through Counterfactual Perturbations [paper]
    Yanwei Wang
  • (Paper ID #22) RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation [paper]
    Hanxiao Jiang, Binghao Huang, Ruihai Wu, Zhuoran Li, Shubham Garg, Hooshang Nayyeri, Shenlong Wang, Yunzhu Li
  • (Paper ID #23) Lang2LTL-2: Grounding Spatiotemporal Navigation Commands Using Large Language and Vision-Language Models [paper]
    Jason Xinyu Liu, Ankit Shah, George Konidaris, Stefanie Tellex, David Paulius
  • (Paper ID #24) CogExplore: Contextual Exploration with Language Encoded Environment Representations [paper]
    Harel Biggie, Patrick Cooper, Doncey Albin, Kristen Such, Christoffer Heckman
  • (Paper ID #26) Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation [paper] [video] (spotlight)
    Daniel Honerkamp, Martin Buechner, Fabien Despinoy, Tim Welschehold, Abhinav Valada
  • (Paper ID #27) Natural Language Can Help Bridge the Sim2Real Gap [paper] [video] (spotlight)
    Albert Yu, Adeline Foote, Raymond Mooney, Roberto Martín-Martín

Organizers

Jonathan Francis

Bosch Center for Artificial Intelligence

Andrew Melnik

University of Bielefeld

Krishan Rana

QUT Centre for Robotics

Saumya Saxena

Carnegie Mellon University

Hyemin Ahn

Ulsan National Institute of Science and Technology

Qiang Li

Shenzhen Technology University

Jean Oh

Carnegie Mellon University



Program Committee
  • Ursula Addison
  • Harel Biggie
  • Gertjan Burghouts
  • Nicolas Chapman
  • Bingqing Chen (MR)
  • Jonathan Francis (Ch)
  • Ruihan Gao
  • Nikolaos Gkanatsios
  • Weiwei Gu
  • Binghao Huang
  • Nathan Hughes
  • Hanxiao Jiang
  • Nikhil Keetha
  • Anne Kemmeren
  • Seungchan Kim
  • N. Suresh K. Kondepudi
  • Seungjae Lee
  • Tabitha Lee (TR)
  • Qiang Li (ER)
  • Zhixuan Liu
  • Aaron Lohner
  • Xiaopeng Lu
  • Dominic Maggio
  • Nicolas Marticorena Vidal
  • Andrew Melnik
  • Marius Memmel
  • Mark Mints
  • Daniel Omeiza (TR)
  • Alessandro Oltramari
  • Karthik Paga
  • Maithili Patel
  • Sarvesh Patil
  • Andrey Rudenko
  • Ateeq Sharfuddin
  • Roykrong Sukkerd (TR)
  • Shivam Vats
  • Rui Wang (TR)
  • Ho-Hsiang Wu (MR)
  • Yaqi Xie
  • Junjie Ye
  • Jiarui Zhang
  • Yufei Zhu


ER: Recognises a PC member who served as an Emergency Reviewer.
TR: Recognises a PC member who, according to the Workshop Chairs' ratings, ranked in the top 10% of all reviewers.
MR: Recognises a PC member who provided their services as a Meta-Reviewer.
Ch: Paper Track Chair.


Contact and Information
Direct questions to semrob.workshop+general@gmail.com.

Subscribe to our mailing list to stay updated on workshop news.