| Paper | Base Language Model | Code | Publication | Preprint | Affiliation |
|---|---|---|---|---|---|
| A Language Agent for Autonomous Driving | GPT-3.5 | code | | 2311.10813 | USC |
| RT-2: New model translates vision and language into action | PaLI-X, PaLM-E | blog | | | DeepMind |
| VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models | GPT4 | web | | 2307.05973 | Stanford |
| Statler: State-Maintaining Language Models for Embodied Reasoning | GPT3 | | | 2306.17840 | TTIC |
| Chat with the Environment: Interactive Multimodal Perception using Large Language Models | GPT3 | | | 2303.08268 | Universität Hamburg |
- Mobile ALOHA, Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation