mbodied agents v2 (#23)
nqyy authored Jun 16, 2024
1 parent 7a37ce4 commit 8218d28
Showing 46 changed files with 906 additions and 2,423 deletions.
35 changes: 25 additions & 10 deletions README.md
@@ -5,10 +5,12 @@
[![Ubuntu](https://github.com/MbodiAI/opensource/actions/workflows/ubuntu.yml/badge.svg)](https://github.com/MbodiAI/opensource/actions/workflows/ubuntu.yml)
[![PyPI Version](https://img.shields.io/pypi/v/mbodied-agents.svg)](https://pypi.python.org/pypi/mbodied-agents)
[![Documentation Status](https://readthedocs.com/projects/mbodi-ai-mbodied-agents/badge/?version=latest)](https://mbodi-ai-mbodied-agents.readthedocs-hosted.com/en/latest/?badge=latest)
[![Example Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1DAQkuuEYj8demiuJS1_10FIyTI78Yzh4?usp=sharing)


Documentation: [mbodied agents docs](https://mbodi-ai-mbodied-agents.readthedocs-hosted.com/en)

Example colab: [![Example Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/16liQspSIzRazWb_qa_6Z0MRKmMTr2s1s?usp=sharing)

# mbodied agents
Welcome to **mbodied agents**, a toolkit for integrating state-of-the-art transformers into robotics systems. The goal of this repo is to minimize the ambiguity, heterogeneity, and data scarcity currently holding generative AI back from widespread adoption in robotics. It provides strong type hints for the various kinds of robot actions and a unified interface for:

@@ -17,7 +19,7 @@ Welcome to **mbodied agents**, a toolkit for integrating state-of-the-art transf
- Automatically recording observations and actions to HDF5
- Exporting to the most popular ML formats such as [Gym Spaces](https://gymnasium.farama.org/index.html) and [Huggingface Datasets](https://huggingface.co/docs/datasets/en/index)

And most importantly, the entire library is __100% configurable to any observation and action space__. That's right. With **mbodied agents**, the days of wasting precious engineering time on tedious formatting and post-processing are over. Jump to [Getting Started](#getting-started) to get up and running on [real hardware](https://colab.research.google.com/drive/1DAQkuuEYj8demiuJS1_10FIyTI78Yzh4?usp=sharing) or a [mujoco simulation](https://colab.research.google.com/drive/1sZtVLv17g9Lin1O2DyecBItWXwzUVUeH)
And most importantly, the entire library is __100% configurable to any observation and action space__. That's right. With **mbodied agents**, the days of wasting precious engineering time on tedious formatting and post-processing are over. Jump to [Getting Started](#getting-started) to get up and running on [real hardware](https://colab.research.google.com/drive/16liQspSIzRazWb_qa_6Z0MRKmMTr2s1s?usp=sharing) or a [mujoco simulation](https://colab.research.google.com/drive/1sZtVLv17g9Lin1O2DyecBItWXwzUVUeH)
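
To make this concrete, here is a minimal sketch of the unified interface, assuming the module layout shown in the `examples/simple_robot_agent.py` diff further down; the `act` call and its signature are illustrative assumptions, not a confirmed API:

```python
# A minimal sketch, assuming the module layout from examples/simple_robot_agent.py.
from mbodied_agents.agents.language import LanguageAgent
from mbodied_agents.hardware.sim_interface import SimInterface

# Initialize a language-driven agent and a simulated hardware interface.
agent = LanguageAgent(context="You are a robot with vision capabilities.", api_service="openai")
robot = SimInterface()

# Hypothetical call: ask the agent to produce a response for an instruction.
response = agent.act("Wave hello.")
```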


## Updates
@@ -38,18 +40,11 @@ Please join our [Discord](https://discord.gg/RNzf3RCxRJ) for interesting discuss

- [Mbodied Agents](#mbodied-agents)
- [Overview](#overview)
- [Support Matrix](#support-matrix)
- [Installation](#installation)
- [Dev Environment Setup](#dev-environment-setup)
- [Getting Started](#getting-started)
- [Glossary](#glossary)
- [Building Blocks](#building-blocks)
- [The Sample class](#the-sample-class)
- [Message](#message)
- [Backend](#backend)
- [Cognitive Agent](#cognitive-agent)
- [Controls](#controls)
- [Hardware Interface](#hardware-interface)
- [Recorder](#recorder)
- [Directory Structure](#directory-structure)
- [Contributing](#contributing)

@@ -85,6 +80,26 @@ If you would like to integrate a new backend, sense, or motion control, it is ve

`pip install mbodied-agents`

## Dev Environment Setup

1. Clone this repo:

```console
git clone https://github.com/MbodiAI/mbodied-agents.git
```

2. Install system dependencies:

```console
source install.bash
```

3. Then for each new terminal, run:

```console
hatch shell
```
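
As an optional sanity check (an added suggestion, not part of the original setup), confirm the package imports inside the hatch shell:

```console
python -c "import mbodied_agents"
```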

## Getting Started

### Real Robot Hardware
Binary file modified assets/architecture.jpg
700 changes: 0 additions & 700 deletions examples/simple_robot_agent.ipynb

This file was deleted.

33 changes: 18 additions & 15 deletions examples/simple_robot_agent.py
@@ -1,11 +1,11 @@
# Copyright 2024 Mbodi AI
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@@ -18,14 +18,14 @@
import click
from pydantic import BaseModel, Field
from pydantic_core import from_json
from gym import spaces
from gymnasium import spaces

from mbodied_agents.agents.language import CognitiveAgent
from mbodied_agents.agents.sense.audio_handler import AudioHandler
from mbodied_agents.agents.language import LanguageAgent
from mbodied_agents.agents.sense.audio.audio_handler import AudioHandler
from mbodied_agents.base.sample import Sample
from mbodied_agents.hardware.sim_interface import SimInterface
from mbodied_agents.types.controls import HandControl
from mbodied_agents.types.vision import Image
from mbodied_agents.types.sense.vision import Image
from mbodied_agents.data.recording import Recorder


@@ -53,9 +53,9 @@ class AnswerAndActionsList(Sample):
)


# This prompt is used to provide context to the CognitiveAgent.
# This prompt is used to provide context to the LanguageAgent.
SYSTEM_PROMPT = f"""
You are a robot with vision capabilities.
For each task given, you respond in JSON format. Here's the JSON schema:
{AnswerAndActionsList.model_json_schema()}
"""
@@ -69,32 +69,34 @@ def main(backend: str, disable_audio: bool, record_dataset: bool) -> None:
"""Main function to initialize and run the robot interaction.
Args:
backend: The backend to use for the CognitiveAgent (e.g., "openai").
backend: The backend to use for the LanguageAgent (e.g., "openai").
disable_audio: If True, disables audio input/output.
record_dataset: If True, enables recording of the interaction data for training.
Example:
To run the script with OpenAI backend and disable audio:
python script.py --backend openai --disable_audio
"""
# Initialize the intelligent Robot Agent.
robot_agent = CognitiveAgent(context=SYSTEM_PROMPT, api_service=backend)
# Initialize the intelligent Robot Agent with language interface.
robot_agent = LanguageAgent(context=SYSTEM_PROMPT, api_service=backend)

# Use a mock robot interface for movement visualization.
robot_interface = SimInterface()

# Enable or disable audio input/output capabilities.
if disable_audio:
os.environ["NO_AUDIO"] = "1"
audio = AudioHandler(use_pyaudio=False) # Prefer to use use_pyaudio=False for MAC.
# Prefer to use use_pyaudio=False for MAC.
audio = AudioHandler(use_pyaudio=False)

# Data recorder for every conversation and action.
if record_dataset:
observation_space = spaces.Dict({
'image': Image(size=(224, 224)).space(),
'instruction': spaces.Text(1000)
})
action_space = AnswerAndActionsList(actions=[HandControl()] * 6).space()
action_space = AnswerAndActionsList(
actions=[HandControl()] * 6).space()
recorder = Recorder(
'example_recorder',
out_dir='saved_datasets',
@@ -116,7 +118,8 @@ def main(backend: str, disable_audio: bool, record_dataset: bool) -> None:
print("Response:", response)

# Validate the response to the pydantic object.
answer_actions = AnswerAndActionsList.model_validate(from_json(response))
answer_actions = AnswerAndActionsList.model_validate(
from_json(response))

# Let the robot speak.
if answer_actions.answer:
