Rt1-VIT-bert #14

Tilak1114 · 2024-06-09T14:29:15Z

Changed EfficientNet to ViT and Integrated BERT Text Encoder for `rt1`

Description

This update replaces the existing EfficientNet model with a Vision Transformer (ViT) and integrates a BERT text encoder for `rt1`. The current implementation includes a dummy sample that exercises the network end to end. Training code is not included in this update.

Summary:

Replaced EfficientNet with Vision Transformer (ViT).
Integrated BERT as the text encoder.
Implemented a dummy sample to test the network functionality.
The training part is not included in this update.

How Has This Been Tested?

The changes have been verified by running a dummy sample to ensure the network functionality works as expected.

Testing:

Ran a dummy sample to verify network functionality.

Checklist

Self-review
Documentation
Testing

sebbyjp · 2024-06-10T18:23:21Z

src/mbodied_agents/agents/motor/motor_agent.py

`motion` is the correct folder name, not `motor`.

sebbyjp · 2024-06-10T18:23:45Z

src/mbodied_agents/agents/motor/rt1/film_efficientnet/film.py

We only need this file, plus the ViT, tokenizers, and learner.

sebbyjp · 2024-06-10T18:30:51Z

src/mbodied_agents/agents/motor/rt1/film_efficientnet/efficient_net.py

Delete if unneeded.

sebbyjp · 2024-06-10T18:38:17Z

src/mbodied_agents/agents/motor/rt1/rt1_agent.py

+            "cuda" if torch.cuda.is_available() else "cpu"
+        )
+        model_name = 'bert-base-uncased'
+        self.tokenizer = BertTokenizer.from_pretrained(model_name)


Don't reinitialize on every call.

This call is in the constructor (`__init__`), so it is only initialized once, when we create the `Rt1Agent` object.
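To make the point above concrete, here is a minimal sketch, not the PR's code. `CountingLoader` is a hypothetical stand-in for an expensive loader such as `BertTokenizer.from_pretrained`, and `SketchAgent` is a hypothetical stand-in for the agent class. It only shows that work done in `__init__` runs once per object, no matter how many times `act` is called.

```python
class CountingLoader:
    """Hypothetical stand-in for an expensive loader; counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, model_name):
        self.calls += 1
        # Toy "tokenizer": plain whitespace splitting, not real BERT tokenization.
        return str.split


class SketchAgent:
    """Hypothetical agent that loads its tokenizer in the constructor."""

    def __init__(self, load_tokenizer, model_name="bert-base-uncased"):
        # Runs once, when the object is created.
        self.tokenizer = load_tokenizer(model_name)

    def act(self, instruction):
        # Reuses the tokenizer stored on the instance; nothing is reloaded here.
        return self.tokenizer(instruction)


loader = CountingLoader()
agent = SketchAgent(loader)
agent.act("pick up the cup")
agent.act("put it down")
print(loader.calls)
```

If the loader were called inside `act` instead, the counter would grow with every call, which is the pattern the review comment warns against.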

sebbyjp · 2024-06-10T18:40:21Z

src/mbodied_agents/agents/motor/rt1/rt1_agent.py

Observation, action, history size, and image size should not be hard-coded. Use the values currently in the file as defaults in case the config doesn't specify them.
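One possible shape for this, as a sketch under assumptions: the class name `RT1Config`, the field names, and the default values below are all hypothetical placeholders (the real defaults would be whatever `rt1_agent.py` currently hard-codes), and the config is assumed to arrive as a plain mapping.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional, Tuple


@dataclass(frozen=True)
class RT1Config:
    """Hypothetical settings container; the defaults are placeholders only."""

    observation_dim: int = 512
    action_dim: int = 11
    history_size: int = 6
    image_size: Tuple[int, int] = (224, 224)

    @classmethod
    def from_mapping(cls, config: Optional[Mapping[str, Any]] = None) -> "RT1Config":
        """Build a config, falling back to the defaults for any missing key."""
        config = dict(config or {})
        known = {f.name for f in fields(cls)}
        unknown = set(config) - known
        if unknown:
            # Fail loudly on misspelled keys instead of silently ignoring them.
            raise KeyError(f"Unknown config keys: {sorted(unknown)}")
        return cls(**config)


# Only history_size is overridden; everything else keeps its default.
cfg = RT1Config.from_mapping({"history_size": 3})
print(cfg)
```

The agent constructor could then accept a mapping, call `RT1Config.from_mapping`, and read sizes from the resulting object rather than from literals.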

sebbyjp · 2024-06-10T18:41:10Z

src/mbodied_agents/agents/motor/motor_agent.py

+from mbodied_agents.types.controls import Motion
+
+
+class MotorAgent(ABC):


This should inherit from `Agent`.
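A minimal sketch of what that could look like, assuming an existing `Agent` base class. The `Agent` defined here is only a stand-in so the example runs on its own, and the `act` method name and `DummyMotorAgent` are hypothetical.

```python
from abc import ABC, abstractmethod


class Agent:
    """Stand-in for the project's existing Agent base class (assumed)."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class MotorAgent(Agent, ABC):
    """Abstract base for agents that produce motions; inherits shared Agent behavior."""

    @abstractmethod
    def act(self, **kwargs):
        """Return the next motion(s) for the given inputs."""


class DummyMotorAgent(MotorAgent):
    """Hypothetical concrete subclass used only to show the pattern."""

    def act(self, **kwargs):
        # A real agent would return Motion objects; an empty list keeps this runnable.
        return []


agent = DummyMotorAgent()
print(isinstance(agent, Agent), agent.act())
```

Listing `Agent` before `ABC` keeps the abstract-method check while letting subclasses reuse whatever `Agent` already provides.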

nqyy and others added 19 commits June 5, 2024 02:50

Add RT1 bare minimum model

c262487

Make rt1 nn modile

c4a2926

Fix device and remove std and mean in tokenizer

77b3c3a

Run dummy pass

e53e4cd

Add doctest

6688ddd

Make bounds none

d16ef3c

Add example for rt1

3c7baef

Fix broken test

a48c8a9

Add MotionAgent as the base class for rt1

88c0f5b

Merge branch 'main' of github.com:Tilak1114/mbodied-agents into rt1-bare-min

8d53b43

Backmerge main

936b7d1

Add MotorAgent readme

3ce70c0

Add mbodi headers

0398b07

Add Motion instance check in test rt1

385d1e2

Add transfomer reference

e3ed6f6

rt1-vit

18a330b

Vit with film

2bec8d4

Modify imagetokenizer logic to handle vit output

770384e

Fix rt1 test

94e82f1

Tilak1114 changed the title ~~Add a vit rt1~~ Rt1-VIT-bert Jun 9, 2024

Merge branch 'MbodiAI:main' into rt1-vit

bb5e6ba

Tilak1114 marked this pull request as draft June 10, 2024 02:29

Tilak1114 marked this pull request as ready for review June 10, 2024 02:43

Tilak1114 added 2 commits June 10, 2024 13:05

Update comment dimension

7f4b1a0

Merge branch 'rt1-vit' of github.com:Tilak1114/mbodied-agents into rt1-vit

d313da8

sebbyjp requested changes Jun 10, 2024

Make rt1 agent obs, action, history and image size configurable

6493948

nqyy force-pushed the main branch 3 times, most recently from b560b84 to 1a6af9c Compare June 17, 2024 01:41

nqyy force-pushed the main branch from bacecb7 to 012a519 Compare June 18, 2024 02:20
