Rt1-VIT-bert #14

Open
wants to merge 23 commits into
base: main

@Tilak1114 Tilak1114 commented Jun 9, 2024

Changed EfficientNet to ViT and Integrated BERT Text Encoder for rt1

Description

This update replaces the existing EfficientNet model with a Vision Transformer (ViT) and integrates a BERT text encoder for rt1. The current implementation includes a dummy sample to test the network functionality. The training part is not included in this update.

Summary:

  • Replaced EfficientNet with Vision Transformer (ViT).
  • Integrated BERT as the text encoder.
  • Implemented a dummy sample to test the network functionality.
  • The training part is not included in this update.

How Has This Been Tested?

The changes have been verified by running a dummy sample to ensure the network functionality works as expected.

Testing:

  • Ran a dummy sample to verify network functionality.

Checklist

  • Self-review
  • Documentation
  • Testing

@Tilak1114 Tilak1114 changed the title from "Add a vit rt1" to "Rt1-VIT-bert" Jun 9, 2024
@Tilak1114 Tilak1114 marked this pull request as draft June 10, 2024 02:29
@Tilak1114 Tilak1114 marked this pull request as ready for review June 10, 2024 02:43
Collaborator

motion is the correct folder name

Collaborator

We only need this, vit, tokenizers, and learner

Collaborator

Delete if unneeded.

"cuda" if torch.cuda.is_available() else "cpu"
)
model_name = 'bert-base-uncased'
self.tokenizer = BertTokenizer.from_pretrained(model_name)

Collaborator

Don't reinitialize on every call.

Author

@Tilak1114 Tilak1114 Jun 10, 2024

This call is in the constructor (the __init__ method), so it's only initialized once, when the Rt1Agent object is created.
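
The point made in this reply can be illustrated with a small self-contained sketch. It does not use the real `BertTokenizer`; `FakeTokenizer` and `Rt1AgentSketch` are hypothetical stand-ins invented here, and the counter exists only to show that a loader called in `__init__` runs once per agent object, not once per call to `act`.

```python
class FakeTokenizer:
    """Hypothetical stand-in for an expensive-to-load tokenizer."""

    loads = 0  # counts how many times from_pretrained has been called

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

    @classmethod
    def from_pretrained(cls, model_name: str) -> "FakeTokenizer":
        cls.loads += 1
        return cls(model_name)

    def __call__(self, text: str) -> list:
        # Toy tokenization: lowercase and split on whitespace.
        return text.lower().split()


class Rt1AgentSketch:
    """Hypothetical agent that loads its tokenizer in the constructor."""

    def __init__(self, model_name: str = "bert-base-uncased") -> None:
        # Runs once, when the object is created.
        self.tokenizer = FakeTokenizer.from_pretrained(model_name)

    def act(self, instruction: str) -> list:
        # Reuses the tokenizer stored on the instance; nothing is reloaded.
        return self.tokenizer(instruction)
```

Creating one `Rt1AgentSketch` and calling `act` several times leaves `FakeTokenizer.loads` at 1; the count only grows if a new agent object is constructed.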

Collaborator

Observation, action, history size, and image size should not be hard-coded. Use the values currently in the file as defaults in case the config doesn't specify them.
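
One way to follow this suggestion is sketched below, assuming the config arrives as a plain mapping. The class name `Rt1Config`, the field names, and the default numbers are all placeholders invented for illustration; the actual defaults would be whatever values are currently hard-coded in the file.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Rt1Config:
    """Hypothetical settings holder; every default below is a placeholder."""

    observation_dim: int = 512
    action_dim: int = 11
    history_size: int = 6
    image_size: int = 224


def load_config(overrides: Optional[Mapping[str, Any]] = None) -> Rt1Config:
    """Build a config, falling back to defaults for keys that are not given."""
    overrides = dict(overrides or {})
    known = {f.name for f in fields(Rt1Config)}
    unknown = set(overrides) - known
    if unknown:
        # Fail loudly on typos instead of silently ignoring them.
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    return Rt1Config(**overrides)
```

For example, `load_config({"image_size": 128})` overrides only the image size and keeps the other three defaults, while `load_config()` returns the defaults unchanged.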

from mbodied_agents.types.controls import Motion


class MotorAgent(ABC):

Collaborator

MotorAgent should inherit from Agent.
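
A rough sketch of what this change could look like follows. The `Agent` and `Motion` classes here are hypothetical stand-ins defined only so the example runs on its own; the project's real `Agent` base class and its constructor arguments are not shown in this conversation and may differ.

```python
class Agent:
    """Hypothetical stand-in for the project's base Agent class."""

    def __init__(self, **kwargs) -> None:
        self.config = dict(kwargs)


class Motion:
    """Hypothetical stand-in for mbodied_agents.types.controls.Motion."""


class MotorAgent(Agent):
    """Base class for agents that output a Motion; subclasses implement act."""

    def act(self, **kwargs) -> Motion:
        # Subclasses are expected to override this method.
        raise NotImplementedError("Subclasses must implement act().")
```

With this layout a concrete agent such as an RT-1 wrapper would subclass `MotorAgent`, inherit whatever shared setup `Agent` provides, and override `act`.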

@nqyy nqyy force-pushed the main branch 3 times, most recently from b560b84 to 1a6af9c June 17, 2024 01:41
3 participants