rstarmer changed the title from "Train an RETROformer based on Deepmind Retroformer model" to "Load data on an RETROformer based on Deepmind Retroformer model" on Jun 28, 2023
rstarmer added the P0 (Priority 0 - essential) label and removed the EXPLORATION label (An exploration PR contains too much code to ever be mergeable. It is useful to communicate ideas.) on Jun 28, 2023
@MostAwesomeDude I added a branch, platform/retro-pytorch, to the repository with the current attempt at this. Its train.py is simply the example code from the lucidrains/RETRO-pytorch README.
You will still need to put some .txt files into the text_folder directory for the system to ingest.
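For anyone who wants a quick way to populate that folder, here is a minimal sketch using only the Python standard library. It is not part of the branch: the helper name `write_documents`, the `doc_NNNNNN.txt` naming scheme, and the default `./text_folder` path are all assumptions made for illustration, so adjust them to whatever path train.py actually points at.

```python
from pathlib import Path
from typing import Iterable


def write_documents(docs: Iterable[str], folder: str = "./text_folder") -> int:
    """Write each non-empty document to its own .txt file; return the count written.

    Hypothetical helper: the folder default and file naming are assumptions,
    not something defined by the branch or by RETRO-pytorch.
    """
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing

    written = 0
    for doc in docs:
        text = doc.strip()
        if not text:
            continue  # skip empty or whitespace-only documents
        # One document per file, numbered in the order received.
        path = out_dir / f"doc_{written:06d}.txt"
        path.write_text(text + "\n", encoding="utf-8")
        written += 1
    return written
```

Calling `write_documents(["first document", "second document"])` would leave two files, `doc_000000.txt` and `doc_000001.txt`, in `./text_folder`. Any iterable of strings works as input, so text pulled from the initial data set could be passed through the same helper once it has been downloaded.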
The RETRO model we are currently investigating: https://arxiv.org/pdf/2112.04426.pdf
An example implementation: https://github.com/lucidrains/RETRO-pytorch
Initial data set: https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T/tree/main
Acceptance criteria: