Yet another Toy Pretrain(able) Autoregressive Transformer

ritzfy/toy-part

Story Generator using Pretrained Autoregressive Transformer Model

Overview

This is a Python implementation of a story generator built on an autoregressive transformer model. The model is trained on the TinyStoriesV2 dataset and completes a story from a given prompt.

Features

  • Generates stories based on a given prompt
  • Uses a transformer model to generate text
  • Includes data loading and preprocessing utilities
  • Supports training and evaluation of the model
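As a rough illustration of how prompt-based generation works in an autoregressive model, the sketch below samples one token at a time and feeds the growing sequence back in. It is not taken from this repository: `sample_next`, `generate`, and `toy_next_token_logits` are hypothetical names, and the toy function stands in for a real transformer forward pass.

```python
import math
import random


def sample_next(logits, temperature=1.0, rng=random):
    """Pick a token id from unnormalized logits using temperature sampling."""
    if temperature <= 0:
        # Treat a non-positive temperature as greedy decoding (argmax).
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Subtract the max before exponentiating for numerical stability.
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate(next_token_logits, prompt_ids, max_new_tokens=20,
             temperature=1.0, eos_id=None, rng=random):
    """Extend prompt_ids one token at a time (autoregressive decoding)."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)  # hypothetical model call
        next_id = sample_next(logits, temperature, rng)
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids


def toy_next_token_logits(ids, vocab_size=5):
    """Hypothetical stand-in model: strongly prefers (last token + 1) mod vocab_size."""
    target = (ids[-1] + 1) % vocab_size
    return [5.0 if i == target else 0.0 for i in range(vocab_size)]


if __name__ == "__main__":
    print(generate(toy_next_token_logits, [0], max_new_tokens=4, temperature=0))
```

In a real setup the prompt would first be encoded to token ids by the tokenizer, and the generated ids decoded back to text afterwards.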

Technical Details

  • The model is implemented in PyTorch (torch) and uses its built-in scaled dot-product attention for multi-head attention
  • The tokenizer is tiktoken
  • The model is trained with a custom training loop that uses learning-rate warmup and cosine annealing
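The warmup and cosine annealing schedule mentioned above can be sketched as a plain function of the training step. This is a minimal sketch under assumptions, not the repository's actual code: the function name `lr_at_step` and all default values (`max_lr`, `min_lr`, `warmup_steps`, `total_steps`) are hypothetical, and it assumes a linear warmup followed by cosine decay.

```python
import math


def lr_at_step(step, max_lr=3e-4, min_lr=3e-5,
               warmup_steps=100, total_steps=1000):
    """Learning rate for a given step: linear warmup, then cosine annealing.

    All default values are illustrative assumptions.
    """
    if step < warmup_steps:
        # Linear ramp that reaches max_lr on the last warmup step.
        return max_lr * (step + 1) / warmup_steps
    if step >= total_steps:
        # After the schedule ends, hold the floor value.
        return min_lr
    # Fraction of the decay phase completed, in [0, 1).
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    # Cosine factor goes smoothly from 1 down to 0.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine


if __name__ == "__main__":
    for s in (0, 50, 99, 100, 550, 999, 1000):
        print(s, lr_at_step(s))
```

In a custom PyTorch loop, one common way to apply such a value is to assign it to `group["lr"]` for each entry in `optimizer.param_groups` before calling `optimizer.step()`.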

How to Use

  • Install the required dependencies using pip install -r requirements.txt
  • Download your dataset and place it in the data directory
  • Train the model using python main.py
  • Generate stories using python generate.py
  • Deploy the Streamlit app using python app.py (if the app does not start this way, the standard Streamlit launcher is streamlit run app.py)

Future Improvements

  • Implement control using a configuration file
  • Explore different model architectures and hyperparameters
  • Integrate larger and more diverse datasets for training
  • Add functionality for user-specified story themes or genres

Author

Ritav Jash

License

This project is licensed under the MIT License.
