Story Generator using Pretrained Autoregressive Transformer Model
This is a Python implementation of a story generator using a transformer model. The model is trained on the TinyStoriesV2 dataset and can complete stories based on a given prompt.
- Generates stories based on a given prompt
- Uses a transformer model to generate text
- Includes data loading and preprocessing utilities
- Supports training and evaluation of the model
- The model is implemented using torch and its scaled dot-product multi-head attention implementation
- Tokenizer used is tiktoken
- The model is trained using a custom training loop utilizing cosine annealing and learning rate warmup
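As an illustration of the warmup and cosine annealing schedule mentioned above, here is a minimal sketch in plain Python. It is not taken from this project's training loop: the function name `lr_at_step` and all default values (`max_lr`, `min_lr`, `warmup_steps`, `total_steps`) are assumptions chosen for the example.

```python
import math


def lr_at_step(step, max_lr=3e-4, min_lr=3e-5, warmup_steps=100, total_steps=1000):
    """Return the learning rate for a 0-indexed training step.

    Hypothetical helper: linear warmup for `warmup_steps` steps,
    then cosine annealing from `max_lr` down to `min_lr`.
    """
    if step < warmup_steps:
        # Linear warmup: ramps from max_lr / warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    if step >= total_steps:
        # After the schedule ends, hold the floor value.
        return min_lr
    # Fraction of the annealing phase completed, in [0, 1).
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    # Cosine factor decays smoothly from 1 to 0 over the annealing phase.
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (max_lr - min_lr) * cosine
```

In a custom PyTorch loop, one way to apply such a schedule is to assign the returned value to `param_group["lr"]` for every entry in `optimizer.param_groups` before calling `optimizer.step()`.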
- Install the required dependencies using: pip install -r requirements.txt
- Download your dataset and place it in the data directory
- Train the model using: python main.py
- Generate stories using: python generate.py
- Deploy the streamlit app using: python app.py
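Story completion with an autoregressive model works by repeatedly predicting the next token and appending it to the sequence. The sketch below shows that loop with greedy decoding in plain Python. It does not reproduce generate.py: `generate`, `toy_model`, and the `eos_token` parameter are hypothetical names, and `toy_model` is a stand-in for the trained transformer.

```python
def generate(next_token_logits, prompt_tokens, max_new_tokens, eos_token=None):
    """Greedy autoregressive decoding (illustrative sketch).

    next_token_logits: callable taking the token ids so far and returning
    one score per vocabulary entry (a stand-in for the real model).
    """
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)
        # Greedy choice: pick the highest-scoring token id.
        next_token = max(range(len(logits)), key=logits.__getitem__)
        if next_token == eos_token:
            # Stop early when the (assumed) end-of-story token is produced.
            break
        tokens.append(next_token)
    return tokens


def toy_model(tokens, vocab_size=5):
    """Toy stand-in for a model: always favours (last token + 1) mod vocab_size."""
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


print(generate(toy_model, [0], max_new_tokens=3))
```

With a real model, the prompt would first be encoded to token ids by the tokenizer and the result decoded back to text; sampling with a temperature instead of the greedy choice is a common alternative when more varied stories are wanted.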
- Implement control using a configuration file.
- Explore different model architectures and hyperparameters.
- Integrate larger and more diverse datasets for training.
- Add functionality for user-specified story themes or genres.
Ritav Jash
This project is licensed under the MIT License.