Generative model with architectures from a bigram to a transformer.

T4ras123/Rewrite-Makemore

Makemore Implementation for Learning Purposes

This repository contains my implementation of the makemore project by Andrej Karpathy. The goal of this project is to learn about character-level language modeling through hands-on coding and experimentation.

Implemented Models

Bigram Model

  • Description: A simple bigram model with a context size of 1 character.
  • Loss: Achieves a negative log-likelihood loss of approximately 2.7.
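
As a rough sketch of the approach (not the repository's exact code), a character bigram model can be built by counting transitions and normalizing each row of the count matrix into probabilities; the toy word list below is a stand-in for names.txt:

```python
import torch

# Hypothetical toy dataset; the repo trains on names.txt.
words = ["emma", "olivia", "ava"]
chars = sorted(set("".join(words)))
stoi = {s: i + 1 for i, s in enumerate(chars)}
stoi["."] = 0  # start/end token
V = len(stoi)

# Count bigram frequencies.
N = torch.zeros((V, V), dtype=torch.int32)
for w in words:
    chs = ["."] + list(w) + ["."]
    for c1, c2 in zip(chs, chs[1:]):
        N[stoi[c1], stoi[c2]] += 1

# Normalize counts (with add-one smoothing) into probabilities.
P = (N + 1).float()
P /= P.sum(1, keepdim=True)

# Average negative log-likelihood over the training bigrams.
nll, n = 0.0, 0
for w in words:
    chs = ["."] + list(w) + ["."]
    for c1, c2 in zip(chs, chs[1:]):
        nll += -torch.log(P[stoi[c1], stoi[c2]]).item()
        n += 1
print(f"avg NLL: {nll / n:.3f}")
```

On the full names.txt vocabulary (27 tokens) this counting approach yields the ~2.7 loss quoted above.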

Linear Neural Network Model

  • Description: A single-layer linear network that matches the bigram model's loss and results; the transition probabilities are learned by gradient descent rather than counted.
  • Loss: Similar to the bigram model, with a negative log-likelihood loss of approximately 2.7.
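
A minimal sketch of that idea, assuming the same toy word list rather than names.txt: one-hot encode the input character, multiply by a weight matrix to get logits, and minimize cross-entropy (equivalently, the negative log-likelihood of the next character):

```python
import torch
import torch.nn.functional as F

# Hypothetical toy data; the repo uses names.txt.
words = ["emma", "olivia", "ava"]
chars = sorted(set("".join(words)))
stoi = {s: i + 1 for i, s in enumerate(chars)}
stoi["."] = 0  # start/end token
V = len(stoi)

# Build (input char, next char) training pairs.
xs, ys = [], []
for w in words:
    chs = ["."] + list(w) + ["."]
    for c1, c2 in zip(chs, chs[1:]):
        xs.append(stoi[c1])
        ys.append(stoi[c2])
xs, ys = torch.tensor(xs), torch.tensor(ys)

# Single linear layer: logits = one_hot(x) @ W.
g = torch.Generator().manual_seed(42)
W = torch.randn((V, V), generator=g, requires_grad=True)

for _ in range(200):
    logits = F.one_hot(xs, num_classes=V).float() @ W
    loss = F.cross_entropy(logits, ys)  # NLL of the next character
    W.grad = None
    loss.backward()
    W.data += -10 * W.grad
print(f"loss: {loss.item():.3f}")
```

Because one-hot multiplication just selects a row of W, the trained rows converge toward the log of the bigram count-based probabilities, which is why the two models reach the same loss.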

Multi-Layer Perceptron (MLP)

  • Description: An MLP with a 27 x 10 character embedding table and a context size of 6 characters.
  • Loss: Achieves a lower negative log-likelihood than the bigram and linear models.
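
A condensed sketch of that architecture, with hypothetical hyperparameters except for the embedding size (10) and context length (6) stated above; the toy word list again stands in for names.txt, so the vocabulary here is smaller than the repo's 27 characters:

```python
import torch
import torch.nn.functional as F

words = ["emma", "olivia", "ava", "isabella"]
chars = sorted(set("".join(words)))
stoi = {s: i + 1 for i, s in enumerate(chars)}
stoi["."] = 0  # padding / end token
V, ctx, emb_dim, hidden = len(stoi), 6, 10, 100

# Build the dataset: 6 previous characters predict the next one.
X, Y = [], []
for w in words:
    context = [0] * ctx
    for ch in w + ".":
        X.append(context)
        Y.append(stoi[ch])
        context = context[1:] + [stoi[ch]]  # slide the window
X, Y = torch.tensor(X), torch.tensor(Y)

g = torch.Generator().manual_seed(42)
C = torch.randn((V, emb_dim), generator=g)                   # 27 x 10 on names.txt
W1 = torch.randn((ctx * emb_dim, hidden), generator=g) * 0.1
b1 = torch.randn(hidden, generator=g) * 0.01
W2 = torch.randn((hidden, V), generator=g) * 0.1
b2 = torch.randn(V, generator=g) * 0.01
params = [C, W1, b1, W2, b2]
for p in params:
    p.requires_grad = True

for _ in range(500):
    emb = C[X].view(X.shape[0], -1)       # embed and flatten the context
    h = torch.tanh(emb @ W1 + b1)
    logits = h @ W2 + b2
    loss = F.cross_entropy(logits, Y)
    for p in params:
        p.grad = None
    loss.backward()
    for p in params:
        p.data += -0.1 * p.grad
print(f"loss: {loss.item():.3f}")
```

The wider context is what lets the MLP beat the single-character models: the bigram and linear networks only ever see one character of history.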

Notebooks

Data

  • names.txt: Dataset used for training and evaluation.

Getting Started

To get started with this project, clone the repository and open the Jupyter notebooks in your preferred environment. Each notebook contains detailed explanations and code for the respective models.

git clone https://github.com/T4ras123/Rewrite-Makemore.git
cd Rewrite-Makemore
