Yume is a Japanese LLM (large language model) with 1.5 billion parameters, inspired by the work of Andrej Karpathy. It is trained on dialogue from anime and manga and is aimed at generating anime-style dialogue. A planned successor, Yuumi, will be a lightweight LLM for everyday tasks.
- Large language model for Japanese
- Trained on anime and manga dialogues
- Configurable with various model sizes (see the configuration sketch after this list)
- Supports pretraining and fine-tuning
- Integrates with Hugging Face for model management
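For illustration, a custom configuration might look like the sketch below. This is only a sketch: the field names (`n_layer`, `n_head`, `n_embd`, `block_size`) are assumptions modeled on common GPT-style configs and may differ from what `yume.config.Config` actually accepts.

```python
from yume.config import Config

# A minimal sketch of a custom configuration. All field names below are
# hypothetical; check yume.config for the parameters Config really takes.
dummy_config = Config(
    n_layer=12,       # number of transformer blocks (assumed field name)
    n_head=12,        # attention heads per block (assumed field name)
    n_embd=768,       # embedding dimension (assumed field name)
    block_size=1024,  # context length in tokens (assumed field name)
)
```

The preset configs used below (`yume_small`, `yume_medium`) are presumably built the same way at different sizes.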
You can use Yume to generate text samples. Here's an example:
```python
from yume import Yume
from yume.config import yume_small

# Optional: create a custom config instead of a preset
# dummy_config = Config(...)

# Initialize the Yume model with the pre-defined small configuration
yume = Yume(config=yume_small)

# Load pretrained weights from the specified repository
yume.load_pretrained('zaibutcooler/yume')

# Generate a sample from the prompt '犬とは' ("What is a dog?")
yume.sample('犬とは')
```
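Here `'zaibutcooler/yume'` appears to be a Hugging Face Hub repository id, consistent with the Hugging Face integration listed above, so `load_pretrained` presumably fetches the published weights rather than reading a local checkpoint.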
You can also train Yume with your own dataset. Here’s how you can do it:
```python
from yume import Yume
from yume.dataset import Trainset
from yume.config import yume_medium, Config

# Initialize the dataset with the desired dataset URL
dataset = Trainset(dataset_url="zaibutcooler/nihon-wiki")

# Build the dataset
dataset.build_dataset()

# Optional: create a custom config instead of a preset
# dummy_config = Config(...)

# Initialize the Yume model with the pre-defined medium configuration
yume = Yume(config=yume_medium)

# Pretrain the model on the dataset
yume.pretrain(dataset)

# Optional: fine-tune the model on the dataset
# yume.fine_tune(dataset)

# Optional: upload the model to Hugging Face
# yume.huggingface_login("your_hf_token")
# yume.save_pretrained("zaibutcooler/yume")
```
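Once uploaded, the weights can presumably be pulled back down with `load_pretrained`, mirroring the sampling example above. A sketch of that round trip, assuming the same repository id:

```python
from yume import Yume
from yume.config import yume_medium

# Reload the weights that save_pretrained pushed (assumed flow, mirroring
# the sampling example above) and generate from a prompt.
yume = Yume(config=yume_medium)
yume.load_pretrained("zaibutcooler/yume")
yume.sample("犬とは")  # "What is a dog?"
```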
This project is licensed under the MIT License. See the LICENSE file for details.
This project is inspired by the work of Andrej Karpathy and uses dialogue from various anime and manga sources for training.