Gemma Czech Adaptation

This repository adapts Gemma, a pre-trained language model, to the Czech language. We compile our own training datasets from Czech-language datasets available on Hugging Face and use them to fine-tune the model.
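
As a rough illustration of the dataset compilation step, the sketch below merges several record streams into one deduplicated list of Czech texts. It is not the code from the notebooks: the function names, the `text` field, and the diacritics-based language heuristic are all assumptions made for this example. In practice the record streams could come from Hugging Face's `datasets.load_dataset`.

```python
from typing import Dict, Iterable, List

# Letters with diacritics that are typical for Czech text.
CZECH_DIACRITICS = set("áčďéěíňóřšťúůýžÁČĎÉĚÍŇÓŘŠŤÚŮÝŽ")


def looks_czech(text: str, min_ratio: float = 0.02) -> bool:
    """Crude heuristic (an assumption): share of letters with Czech diacritics."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    marked = sum(ch in CZECH_DIACRITICS for ch in letters)
    return marked / len(letters) >= min_ratio


def compile_corpus(sources: Iterable[Iterable[Dict[str, str]]],
                   text_key: str = "text") -> List[str]:
    """Merge record streams into one deduplicated list of Czech texts."""
    seen = set()
    corpus = []
    for source in sources:
        for record in source:
            # Collapse runs of whitespace so near-identical copies deduplicate.
            text = " ".join(record.get(text_key, "").split())
            if not text or text in seen or not looks_czech(text):
                continue
            seen.add(text)
            corpus.append(text)
    return corpus


if __name__ == "__main__":
    # Tiny hand-written stand-ins for two source datasets.
    wiki = [{"text": "Praha je hlavní město České republiky."},
            {"text": "Prague is the capital of the Czech Republic."}]
    news = [{"text": "Praha  je hlavní město České republiky."},
            {"text": "Dnes bude slunečno a teplo."}]
    print(compile_corpus([wiki, news]))
```

The heuristic is deliberately simple and will misclassify short Czech sentences that happen to contain no diacritics. A proper language identifier would be a better fit for a larger run.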

This repository serves as our submission for the Gemma Language Model Tuning Competition, where we aim to optimize Gemma's performance on Czech language tasks.

Treat this implementation as a proof of concept. Better results would require more structured data, tuned hyperparameters, and larger-scale training, all of which need significantly more resources.

There is still room for more efficient fine-tuning using RL techniques such as PPO, TRPO, or GRPO, alignment techniques such as DPO, and possibly transfer learning.
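
To make the DPO option more concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not part of this repository's training code. The inputs are assumed to be summed log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") Czech response under the model being tuned and under a frozen reference model, and the default `beta` is only illustrative.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair: -log(sigmoid(beta * margin))."""
    # How much more the policy prefers the chosen answer than the reference does.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(x)) equals softplus(-x); this form avoids overflow.
    z = -beta * margin
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


if __name__ == "__main__":
    # Policy identical to the reference: the margin is zero.
    print(dpo_loss(-12.0, -15.0, -12.0, -15.0))
    # Policy favors the chosen answer more than the reference: lower loss.
    print(dpo_loss(-10.0, -18.0, -12.0, -15.0))
```

The loss falls as the tuned model raises the chosen response's likelihood relative to the rejected one, measured against the reference model. A real training run would compute this over batches of tensors rather than single floats.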

Objectives

Fine-tune Gemma to improve its understanding of the Czech language.
Evaluate the model's performance on tasks like translation, sentiment analysis, and natural language generation in Czech.
Deploy the fine-tuned model for practical applications.
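
As a sketch of how the evaluation objective might look for a classification-style task such as sentiment analysis, the snippet below scores any prediction function against labeled Czech examples. The `predict` callable, the label names, and the keyword-based stand-in are hypothetical. In practice `predict` would wrap a call to the fine-tuned model.

```python
from typing import Callable, List, Tuple


def accuracy(predict: Callable[[str], str],
             examples: List[Tuple[str, str]]) -> float:
    """Fraction of examples where the predicted label matches the gold label."""
    if not examples:
        raise ValueError("examples must not be empty")
    correct = sum(
        predict(text).strip().lower() == label.strip().lower()
        for text, label in examples
    )
    return correct / len(examples)


def keyword_baseline(text: str) -> str:
    """Hypothetical stand-in for the model: a trivial keyword rule."""
    return "positive" if "skvěl" in text.lower() else "negative"


if __name__ == "__main__":
    # Hand-written examples for illustration only, not real evaluation data.
    examples = [
        ("Ten film byl skvělý.", "positive"),
        ("Jídlo bylo hrozné.", "negative"),
        ("Skvělá obsluha a výborná káva.", "positive"),
        ("Moc se mi to líbilo.", "positive"),
    ]
    print(accuracy(keyword_baseline, examples))
```

Running the same function over the base model and the fine-tuned model gives a simple before-and-after comparison. Translation and generation would need different metrics than exact label match.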

Repository Structure

data/: Directory for datasets used in training and evaluation.
models/: Directory to save trained models and checkpoints.

Usage

You can run either the proof-of-concept notebook (poc.ipynb) or the final submission notebook (submission.ipynb). Then monitor training progress and evaluate the results on Czech-specific tasks.

quant-eagle/gemma-global-competition
