Update the training scripts to incorporate the latest versions of Transformers and other dependencies. #84

Open
wants to merge 3 commits into
base: master

Conversation

@jzhang533 jzhang533 commented Dec 10, 2024

@kha-white

Thanks for your great project. I tried running the training scripts and found several issues with the latest dependencies, mainly caused by changes in transformers, albumentations, etc. over recent years.

I eventually got everything working after several tweaks, and I will explain the reasons in comments on this PR.

manga_ocr_dev/data/process_manga109s.py
manga_ocr_dev/training/dataset.py
manga_ocr_dev/training/dataset.py
manga_ocr_dev/training/dataset.py
manga_ocr_dev/training/dataset.py
manga_ocr_dev/training/metrics.py
manga_ocr_dev/training/train.py
manga_ocr_dev/training/train.py
manga_ocr_dev/training/train.py
manga_ocr_dev/training/train.py
@@ -48,7 +47,7 @@ def get_model(encoder_name, decoder_name, max_length, num_decoder_layers=None):

         decoder_config.num_hidden_layers = num_decoder_layers

-    config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(encoder.config, decoder.config)
+    config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(encoder_config, decoder_config)
Author

This is probably an existing bug.

Without this change, the saved config file and the model will differ: the decoder config lists more layers than the model actually has. When the model is loaded for inference, those extra layers will be randomly initialized.
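To illustrate the mismatch described above, here is a minimal toy sketch that does not use transformers at all. `ToyConfig`, `ToyDecoder`, and `build` are hypothetical names, and the sketch assumes that the model object keeps its own copy of the config it was built from. That is one plausible explanation for the divergence, not something confirmed in this thread.

```python
import copy
from dataclasses import dataclass


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a decoder config."""
    num_hidden_layers: int


class ToyDecoder:
    """Hypothetical stand-in for a decoder model.

    Assumption: the model stores its own copy of the config it was built from.
    """

    def __init__(self, config: ToyConfig):
        self.config = copy.deepcopy(config)
        self.layers = [f"layer_{i}" for i in range(config.num_hidden_layers)]


def build(num_decoder_layers: int, use_model_config: bool):
    """Return (layer count in the saved config, layers actually in the model)."""
    decoder_config = ToyConfig(num_hidden_layers=12)
    decoder = ToyDecoder(decoder_config)

    # Assumed truncation step: keep only the last N layers, then record N
    # in the standalone config (the latter mirrors the diff's context line).
    decoder.layers = decoder.layers[-num_decoder_layers:]
    decoder_config.num_hidden_layers = num_decoder_layers

    saved_config = decoder.config if use_model_config else decoder_config
    return saved_config.num_hidden_layers, len(decoder.layers)


mismatch = build(2, use_model_config=True)     # config read from the model object
consistent = build(2, use_model_config=False)  # config read from the updated object
print(mismatch, consistent)
```

Under that assumption, reading the config from the model object reports 12 layers for a 2-layer model, while reading it from the updated standalone config reports 2 for both.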

manga_ocr_dev/training/metrics.py
@kha-white
Owner

@jzhang533 Thanks, the comments were very helpful. Most of the changes seem fine, I'll just have to take a look at that thing with configs.

@jzhang533
Author

Thanks for the review.

I think I have reproduced the training of the model in my repository: https://github.com/jzhang533/manga-ocr. I achieved a CER of 0.1017, which is comparable to the original model's CER of 0.1056 on the Manga109s test split. However, the dataset split I used might differ from the one you used when training the model several years ago, because there is randomness involved in splitting the Manga109s dataset here: https://github.com/kha-white/manga-ocr/blob/master/manga_ocr_dev/data/process_manga109s.py#L82.
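As an aside on the split randomness mentioned above, one common way to make a train/test split repeatable is to fix the seed of a local random generator. The sketch below is only an illustration using the standard library. `split_books`, its parameters, and the `book_XXX` ids are hypothetical and do not reflect how process_manga109s.py actually performs the split.

```python
import random


def split_books(book_ids, test_fraction=0.1, seed=0):
    """Deterministically assign book ids to train/test (hypothetical helper)."""
    ids = sorted(book_ids)      # sort first so the input order does not matter
    rng = random.Random(seed)   # local RNG; global random state is untouched
    rng.shuffle(ids)
    n_test = max(1, int(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]


# Placeholder ids, not real Manga109s titles.
books = [f"book_{i:03d}" for i in range(20)]
train_a, test_a = split_books(books, seed=42)
train_b, test_b = split_books(list(reversed(books)), seed=42)
print(test_a == test_b)
```

With a fixed seed, two runs select the same test books, so CER values from separate training runs are measured on the same data.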

As for the model config, I have updated the model config script; the new version is at https://github.com/jzhang533/manga-ocr/blob/master/manga_ocr_dev/training/my_get_model.py, in case you are interested in taking a look.

@jzhang533
Author

@kha-white I have uploaded my trained model to HF: https://huggingface.co/jzhang533/manga-ocr-base-2025

When running manga_ocr -p "jzhang533/manga-ocr-base-2025", the warning reported in #85 no longer appears.
