Update the training scripts to incorporate the latest versions of Transformers and other dependencies. #84
@@ -48,7 +47,7 @@ def get_model(encoder_name, decoder_name, max_length, num_decoder_layers=None):
         decoder_config.num_hidden_layers = num_decoder_layers
 
-    config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(encoder.config, decoder.config)
+    config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(encoder_config, decoder_config)
This is probably an existing bug?
Without this change, the saved config file and the model disagree: the decoder config lists more layers than the model actually has. When the model is loaded for inference, those extra layers are randomly initialized.
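To make the mismatch concrete, here is a toy sketch in plain Python. It does not use Transformers at all: `ToyConfig`, `save`, and `load` are hypothetical stand-ins, and the layer counts (12 and 2) are made up for illustration. It only assumes that a loader builds as many layers as the config requests and has to initialize any layer with no saved weights.

```python
import copy
from dataclasses import dataclass


@dataclass
class ToyConfig:
    # Hypothetical stand-in for a decoder config; not a Transformers class.
    num_hidden_layers: int


def save(config, layers):
    # A toy "checkpoint": the config plus one weight entry per layer the model really has.
    return {"config": copy.deepcopy(config), "weights": dict(enumerate(layers))}


def load(checkpoint):
    # Build as many layers as the config asks for and report the indices
    # that have no saved weights (these would be randomly initialized).
    wanted = checkpoint["config"].num_hidden_layers
    weights = checkpoint["weights"]
    return [i for i in range(wanted) if i not in weights]


full_config = ToyConfig(num_hidden_layers=12)     # config that still describes the full decoder
trimmed_config = ToyConfig(num_hidden_layers=2)   # config updated after truncation
trimmed_layers = ["trained", "trained"]           # the model only keeps 2 decoder layers

buggy_missing = load(save(full_config, trimmed_layers))     # config and model disagree
fixed_missing = load(save(trimmed_config, trimmed_layers))  # config matches the model

print(len(buggy_missing), len(fixed_missing))
```

In the toy, saving the stale config leaves ten layers without weights on reload, while saving the updated config leaves none.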
@jzhang533 Thanks, the comments were very helpful. Most of the changes seem fine; I'll just have to take a closer look at the config issue.
Thanks for the review. I think I have reproduced the training of the model in my repository: https://github.com/jzhang533/manga-ocr. I achieved a CER of 0.1017, which is comparable to the original model's CER of 0.1056 on the Manga109s test split. However, the dataset split I used might differ from the one you used when training the model several years ago, because there is randomness involved in splitting the Manga109s dataset here: https://github.com/kha-white/manga-ocr/blob/master/manga_ocr_dev/data/process_manga109s.py#L82. As for the model config, I have updated the model config script at https://github.com/jzhang533/manga-ocr/blob/master/manga_ocr_dev/training/my_get_model.py; you may be interested in taking a look.
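For readers unfamiliar with the metric, here is a minimal sketch of character error rate (CER), assuming the common definition: total character-level edit distance divided by total reference length. This is not necessarily how the numbers above were computed; the function names and the sample strings are illustrative only.

```python
def edit_distance(ref: str, hyp: str) -> int:
    # Levenshtein distance over characters, keeping one row of the DP table at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    # Corpus-level CER: summed edits over summed reference characters.
    # Assumes the references contain at least one character in total.
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


refs = ["こんにちは", "ありがとう"]   # made-up ground-truth strings
hyps = ["こんにちわ", "ありがとう"]   # made-up OCR outputs, one wrong character
score = cer(refs, hyps)
print(score)
```

Averaging per-sample CER instead of pooling edits over the whole test set gives a slightly different number, so two evaluations are only comparable if they use the same convention.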
@kha-white I have uploaded my trained model to HF: https://huggingface.co/jzhang533/manga-ocr-base-2025 by running
@kha-white
Thanks for your great project. I've played around a bit with running the training scripts,
but found several issues related to the latest dependencies, mainly caused by changes to transformers, albumentations, etc. over recent years.
I finally got it working after several tweaks, and will explain the reasons in comments on this PR.