Training for vertical Japanese texts. #1884

Subhajyotidas · 2023-04-28T03:00:03Z


I have been trying to train MMOCR on vertical Japanese text (only Hiragana and Katakana), using the SAR model.

The _base_sar_resnet31_parallel-decoder.py file has some resizing features...

train_pipeline = [
    dict(type='LoadImageFromFile', ignore_empty=True, min_size=2),
    dict(type='LoadOCRAnnotations', with_text=True),
    dict(
        type='RescaleToHeight',
        height=48,
        min_width=48,
        max_width=160,
        width_divisor=4),
    dict(type='PadToWidth', width=160),
    dict(
        type='PackTextRecogInputs',
        meta_keys=('img_path', 'ori_shape', 'img_shape', 'valid_ratio'))
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='RescaleToHeight',
        height=48,
        min_width=48,
        max_width=160,
        width_divisor=4),
    dict(type='PadToWidth', width=160),
    # add loading annotation after ``Resize`` because ground truth
    # does not need to do resize data transform
    dict(type='LoadOCRAnnotations', with_text=True),
    dict(
        type='PackTextRecogInputs',
        meta_keys=('img_path', 'ori_shape', 'img_shape', 'valid_ratio'))
]

I tried to modify it to pad vertically by replacing width with height, as follows.

train_pipeline = [
    dict(type='LoadImageFromFile', ignore_empty=True, min_size=2),
    dict(type='LoadOCRAnnotations', with_text=True),
    dict(
        type='RescaleToHeight',
        max_height=160,
        min_height=48,
        width=48,
        height_divisor=4),
    dict(type='PadToHeight', height=160),
    dict(
        type='PackTextRecogInputs',
        meta_keys=('img_path', 'ori_shape', 'img_shape', 'valid_ratio'))
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='RescaleToHeight',
        max_height=160,
        min_height=48,
        width=48,
        height_divisor=4),
    dict(type='PadToHeight', height=160),
    # add loading annotation after ``Resize`` because ground truth
    # does not need to do resize data transform
    dict(type='LoadOCRAnnotations', with_text=True),
    dict(
        type='PackTextRecogInputs',
        meta_keys=('img_path', 'ori_shape', 'img_shape', 'valid_ratio'))
]

But, as expected, it threw the following error.

TypeError: class EpochBasedTrainLoop in mmengine/runner/loops.py: class ConcatDataset in mmocr/datasets/dataset_wrapper.py: class RescaleToHeight in mmocr/datasets/transforms/textrecog_transforms.py: __init__() missing 1 required positional argument: 'height'

Issue: Training with the original resizing gives very bad results, as it was meant for horizontal writing.
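To make the mismatch concrete, here is a rough sketch of the width arithmetic that a fixed-height rescale performs with the original settings (height=48, min_width=48, max_width=160, width_divisor=4). The function `rescaled_width` and the 200 x 32 crop size are hypothetical, and the logic only approximates what RescaleToHeight is assumed to do; it is not copied from MMOCR.

```python
import math


def rescaled_width(ori_h, ori_w, height=48, min_width=48, max_width=160,
                   width_divisor=4):
    """Approximate the output width of a fixed-height rescale (assumed logic)."""
    # Scale the width by the same factor that maps ori_h to `height`.
    new_w = math.ceil(height / ori_h * ori_w)
    # Clamp the result to the configured bounds.
    new_w = max(min_width, min(max_width, new_w))
    # Snap to a multiple of width_divisor.
    if new_w % width_divisor != 0:
        new_w = round(new_w / width_divisor) * width_divisor
    return new_w


# A vertical crop, 200 px tall and 32 px wide (hypothetical size).
vertical_w = rescaled_width(ori_h=200, ori_w=32)
# The same crop after a 90-degree rotation: 32 px tall, 200 px wide.
rotated_w = rescaled_width(ori_h=32, ori_w=200)
print(vertical_w, rotated_w)  # 48 160
```

Under these assumptions, the vertical crop ends up as 48 x 48: its height shrinks to about a quarter while its width grows by half, so every character is squashed. The rotated crop becomes 48 x 160, which is much closer to its original proportions.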

Question: Is there any way to resize the data meant for vertical OCR training?

Thank you in advance for any help.

gaotongxiao · 2023-04-28T04:33:42Z

Maintainer

If your dataset contains only vertical text, you may consider rotating all the images by 90 degrees so that they fit the model's horizontal-text assumption.
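A minimal sketch of that rotation, assuming the crops are loaded as NumPy arrays in height x width (x channels) layout. The helper name `rotate_vertical_to_horizontal` is hypothetical. A counterclockwise rotation is used so that the top character of the column becomes the leftmost one and the label order stays unchanged.

```python
import numpy as np


def rotate_vertical_to_horizontal(img):
    """Rotate an H x W (x C) image 90 degrees counterclockwise."""
    # np.rot90 returns a view; copy it into contiguous memory so that
    # image libraries expecting a contiguous buffer can use the result.
    return np.ascontiguousarray(np.rot90(img, k=1))


# A blank stand-in for a vertical text crop: 200 px tall, 32 px wide.
column = np.zeros((200, 32, 3), dtype=np.uint8)
row = rotate_vertical_to_horizontal(column)
print(row.shape)  # (32, 200, 3)

# Reading order check: top-to-bottom becomes left-to-right.
order = np.array([[1], [2], [3]])
print(rotate_vertical_to_horizontal(order).tolist())  # [[1, 2, 3]]
```

The rotated crops can then be saved and used with the unmodified horizontal pipeline; the annotations themselves do not need to change.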

4 replies

Subhajyotidas Apr 28, 2023
Author

If I do that, then once the model is trained I will also have to rotate the test images at inference time, right?

gaotongxiao Apr 28, 2023
Maintainer

Sure

Subhajyotidas Apr 28, 2023
Author

To your knowledge, has anyone tried to train on vertical text with MMOCR before?
Did they do it the way you advised?

gaotongxiao Apr 28, 2023
Maintainer

PaddleOCR has an image orientation classifier before their text detector & recognizer, and the image is rotated beforehand based on its prediction. I believe it will work, as the character orientation doesn't matter for a model trained from scratch.
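For a dataset that mixes vertical and horizontal crops, the same idea can be approximated without a learned classifier. The sketch below is not PaddleOCR's orientation classifier; it is a crude aspect-ratio heuristic, and the function `to_horizontal` and the threshold of 1.5 are assumptions made for illustration.

```python
import numpy as np


def to_horizontal(img, min_aspect=1.5):
    """Rotate tall crops so they resemble horizontal text lines.

    Returns the (possibly rotated) image and a flag saying whether a
    rotation was applied. Applying the same function at training and
    inference time keeps both inputs consistent.
    """
    h, w = img.shape[:2]
    # Treat a crop as vertical when it is clearly taller than wide.
    if h >= min_aspect * w:
        return np.ascontiguousarray(np.rot90(img, k=1)), True
    return img, False


tall = np.zeros((200, 32), dtype=np.uint8)
wide = np.zeros((32, 200), dtype=np.uint8)
print(to_horizontal(tall)[0].shape, to_horizontal(tall)[1])  # (32, 200) True
print(to_horizontal(wide)[0].shape, to_horizontal(wide)[1])  # (32, 200) False
```

A heuristic like this cannot tell a single-character crop or a short vertical word from horizontal text, which is one reason a trained classifier may be preferable.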
