Bug Report: Incorrect Character Box Order for Inverted Text when return_word_box is enabled #328

Open
ajaivasudeve opened this issue Jan 20, 2025 · 0 comments

Description

When PaddleOCR processes inverted text with return_word_box enabled, it recognizes the text content correctly because the angle classifier (angle cls) is enabled. However, the order of the character-level bounding boxes does not match the orientation of the text in the original image.

For instance:
When the text is inverted (it runs right-to-left in the image), PaddleOCR correctly returns the content in left-to-right reading order. The character bounding boxes, however, should follow the right-to-left order in which the characters appear in the image.
Conversely, when the text is written left-to-right as normal, the bounding boxes are ordered correctly.

This mismatch causes confusion when the bounding box coordinates are used for downstream processing or analysis.

Steps to Reproduce

Input an image containing inverted text (right-to-left visually but left-to-right in logical content).
Run PaddleOCR with the angle classifier and return_word_box enabled to extract the text content and the character-level bounding boxes.
Observe the order of the bounding boxes:
    The text content is correctly detected as left-to-right.
    However, the character bounding boxes do not match the original input's visual (right-to-left) orientation.

Expected Behavior

For inverted text:

The character-level bounding boxes should follow the original orientation of the text in the input image (right-to-left).
The logical text content can still be left-to-right, as detected.
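
Until this is resolved, the expected pairing can be approximated on the caller's side. The following is only a sketch under stated assumptions: "inverted" is taken to mean rotated by 180 degrees, the returned list holds exactly one box per character in left-to-right image order, and the caller already knows whether the line was inverted. The function `pair_chars_with_boxes` and the flag `rotated_180` are hypothetical names and are not part of PaddleOCR.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
Box = Sequence[Point]


def pair_chars_with_boxes(
    text: str, boxes: List[Box], rotated_180: bool
) -> List[Tuple[str, Box]]:
    """Pair each recognized character with the box covering it in the image.

    Assumption: `boxes` is ordered left-to-right in image coordinates and
    contains one box per character of `text`. For a line rotated by 180
    degrees, the first logical character sits at the right end of the line
    in the image, so the box list is reversed before pairing.
    """
    if len(text) != len(boxes):
        raise ValueError("expected exactly one box per character")
    ordered = list(reversed(boxes)) if rotated_180 else list(boxes)
    return list(zip(text, ordered))
```

For a two-character line that was inverted, `pair_chars_with_boxes("ab", boxes, rotated_180=True)` pairs "a" with the last box in the list and "b" with the first.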

Actual Behavior

The bounding boxes are always ordered left-to-right, regardless of the orientation of the input text, which is incorrect for inverted text.

Example Data

Input 1:

Original text is inverted (right-to-left).
Bounding box coordinates:

[[[249, 369], [263, 368], [263, 394], [249, 395]],
 [[267, 368], [283, 367], [283, 393], [267, 394]],
 ...
]

Detected content: left-to-right (correct).
Bounding box order: left-to-right (incorrect, should match the input's right-to-left orientation).

Input 2:

Normal left-to-right text.
Bounding box coordinates:

[[[259, 707], [277, 707], [278, 734], [260, 734]],
 [[277, 707], [296, 706], [297, 733], [278, 734]],
 ...
]

Detected content: left-to-right (correct).
Bounding box order: left-to-right (correct).
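
The list order in both examples can be checked directly from the coordinates. This small sketch compares the horizontal centers of the first and last boxes; `x_center` and `list_order` are hypothetical helper names, and only the two boxes shown for each input are used because the rest of the data is elided.

```python
def x_center(box):
    """Mean x coordinate of the four corner points of a box."""
    return sum(point[0] for point in box) / len(box)


def list_order(boxes):
    """Report the direction in which the list walks across the image."""
    first, last = x_center(boxes[0]), x_center(boxes[-1])
    if first < last:
        return "left-to-right"
    if first > last:
        return "right-to-left"
    return "undetermined"


# First two boxes of each input, copied from the example data above.
input_1 = [[[249, 369], [263, 368], [263, 394], [249, 395]],
           [[267, 368], [283, 367], [283, 393], [267, 394]]]
input_2 = [[[259, 707], [277, 707], [278, 734], [260, 734]],
           [[277, 707], [296, 706], [297, 733], [278, 734]]]

print(list_order(input_1))  # left-to-right
print(list_order(input_2))  # left-to-right
```

Both lists run left-to-right in image coordinates, which matches the report: the inverted input gets the same box order as the normal one.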