Inconsistent loading of images #1339

Open
matskv opened this issue Nov 27, 2024 · 0 comments

matskv commented Nov 27, 2024

The reformat_input function in easyocr/utils.py is inconsistent in two respects, depending on the type of its image argument: how it computes the greyscale image array img_cv_grey, and the channel order (RGB or BGR) of the 3-channel image array img.

  1. If image is a path-like object or a URL, img is an RGB image array and img_cv_grey is computed with cv2.imread(image, cv2.IMREAD_GRAYSCALE), so the two are consistent. Furthermore, judging by how the CRAFT detector is used, I assume the image array is supposed to be RGB-ordered.

  2. If image is of type bytes, the code reads
    nparr = np.frombuffer(image, np.uint8)
    img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img_cv_grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    Here img will be an RGB-ordered image array, exactly as above. But the last line is clearly wrong and should read
    img_cv_grey = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    since img has already been converted to an RGB-ordered image array in the line above.

  3. If image is a NumPy array, img is BGR-ordered, which differs from the cases above.

  4. For a PIL JPEG image, img is again a BGR-ordered image array.

If, as I assume, the detector expects an RGB-ordered image array, then the cases where image is a NumPy array or a PIL JPEG image are both wrong. In any case, they are inconsistent with the cases where image is a path/URL or a bytes object.
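
To illustrate why the conversion flag has to match the channel order, here is a minimal pure-Python sketch. It assumes the luma weights that OpenCV documents for RGB to grey conversion (Y = 0.299 R + 0.587 G + 0.114 B) and ignores OpenCV's integer rounding. The helper names are hypothetical and not part of EasyOCR or OpenCV.

```python
# Luma weights documented by OpenCV for RGB -> grey: Y = 0.299 R + 0.587 G + 0.114 B
W_R, W_G, W_B = 0.299, 0.587, 0.114


def grey_from_rgb(pixel):
    """Weight an (R, G, B) pixel the way COLOR_RGB2GRAY is meant to."""
    r, g, b = pixel
    return W_R * r + W_G * g + W_B * b


def grey_treating_as_bgr(pixel):
    """Weight the same tuple as if it were (B, G, R), as COLOR_BGR2GRAY would."""
    b, g, r = pixel  # the first channel is assumed to be blue
    return W_R * r + W_G * g + W_B * b


pure_red_rgb = (255, 0, 0)  # an RGB-ordered pixel
correct = grey_from_rgb(pure_red_rgb)         # red gets the red weight
swapped = grey_treating_as_bgr(pure_red_rgb)  # red gets the blue weight instead
print(correct, swapped)
```

With the wrong flag the red and blue weights are swapped, so strongly red or blue regions come out at a different grey level, while neutral grey pixels are unaffected.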

Furthermore, in case 1, img_cv_grey is computed with cv2.imread, which produces slightly different output than cv2.cvtColor(img, cv2.COLOR_RGB2GRAY). Also, since the image is read a second time in the loadImage function in easyocr/imgproc.py using skimage.io.imread, reading the image path twice seems unnecessary. It would be better to load the image once and then derive img_cv_grey from it with cvtColor, which I assume is also computationally cheaper.
