Detecting UPC as product picture when >50% of an image is the UPC to reduce low quality main images #1056
Comments
@ReedAndreas has been developing a module that looks very promising for detecting UPCs. However, we still have to decide what to do with such detections (assuming they are accurate). I would be in favor of keeping the image, but unselecting it everywhere (front, nutrition, packaging, ingredients). What's your opinion on this @teolemon @alexgarel?
That's my opinion as well. I've spotted some contributors explicitly selecting barcode photos as a fallback when no other photo is available.
I agree. The intent was to remove the UPC as a featured image, not to suggest it be deleted. The UPC image does have value, but not as the main photo or in other fields with a specific focus.
Does anyone have good suggestions for how to test this flow? I cloned robotoff and openfoodfacts-server (I'm dealing with a few dependency issues, but I should be able to get them up and running). In addition, I am following the guide on how to add a new predictor. Is there a way I can simulate a new image upload or test individual methods? Any help is greatly appreciated! Also, if you want to take a look at my additions, the branch related to this issue is UPC-Image-Predictor.
@ReedAndreas to test the full pipeline, the easiest way is to send a webhook request to the Robotoff API: edit: I've added the
@ReedAndreas I forgot that local testing has always been cumbersome, as Robotoff performs many checks against MongoDB during image import and during insight/prediction generation and import (such as: does the product exist, does the product have the image linked to the insight, ...).
Hey, thanks so much for the help and the added testing ability! This really helped me make good progress. Today I was able to test the whole flow, and using the logger I saw that the predictor works as expected on the different product images I tried! Now I just need to take that result and create the proper prediction to return. Any thoughts on how I should structure the prediction? The predictor itself generates two data points: whether or not the image is a "UPC_Image", and the predicted class, which is one of "UPC_Image", "Small_UPC" (there is a UPC, but this may still be a good main image), or "No_UPC". The last one was mainly for my own purposes during testing, since we really only care about UPC versus everything else, but it might still be helpful to store the predicted class. I think I have a decent idea of what could work based on the other predictors, but I was not sure if you had any tips or suggestions.
To avoid overloading the prediction table with millions of additional datapoints (we have 7M images in production), I would suggest only creating a prediction when we're quite sure the image is a UPC.
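As a rough illustration of that filtering idea, here is a minimal sketch. It is not Robotoff's actual prediction API: the `UpcPrediction` dataclass, the `build_prediction` helper, the confidence score and the 0.9 cutoff are all hypothetical names and values chosen for the example. Only the class labels ("UPC_Image", "Small_UPC", "No_UPC") come from the discussion above.

```python
from dataclasses import dataclass
from typing import Optional

# Class label taken from the thread; everything else below is hypothetical.
UPC_CLASS = "UPC_Image"
# Assumed cutoff for "quite sure"; the thread does not specify a value.
MIN_CONFIDENCE = 0.9


@dataclass
class UpcPrediction:
    """Hypothetical container for a stored prediction (not Robotoff's model)."""
    image_id: str
    predicted_class: str
    confidence: float


def build_prediction(
    image_id: str,
    predicted_class: str,
    confidence: float,
    min_confidence: float = MIN_CONFIDENCE,
) -> Optional[UpcPrediction]:
    """Return a prediction only for confident UPC images, otherwise None.

    Skipping "Small_UPC", "No_UPC" and low-confidence results keeps the
    number of stored rows small.
    """
    if predicted_class != UPC_CLASS or confidence < min_confidence:
        return None
    return UpcPrediction(image_id, predicted_class, confidence)
```

With this shape, the caller simply drops every `None` result, so only images that are confidently classified as a UPC would ever reach the prediction table.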
Closing this issue, as #1098 has been merged.
Problem
I often come across product listings where the UPC is the main image of the product, with more than 50% of the main photo being just UPC barcode lines. An example: https://world.openfoodfacts.org/images/products/003/800/023/1513/2.100.jpg (UPC removed from affected product page)
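To make the "more than 50%" criterion concrete, here is a minimal sketch of how it could be checked, assuming some detector already provides a barcode bounding box in pixel coordinates. The function names and the `(x_min, y_min, x_max, y_max)` box format are assumptions for the example, not part of any existing module.

```python
from typing import Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def barcode_area_fraction(image_width: int, image_height: int, box: Box) -> float:
    """Return the fraction of the image area covered by the barcode box."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    x_min, y_min, x_max, y_max = box
    # Clamp the box to the image so out-of-bounds detections are not overcounted.
    x_min, x_max = max(0.0, x_min), min(float(image_width), x_max)
    y_min, y_max = max(0.0, y_min), min(float(image_height), y_max)
    box_area = max(0.0, x_max - x_min) * max(0.0, y_max - y_min)
    return box_area / (image_width * image_height)


def is_mostly_barcode(
    image_width: int, image_height: int, box: Box, threshold: float = 0.5
) -> bool:
    """True when the barcode covers more than `threshold` of the image."""
    return barcode_area_fraction(image_width, image_height, box) > threshold
```

For a 100 x 100 image, a box spanning the full width and 60 rows covers 0.6 of the area and would be flagged, while a 20 x 20 barcode in a corner covers only 0.04 and would not.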
Proposed solution
Additional context
In hunger, the behavior would be the reverse of the norm: instead of adding a tag, it removes a photo.
Mockups
Part of