Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent state when cropping a hi-res image failed because of timeout #3816

Open
ornel-lloyd-edano opened this issue Jul 18, 2022 · 2 comments

Comments

@ornel-lloyd-edano
Copy link
Contributor

ornel-lloyd-edano commented Jul 18, 2022

Short description

When receiving status 500 (presumably because of timeout) during cropping a hi-res image, the metadata of the image in S3 and ES (Elasticsearch) can end up in an inconsistent state. This inconsistency leads to:

  • Failure to download the cropped image even if its thumbnail appear on the left side panel of cropped images (happens if there is inconsistency between S3 and ES).
  • Failure to display the cropped images as thumbnails on the left side panel (happens if there is inconsistency within S3)

Steps to reproduce

This can be quite tricky to reproduce. Cropping a hi resolution image often lead to status code 502, in which case no inconsistency in S3 and ES is observed. Perhaps we can simulate a network partition by intentionally throwing an unhandled exception somewhere during the call to POST /crop

Actual results

Assets did not have the expected widths for 1000.jpg and 10379.jpg (there are only 3 out of 5 in assets)
Screen Shot 2022-07-18 at 12 32 50 PM

Exports did not have crop id 266_0_10379_5843
Screen Shot 2022-07-18 at 12 37 00 PM

Expected results

There should be 5 out of 5 images in assets
Screen Shot 2022-07-18 at 12 56 33 PM

As there were 2 items in cropper data, there should also be 2 items in image exports
Screen Shot 2022-07-18 at 1 03 09 PM

OS and browser details

MacOS Monterey 12.4, Chrome 103

@ornel-lloyd-edano
Copy link
Contributor Author

ornel-lloyd-edano commented Jul 18, 2022

Proposed solutions

  1. Place a limit on the dimensions of the image that can be cropped
  2. No cropping dimension limits but make the call to image cropping asynchronous
  3. Keep synchronous call but do not display the cropped image thumbnails on the left side panel if the corresponding crop id from S3 is not found in image exports of ES
  4. Keep synchronous call and keep displaying the cropped image thumbnails on the left side panel even if the corresponding crop id is not found in image exports of ES. But during download of the cropped image, do not check ES for the exports entry, just download cropped image directly from S3

@andrew-nowak
Copy link
Member

andrew-nowak commented Jul 26, 2022

There's 2 problems here: the inconsistency, and the fact that crops fail.

inconsistency

Broadly, this is down to the way that crops in the UI are driven by the cropper service, which draws its data from S3 rather than elasticsearch. As far as I'm aware this is unique - every other piece of data shown in grid comes from elasticsearch via media-api. I think then that the solution to the inconsistency is to draw crops data from the media-api response(s); as far as I can tell all of the required data is included, the only functionality difference is that the cropper controller adds the download links - these could (should) also be added to the media-api response.

crops fail

This is due to a limit on how much resource can be used. Cropping can be expensive depending on how big the image is, and it is also possible for users to request lots of expensive crops simultaneously - in fact we provide this functionality by allowing users to select multiple images and create fullframe crops of all.


Regarding the proposed solutions:
adding a limit on dimensions will be extremely unpopular with users (although worth noting that whatever other option we take, a de facto limit will exist; hopefully so large that users never encounter it).

Making the cropping call asynchronous will be the real solution - I have ambitions in this area but I think it will require architectural changes so unfortunately is not imminent.

For now the best I can suggest is to fix the inconsistency by using the already-fetched image data to fill the crops panel and generating the download urls in the media-api response, and then to review the scaling of cropper instances (instance size and quantity) and the resources made available to imagemagick on each instance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants