Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(server): deprioritize METADATA_EXTRACTION to let LINK_LIVE_PHOTOS complete first #12418

Closed
wants to merge 1 commit into from

Conversation

mPyKen
Copy link
Contributor

@mPyKen mPyKen commented Sep 7, 2024

Least intrusive change to partially fix #12306.

When stopping and clearing the METADATA_EXTRACTION queue midway, METADATA_EXTRACTION queue must be forcefully filled and run again to LINK_LIVE_PHOTOS as LINK_LIVE_PHOTOS jobs are processed after all the METADATA_EXTRACTION jobs. Hence, LINK_LIVE_PHOTOS jobs should be prioritized over METADATA_EXTRACTION.

This also fixes UX such that the queue size actually decreases on the job status page evenly over time.

@mertalev
Copy link
Contributor

Hi! Thank you for the PR, but we can't merge this as it is. Metadata extraction is required for most other jobs, so deprioritizing it is not a good idea.

The situation you're describing can be solved in other ways instead, such as by tracking live photo linking separately from metadata extraction in the job status table.

@jrasm91
Copy link
Contributor

jrasm91 commented Sep 16, 2024

Closing per merts comment.

@jrasm91 jrasm91 closed this Sep 16, 2024
@mPyKen
Copy link
Contributor Author

mPyKen commented Sep 17, 2024

@mertalev My understanding is that priority affects only jobs in the same queue, not across queues. So in this case, deprioritizing metadata extraction only means prioritizing JobName.QUEUE_METADATA_EXTRACTION and JobName.LINK_LIVE_PHOTOS over JobName.METADATA_EXTRACTION as these 3 are the only ones appearing in QueueName.METADATA_EXTRACTION at the moment. I don't see why this would be a problem.

If you say you want to fix the original issue differently, sure. This was supposed to be a quick fix so that others do not run into the same issue as I did.

@mertalev
Copy link
Contributor

The issue is that the jobs in the other queues can't run until metadata extraction is finished, so lowering its priority has a cascade effect on the job system.

@mPyKen
Copy link
Contributor Author

mPyKen commented Sep 17, 2024

For JobName.QUEUE_METADATA_EXTRACTION Q, JobName.LINK_LIVE_PHOTOS L and JobName.METADATA_EXTRACTION M, we currently complete jobs of METADATA_EXTRACTION queue in this order:
QMMMMMLLLLL
whereas this PR changes this order to
QMLMLMLMLML
The first jobs in other queues will still start after the first M job. As LINK_LIVE_PHOTOS is just a couple of database queries, I thought it wouldn't affect the job system that much. Is that not the case?

@mertalev
Copy link
Contributor

I've seen cases where metadata extraction was the bottleneck and the other queues were waiting on it. I just think a different approach would accomplish the same result without that tradeoff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LINK_LIVE_PHOTOS jobs lost when clearing queue after METADATA_EXTRACTION job completes
3 participants