Retry failed jobs #21
Comments
Doesn't Resque do this though? Or do a rescue in Go?
Hi, is this possible with the current version? Can anyone suggest a workaround?
@rohit4813 The current implementation listens for a few signals and, when it receives one, stops enqueuing new jobs but lets the running jobs finish. I'm not sure I understand exactly what you're trying to do, but here's our use: we have a scenario where each job is potentially pretty long but has a natural stopping point. For this, we create a channel in the main function that gets written to when a signal is received (basically the same code that is already in goworker). Then we create another channel, this time buffered (capacity = number of workers), and pass that channel to each worker. Workers then The whole flow looks like:
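A minimal sketch of the flow described above, not the original snippet. It assumes each job can be split into discrete steps. The names `processSteps`, `fanOut` and `numWorkers` are hypothetical and not part of goworker. Each worker does a non-blocking read on the buffered channel between steps and returns early if a stop token is waiting.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

const numWorkers = 3 // hypothetical worker count

// processSteps runs a long job made of discrete steps. Before each step it
// does a non-blocking read on stop; if a token is waiting, it returns early
// and reports how many steps were completed.
func processSteps(stop <-chan struct{}, steps int, step func(i int)) int {
	for i := 0; i < steps; i++ {
		select {
		case <-stop:
			return i // natural stopping point: skip the remaining steps
		default:
		}
		step(i)
	}
	return steps
}

// fanOut waits for one signal on quit, then writes one stop token per worker
// into the buffered channel so that every worker can pick one up.
func fanOut(quit <-chan os.Signal, stop chan<- struct{}, workers int) {
	<-quit
	for i := 0; i < workers; i++ {
		stop <- struct{}{}
	}
}

func main() {
	// Channel that gets written to when the process receives a signal.
	quit := make(chan os.Signal, 1)
	signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)

	// Buffered channel with capacity = number of workers.
	stop := make(chan struct{}, numWorkers)
	go fanOut(quit, stop, numWorkers)

	var wg sync.WaitGroup
	for id := 0; id < numWorkers; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			done := processSteps(stop, 10, func(int) { time.Sleep(time.Second) })
			fmt.Printf("worker %d finished %d of 10 steps\n", id, done)
		}(id)
	}
	wg.Wait()
}
```

Running this and pressing Ctrl-C makes each worker finish its current step and then return. In a real setup the job would also have to record how far it got so that it can be resumed later.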
@mingan Thanks for the great explanation. If I understand correctly, all the workers read from the workers channel (which gets populated from the signals channel)? My use case is different: I want to stop one or more specific workers, say ones that have been running for a very long time, but with this approach all the workers stop once the signal is passed to the channel. I can identify the worker whose job has been running for a very long time. How can I send the signal to that particular worker? I hope this is not confusing, or am I missing something?
@rohit4813 Our use case just creates breakpoints in long-running jobs so that when we need to restart the process, we don't have to wait (tens of) minutes for the whole job to finish. If you needed to discriminate between workers, I guess you could do that by sending some meaningful value through the channel and then the worker would decide "this msg is meant for me, I'll stop" or "this is meant for the slow one over there, I can keep running". Though, I can't imagine the use case for such behaviour.
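One caveat with a single shared channel: each value is delivered to only one receiver, so a message addressed to the slow worker may be consumed by a different one. A minimal sketch of an alternative is below. It swaps the shared channel for one stop channel per worker, keyed by a worker ID. The `stopRegistry` type and its methods are hypothetical names for illustration, not goworker API.

```go
package main

import (
	"fmt"
	"sync"
)

// stopRegistry maps a worker ID to that worker's private stop channel.
type stopRegistry struct {
	mu    sync.Mutex
	chans map[int]chan struct{}
}

func newStopRegistry() *stopRegistry {
	return &stopRegistry{chans: make(map[int]chan struct{})}
}

// register creates the stop channel for worker id and returns its read side.
func (r *stopRegistry) register(id int) <-chan struct{} {
	r.mu.Lock()
	defer r.mu.Unlock()
	ch := make(chan struct{})
	r.chans[id] = ch
	return ch
}

// stop closes the channel of worker id. It returns false if the ID is
// unknown or the worker has already been told to stop.
func (r *stopRegistry) stop(id int) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	ch, ok := r.chans[id]
	if !ok {
		return false
	}
	close(ch)
	delete(r.chans, id)
	return true
}

// stopped reports, without blocking, whether ch has been closed.
// A worker would call this at each breakpoint in its job.
func stopped(ch <-chan struct{}) bool {
	select {
	case <-ch:
		return true
	default:
		return false
	}
}

func main() {
	reg := newStopRegistry()
	slow := reg.register(1)
	fast := reg.register(2)

	reg.stop(1) // only the slow worker is asked to stop

	fmt.Println(stopped(slow), stopped(fast)) // prints "true false"
}
```

Closing a channel rather than sending on it means the stop request is never lost, even if the worker checks more than once.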
It would be nice to have features like the ones Sidekiq provides (https://github.com/mperham/sidekiq/wiki/Error-Handling), especially retrying failed jobs.
Something like:
"If you don't fix the bug within 25 retries (about 21 days), Sidekiq will stop retrying and move your job to the Dead Job Queue. You can fix the bug and retry the job manually anytime within the next 6 months using the Web UI."
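To make the request concrete, here is a rough sketch of what such a retry wrapper could look like. It is an illustration under assumptions, not an existing goworker feature. The `workerFunc` type mirrors the shape of a goworker job function as I understand it. The delay of `count^4 + 15` seconds is modeled on Sidekiq's documented backoff with its random jitter left out. `runWithRetry`, `totalDelay`, `noSleep` and `errTemporary` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const maxRetries = 25 // same retry limit as in the Sidekiq quote

// errTemporary is a placeholder error used by the demo in main.
var errTemporary = errors.New("temporary failure")

// workerFunc mirrors the shape of a goworker job function (assumption).
type workerFunc func(queue string, args ...interface{}) error

// backoff returns the delay before retry number count (0-based):
// count^4 + 15 seconds, with no jitter.
func backoff(count int) time.Duration {
	c := int64(count)
	return time.Duration(c*c*c*c+15) * time.Second
}

// totalDelay sums the delays of the first n retries.
func totalDelay(n int) time.Duration {
	var sum time.Duration
	for i := 0; i < n; i++ {
		sum += backoff(i)
	}
	return sum
}

// noSleep is a stand-in for time.Sleep that returns immediately.
func noSleep(time.Duration) {}

// runWithRetry calls fn until it succeeds or maxRetries retries are used up.
// It returns the number of attempts made and the last error. A non-nil error
// means the job is out of retries and should go to a dead job queue.
func runWithRetry(fn workerFunc, sleep func(time.Duration), queue string, args ...interface{}) (int, error) {
	var err error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if err = fn(queue, args...); err == nil {
			return attempt + 1, nil
		}
		if attempt < maxRetries {
			sleep(backoff(attempt))
		}
	}
	return maxRetries + 1, err
}

func main() {
	calls := 0
	flaky := func(queue string, args ...interface{}) error {
		calls++
		if calls < 3 {
			return errTemporary // fail the first two attempts
		}
		return nil
	}
	attempts, err := runWithRetry(flaky, noSleep, "default", "some-arg")
	fmt.Println("attempts:", attempts, "error:", err)
	fmt.Printf("25 retries span about %.1f days\n", totalDelay(maxRetries).Hours()/24)
}
```

Without jitter, the 25 delays add up to roughly 20.4 days, which is in line with the "about 21 days" in the quote. Sleeping in-process for days is not practical, so a real implementation would more likely re-enqueue the job with a scheduled run time and keep the retry count in the job payload.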