[Feature Request]: Add watchdog / auto kill functionality to "Execute a Process" actions. #4748
Comments
I would perhaps make this a general option in the Workflow and Pipeline run configurations.
If we want to kill pipelines/workflows based on a timeout or other parameters (available disk space, etc.), we could also look at restart options (number of retries, wait time between retries).
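To make the suggested semantics concrete, here is a rough Python sketch (not Hop code; the command, timeout, and retry values are made up) of what a combined timeout-and-retry option could boil down to: kill the external process once the timeout elapses, then retry a configurable number of times with a wait between attempts.

```python
import subprocess
import time

def run_with_watchdog(cmd, timeout_s=300, retries=3, wait_s=30):
    """Run cmd, killing it after timeout_s; retry up to `retries` times."""
    for attempt in range(1, retries + 1):
        try:
            # subprocess.run kills the child and raises TimeoutExpired when
            # the timeout elapses; check=True raises on a non-zero exit code.
            subprocess.run(cmd, timeout=timeout_s, check=True)
            return True
        except (subprocess.TimeoutExpired, subprocess.CalledProcessError) as err:
            print(f"attempt {attempt} failed: {err}")
            if attempt < retries:
                time.sleep(wait_s)
    return False

# Illustrative usage: a command that would otherwise run far too long.
# run_with_watchdog(["sleep", "600"], timeout_s=10, retries=2, wait_s=5)
```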
Hi @bamaer. These are great ideas for sure. The Apache Airflow "Retry" task can do these things, and many enterprise schedulers as well. What you usually can't do, however, is have the Hop process itself killed if it takes too long to complete. That is what this issue tries to solve.
Not sure if I intended to have Hop kill itself after some timeout; the point was that I may build a bash command line that runs a Docker container with a Python script inside. If anything goes wrong or the process that Hop starts hangs, there is no recovery other than killing all of Hop. I wish I could kill off that process from within the pipeline or workflow, whichever causes the sidecar execution.
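As a stopgap for that scenario, one approach (a sketch, assuming the Docker CLI is available; the container name, image, and timeout are placeholders) is to give the container a known name so the watchdog can force-remove it, since killing only the `docker run` client process can leave the container itself running:

```python
import subprocess

CONTAINER = "hop-sidecar-job"  # hypothetical name so the watchdog can target the container
cmd = ["docker", "run", "--rm", "--name", CONTAINER,
       "python:3.12", "python", "-c", "import time; time.sleep(3600)"]

proc = subprocess.Popen(cmd)
try:
    proc.wait(timeout=600)  # watchdog: give the container 10 minutes
except subprocess.TimeoutExpired:
    # Remove the container itself, then kill the docker client process.
    subprocess.run(["docker", "rm", "-f", CONTAINER], check=False)
    proc.kill()
    proc.wait()
```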
I agree on Airflow; we are exploring that. I am interested in how SLAs are declared and to what effect. From what I understand they are only an event marker: Airflow does nothing about a violation other than noting that such and such an SLA was missed, and it doesn't seem to kill the job or trigger other logic. I am still exploring, so I might come back with a different understanding; please teach me if you know differently. It is very appealing to have a single pane of glass for scheduling. We have a mixture of things executing: serverless functions, scripts that put elements on an SQS queue to trigger processing, and then Hop for ingestion of landed files once the scaled-up serverless data-gathering tasks have finished downloading them to where Hop can see them.
Thanks for the consideration on this request. I hope the extra context
helps.
On Wed, Jan 8, 2025 at 10:41 AM Matt Casters wrote:
Doing a re-start of pipelines and workflows, even remembering until where a workflow executed successfully and restarting from that point, is something that can be done. I wrote something similar for another tool many years ago. Perhaps it deserves its own issue.
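Purely to illustrate the restart idea (this is not existing Hop functionality; the names and state file are invented), a workflow runner could persist which actions have already succeeded and skip them on the next run:

```python
import json
import pathlib

STATE = pathlib.Path("workflow_state.json")  # hypothetical checkpoint file

def run_workflow(actions):
    """actions: list of (name, callable). Resume after the last success."""
    done = set(json.loads(STATE.read_text())) if STATE.exists() else set()
    for name, action in actions:
        if name in done:
            continue                 # already succeeded on an earlier run
        action()                     # an exception aborts the run here
        done.add(name)
        STATE.write_text(json.dumps(sorted(done)))
    STATE.unlink()                   # full success: start fresh next time
```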
Hey, this would be really appreciated! I was about to open a feature request on this exact topic. Being able to stop a workflow or a pipeline after a given time would be so handy!
What would you like to happen?
Many times, when functionality is limited within Hop, it is useful to execute an external process that carries out the tasks and then returns. Problems arise when that process hangs or never finishes. I would like to request some kind of traditional watchdog timer / parameterization that would automatically kill the process if it takes too long or never completes, so my pipelines and workflows can eventually finish.
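Conceptually, the requested option could boil down to something like the following sketch (not Hop's implementation; the timeout and error handling are illustrative): start the external process, wait at most a configured maximum runtime, and forcibly terminate it so the pipeline or workflow fails cleanly instead of hanging forever.

```python
import subprocess

def execute_process(cmd, max_runtime_s):
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=max_runtime_s)   # normal completion
    except subprocess.TimeoutExpired:
        proc.kill()                               # the watchdog fires
        proc.wait()
        raise RuntimeError(f"{cmd!r} killed after {max_runtime_s}s")
```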
Issue Priority
Priority: 3
Issue Component
Component: Transforms