
Automatic retry of triggered pipelines #809

Closed
e-gineer opened this issue Mar 18, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@e-gineer
Contributor

It can be difficult to develop and debug the query trigger. Specifically, results are captured and saved before the pipeline has actually run and succeeded. This often leads to a situation where problems in the pipeline cannot be easily resolved - you have to wait for the next event to come along.

As of March 2024, we should implement CLI changes and commands to help with this.

In the future it might be good to allow automatic retry of triggers if the associated pipeline fails. The rest of this document aims to describe research into that behavior.

How should we handle failures?

  • Insert
    • Optimistically accept the insert immediately, adding it to the list of keys (like we do now).
    • It will no longer be captured by future queries.
    • But, if the pipeline fails, have an option to remove it from the keys (so it will be picked up next time automatically).
  • Delete
    • Optimistically accept the delete immediately, removing it from the list of keys (like we do now).
    • It will no longer be captured by future queries.
    • But, if the pipeline fails, have an option to add it back to the keys (so it will be picked up next time automatically).
  • Update
    • Updates are harder: there may be multiple, and they are difficult to tell apart.
    • Optimistically accept the update immediately, updating the hash context (like we do now).
    • Further updates will create new events, but the same content will not be treated as an update.
    • But, if the pipeline fails, we should reset the update - but only if it was the most recent update. If there were other updates since then, just ignore the error.
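
As a rough sketch of the rollback rules above - all names here (`QueryTriggerState`, `keys`, `hashes`) are illustrative, not Flowpipe's actual internals:

```python
class QueryTriggerState:
    """Hypothetical capture state for a query trigger: the set of
    primary keys seen so far, plus a per-key row hash."""

    def __init__(self):
        self.keys = set()   # primary keys already captured
        self.hashes = {}    # key -> last captured row hash

    # Insert: optimistically add the key; roll back on pipeline failure.
    def capture_insert(self, key, row_hash):
        self.keys.add(key)
        self.hashes[key] = row_hash

    def rollback_insert(self, key):
        # Remove the key so the next query re-captures the row.
        self.keys.discard(key)
        self.hashes.pop(key, None)

    # Delete: optimistically remove the key; re-add on failure.
    def capture_delete(self, key):
        self.keys.discard(key)
        return self.hashes.pop(key, None)

    def rollback_delete(self, key, row_hash):
        self.keys.add(key)
        self.hashes[key] = row_hash

    # Update: optimistically store the new hash, returning the old one
    # so the caller can attempt a rollback later.
    def capture_update(self, key, new_hash):
        old_hash = self.hashes.get(key)
        self.hashes[key] = new_hash
        return old_hash

    def rollback_update(self, key, old_hash, failed_hash):
        # Only reset if our update is still the most recent one;
        # if a later update replaced the hash, ignore the error.
        if self.hashes.get(key) == failed_hash:
            self.hashes[key] = old_hash
```

Note how `rollback_update` implements the "only if it was the most recent update" rule: a later update changes the stored hash, which makes the stale rollback a no-op.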

Proposed syntax for the above would be to support retry.max_attempts:

# Support retry with max_attempts?
trigger "query" "my_qt" {
  pipeline = pipeline.my_pipeline
  args = {
    event = self.request_body
  }
  retry {
    max_attempts = 5
  }
}

Questions / related comments:

  • Should each update be retried? This seems like a lot and the update case seems less important in general.
  • Should we add version number / created_at / updated_at metadata to the capture? This might allow more custom handling or more complex update cases.
  • Capture data would need to include a retry count for each key and situation to track attempts before failing.
  • Perhaps query handling should be FIFO on a per primary key basis? e.g. should we ensure the insert completes before the update or delete runs? should we stop multiple updates from running in parallel?
  • Should the block be named retry_pipeline rather than retry?
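
To illustrate the retry-count question above, the capture data could carry a per-key, per-operation attempt counter honoring `max_attempts`. This is a hypothetical sketch - `RetryTracker` and its methods are invented for illustration, not part of the actual capture format:

```python
from collections import defaultdict

class RetryTracker:
    """Illustrative per-key retry accounting for a trigger's captured events."""

    def __init__(self, max_attempts=5):
        self.max_attempts = max_attempts
        # (key, operation) -> number of failed pipeline attempts so far
        self.attempts = defaultdict(int)

    def record_failure(self, key, operation):
        """Count a failed attempt. Returns True if the event should be
        retried, False once max_attempts has been exhausted."""
        self.attempts[(key, operation)] += 1
        return self.attempts[(key, operation)] < self.max_attempts

    def record_success(self, key, operation):
        # Clear the counter so future failures start fresh.
        self.attempts.pop((key, operation), None)
```

Keying the counter on `(key, operation)` also hints at the FIFO question: if inserts, updates, and deletes for one primary key are tracked separately, the scheduler would still need ordering between them.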
@e-gineer e-gineer added the enhancement New feature or request label Mar 18, 2024