How to run parallel pipelines #2097
Hello o/ I'm having a little trouble configuring the pipeline with parallel. I have 4 custom processors: "employee_mapper", "employee_reporter", "info_mapper", "info_reporter". The processor "employee_reporter" needs to run after "employee_mapper" (we pass the result to the reporter), and the same goes for the "info" pair, but the "employee" group can run in parallel with the "info" group. I tried something like this:
Unfortunately I get this error:
How can I configure the pipeline?
Replies: 1 comment 1 reply
Hey @eduardo-fp-romao 👋 I think you're confusing a few things here. The `pipeline` is a top-level config section which lets you run the sequence of processors under `pipeline.processors` in parallel against batches of messages emitted by your input(s), as dictated by `pipeline.threads`. If you don't configure any batching, then individual messages are considered batches of size 1. You can, additionally, use a `parallel` processor in your `pipeline.processors`, which can have one or more child processors under the `processors` field, and it also has a `cap` field which lets you control how many individual messages from a batch it should process in parallel.

In your case, if the `employee_reporter` processor …
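
For reference, the fields described above could be laid out roughly as in the sketch below. This is only an illustration of where `pipeline.threads`, `pipeline.processors`, and the `parallel` processor's `cap` and `processors` fields sit. It is not the configuration recommended in the reply (which is cut off above), and it does not by itself run the "employee" and "info" groups side by side. The numeric values and the empty `{}` configs for the custom processors are assumptions.

```yaml
pipeline:
  # Number of batches processed concurrently by the processor list below.
  # The value 4 is an arbitrary example.
  threads: 4
  processors:
    - parallel:
        # Maximum number of messages from one batch processed at once.
        # The value 2 is an arbitrary example.
        cap: 2
        # Child processors run in order for each individual message,
        # so the reporter always sees the mapper's result.
        processors:
          - employee_mapper: {}   # custom plugin; empty config is assumed
          - employee_reporter: {} # custom plugin; empty config is assumed
```

With no batching configured on the input, each message is a batch of size 1, so in this layout `cap` has little effect and the concurrency comes from `threads`.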