[BAHIR-295] Added backpressure & ratelimit support #101
068233b to 526a8b7
@iammehrabalam Thanks for your contribution.
@iammehrabalam What is the backward compatibility story for this change? Also, should the sample be using the new capabilities to demonstrate the new functionality?
The behaviour is exactly the same as before if the Spark Streaming config below is not set, so the change is backward compatible. I added a test case that demonstrates the rate and batch size.
@lresende @eskabetxe reminder
The backpressure implementation isn't working as expected. My understanding is that the backpressure mechanism will control the input rate but never exceed the Context - I created a Spark Scala app with 900 receivers,
Based on https://spark.apache.org/docs/latest/streaming-custom-receivers.html#receiver-reliability, the rate control mechanism will have to be implemented by the receiver (if reliable). I do not see any logic that caps the input rates to the
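To make the "cap" in the two comments above concrete, here is a minimal sketch of clamping a backpressure-proposed rate to a configured ceiling. The class name, method name, and the `maxRate` parameter are hypothetical and chosen only for illustration. This is not code from Spark or from this PR, and treating a non-positive `maxRate` as "no cap configured" is an assumption of the sketch.

```java
// Hypothetical helper for illustration only; not Spark or PR code.
final class RateCapSketch {

    // Returns the rate a receiver would actually use.
    // Assumption: maxRate <= 0 means no ceiling has been configured.
    static long effectiveRate(long proposedRate, long maxRate) {
        if (maxRate > 0) {
            // Never exceed the configured ceiling, whatever backpressure proposes.
            return Math.min(proposedRate, maxRate);
        }
        return proposedRate;
    }
}
```

With a clamp like this, a proposed rate of 500 records/s and a ceiling of 100 yields 100, while a proposed rate of 50 passes through unchanged.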
Just stumbled upon this PR. For anyone interested, my guess is that the correct implementation should use Spark Streaming's BlockGenerator class. It would give the whole process |
@LeonardMeyer You are right, but the rate limit is only applied when a single record is written to the store (https://github.com/apache/spark/blob/595ad30e6259f7e4e4252dfee7704b73fd4760f7/streaming/src/main/scala/org/apache/spark/streaming/receiver/Receiver.scala#L118). When an iterator (i.e. a block) is written directly, the rate limit is not applied by default. The Pubsub receiver calls the iterator store method, so we added the rate limit there (i.e. the same rate limit that is generated based on backpressure).
@datasherlock Ideally it should work. If possible, share your Spark configuration so I can help.
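The distinction drawn above, where the single-record path is throttled and the iterator path is not unless the receiver does it itself, can be sketched as follows. All names here (`IteratorStoreSketch`, `CountingLimiter`, `SketchReceiver`) are hypothetical stand-ins rather than Spark or PR classes. The limiter only counts permits instead of blocking so that the example stays deterministic, and acquiring one permit per record is an assumption about how explicit throttling might look.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: none of these names come from Spark or from this PR.
final class IteratorStoreSketch {

    // Stand-in for a rate limiter. It only counts permits; a real limiter
    // would block the caller until the requested permits become available.
    static final class CountingLimiter {
        private long count = 0;

        void acquire(int permits) {
            count += permits;
        }

        long acquired() {
            return count;
        }
    }

    static final class SketchReceiver {
        final CountingLimiter limiter = new CountingLimiter();
        final List<String> stored = new ArrayList<>();

        // Single-record path: every record passes through the limiter.
        void storeSingle(String record) {
            limiter.acquire(1);
            stored.add(record);
        }

        // Block path: records are stored without consulting the limiter.
        void storeBlock(Iterator<String> block) {
            while (block.hasNext()) {
                stored.add(block.next());
            }
        }

        // Block path with explicit throttling: one permit per record.
        void storeBlockThrottled(Iterator<String> block) {
            while (block.hasNext()) {
                limiter.acquire(1);
                stored.add(block.next());
            }
        }
    }
}
```

In this sketch, calling `storeBlock` leaves the limiter's permit count untouched, whereas `storeBlockThrottled` increases it by the number of records in the block, which is the gap the comment above describes closing in the receiver.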
This change was suggested two years ago. Is there any plan to push it through?