Behaviour of ScaledJob minReplicaCount field
#4885
Unanswered
LewisJackson1 asked this question in General
Replies: 2 comments 6 replies
-
@tomkerkhove @zroubalik, please share your thoughts.
5 replies
-
I stumbled onto this thread while trying to wrap my head around using KEDA, and I feel like I'm now just more confused :-) If you're using a ScaledJob, the intent is for the Job to pull a message, process it, and terminate. If you start up the so-called warm instances, won't they just hit the timeout more often than not before a message lands on the queue, and terminate?
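(For what it's worth, the timeout I have in mind would come from the Job spec itself rather than from KEDA, something like the fragment below; the names and values are made up for illustration.)

```yaml
# Hypothetical fragment of a ScaledJob's jobTargetRef; names and values are placeholders.
jobTargetRef:
  activeDeadlineSeconds: 600   # Kubernetes terminates the Job if it runs longer than this
  backoffLimit: 2              # retries before the Job is marked as failed
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: example.com/queue-worker:latest   # placeholder image
          # the worker is expected to pull one message, process it, and exit
```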
1 reply
-
We've had a bit of discussion in the comments of issue #4554 (comment) concerning the behaviour of the minReplicaCount field of the ScaledJob object. I was asked to open a discussion here to get more opinions on the matter. Just to recap: as currently defined, minReplicaCount Jobs are kept running at all times, on top of any Jobs created in response to pending events.

In my use case, we are considering migrating a ScaledObject to a ScaledJob, and I expected the minReplicaCount of a ScaledJob to match the behaviour of a ScaledObject, i.e. to act as a floor on the total number of Jobs rather than an addition to it. The consequence of scaling out too many Jobs in our case is that we incur minimum charges for GPU nodes. We would like to keep two Jobs warm without the overprovisioning behaviour that is currently intended. I can understand that this behaviour may be useful for some users, but I'm not sure it's common enough to be the default.
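To make the difference concrete, here is a minimal ScaledJob sketch showing where minReplicaCount sits; the trigger, queue name, image, and numbers below are placeholders, not our real setup.

```yaml
# Hedged sketch of a ScaledJob; the trigger, queue name, and image are illustrative only.
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: gpu-worker
spec:
  minReplicaCount: 2           # current behaviour: two "warm" Jobs in addition to event-driven Jobs
  maxReplicaCount: 10
  pollingInterval: 30
  jobTargetRef:
    template:
      spec:
        restartPolicy: Never
        containers:
          - name: worker
            image: example.com/gpu-worker:latest   # placeholder
  triggers:
    - type: rabbitmq           # placeholder trigger for illustration
      metadata:
        queueName: inference-requests
        mode: QueueLength
        value: "1"
        hostFromEnv: RABBITMQ_HOST
```

As I understand the current semantics, a queue depth of three with this spec would give roughly 2 + 3 = 5 Jobs, whereas the ScaledObject-like behaviour I expected would give max(3, 2) = 3.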
The discussion in the issue above is worth reading, and all opinions are welcome! I can think of a couple of ways to work around this behaviour for our use case, but it would have been great to have it work out of the box.