
abortScaleDownDelaySeconds is not working as expected #1841

Open
mksha opened this issue Feb 3, 2022 · 12 comments
Assignees: huikang
Labels: bug (Something isn't working), no-issue-activity, workaround (There's a workaround, might not be great, but exists)

Comments

@mksha
Contributor

mksha commented Feb 3, 2022

Summary

What happened/what you expected to happen?
When we set abortScaleDownDelaySeconds: 0, it should not delete the canary pod on abort, but it is deleting it immediately.
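
For context, a minimal sketch of the strategy section this setting lives in (resource names here are placeholders, not the exact spec from my cluster):

    apiVersion: argoproj.io/v1alpha1
    kind: Rollout
    metadata:
      name: example-rollout
    spec:
      strategy:
        canary:
          # per the docs, 0 means "do not scale the canary down at all after an abort"
          abortScaleDownDelaySeconds: 0
          canaryService: example-canary
          stableService: example-stable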

Diagnostics

What version of Argo Rollouts are you running?
1.1.1

# Paste the logs from the rollout controller

# Logs for the entire controller:
kubectl logs -n argo-rollouts deployment/argo-rollouts

# Logs for a specific rollout:
kubectl logs -n argo-rollouts deployment/argo-rollouts | grep rollout=<ROLLOUTNAME>

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

mksha added the bug label on Feb 3, 2022
@jessesuen
Member

@mksha this option is ignored if you are doing a basic canary (not using a traffic router). Can you confirm you are using a traffic router? Which one are you using?

@mksha
Contributor Author

mksha commented Feb 4, 2022

@jessesuen I am using canary with traffic routing.
I am using Istio with a DestinationRule.
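
For completeness, the traffic routing section looks roughly like this (a sketch with placeholder names, following the subset-level routing layout from the Argo Rollouts Istio docs):

    trafficRouting:
      istio:
        virtualService:
          name: example-vsvc
          routes:
          - primary
        destinationRule:
          name: example-destrule
          canarySubsetName: canary
          stableSubsetName: stable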

@huikang
Member

huikang commented Feb 4, 2022

I can reproduce the issue and will work on a fix. Thanks, @mksha

huikang self-assigned this on Feb 4, 2022
@huikang
Member

huikang commented Feb 4, 2022

@mksha, on second thought, I think this is expected behavior if you don't use setCanaryScale in a step, e.g.,

      - setCanaryScale:
          replicas: 3

Otherwise, the code returns a nil canary scale:

    if rollout.Status.Abort {
        if abortDelay, _ := defaults.GetAbortScaleDownDelaySecondsOrDefault(rollout); abortDelay != nil {
            // If rollout is aborted do not use the set canary scale, *unless* the user explicitly
            // indicated to leave the canary scaled up (abortScaleDownDelaySeconds: 0).
            return nil
        }
    }
    currentStep, currentStepIndex := GetCurrentCanaryStep(rollout)
    if currentStep == nil {
        // setCanaryScale feature is unused
        return nil
    }
    for i := *currentStepIndex; i >= 0; i-- {
        step := rollout.Spec.Strategy.Canary.Steps[i]
        if step.SetCanaryScale == nil {
            continue
        }
        if step.SetCanaryScale.MatchTrafficWeight {
            return nil
        }
        return step.SetCanaryScale
    }
    return nil
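
In other words, a strategy along these lines (a sketch; the replica count is just for illustration) should keep the canary scaled up on abort:

    strategy:
      canary:
        abortScaleDownDelaySeconds: 0
        steps:
        - setCanaryScale:
            replicas: 3
        - pause: {}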

huikang added the workaround label on Feb 4, 2022
@jessesuen
Member

@mksha could you please provide more details including rollout spec, and kubernetes rollout events when this happens? I think we don't have enough information to go off of currently.

@mksha
Contributor Author

mksha commented Feb 6, 2022

I am using the same spec shared at #1838.

@mksha
Contributor Author

mksha commented Feb 6, 2022

@jessesuen

@mksha
Contributor Author

mksha commented Feb 7, 2022

@huikang

  1. You are correct. I tried that and it worked, so one thing we need to do is make sure the documentation is updated so we know which settings have to be used together.
  2. There is still a weird behaviour if we use the following steps (full sketch at the end of this comment):

       steps:
       - setCanaryScale:
           weight: 100
       - some-analysis-template
       - setCanaryScale:
           matchTrafficWeight: true
       - setWeight: 10
       - pause: {}

     In that case, when we abort after the pause, it creates a new pod using the same canary ReplicaSet and destroys the existing one. That technically fulfills the purpose of abortScaleDownDelaySeconds, but it causes canary requests to fail, because the old canary pod is deleted while the new one is still being created, so there is no pod left to serve the canary requests.

So we need to make sure that when we add

    - setCanaryScale:
        matchTrafficWeight: true

it uses the same pod that was created by

    - setCanaryScale:
        weight: 100
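
For reference, here is a rough sketch of the overall strategy being discussed; the service, VirtualService, DestinationRule, and analysis template names are placeholders, not the exact spec from #1838:

    strategy:
      canary:
        abortScaleDownDelaySeconds: 0
        canaryService: example-canary
        stableService: example-stable
        trafficRouting:
          istio:
            virtualService:
              name: example-vsvc
            destinationRule:
              name: example-destrule
              canarySubsetName: canary
              stableSubsetName: stable
        steps:
        - setCanaryScale:
            weight: 100
        - analysis:
            templates:
            - templateName: some-analysis-template
        - setCanaryScale:
            matchTrafficWeight: true
        - setWeight: 10
        - pause: {}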

@mksha
Contributor Author

mksha commented Feb 15, 2022

@huikang @jessesuen any thoughts?

@mksha
Contributor Author

mksha commented Mar 1, 2022

@jgwest @huikang any thoughts?

@github-actions
Contributor

This issue is stale because it has been open 60 days with no activity.

@bishalthapa-t
Contributor

bishalthapa-t commented Sep 16, 2024

@huikang @zachaller Could you please provide an update or an estimated timeline for addressing this bug?

Additionally, I have submitted a pull request to improve the documentation. You can review the changes here: PR #3835. Please take a look and let me know if it accurately addresses the scenario at hand.
