
[BUG] Failed to upgrade Kubernetes versions when deploying node pools separately in Bicep #4765

Open
shubham1172 opened this issue Jan 23, 2025 · 4 comments

shubham1172 commented Jan 23, 2025

Describe the bug
When using Bicep to deploy and upgrade AKS clusters, upgrading the Kubernetes version (control plane and node pools) fails with this error:

{"status":"Failed","error":{"code":"DeploymentFailed","target":"/subscriptions/.../aks-upgrade-rg/providers/Microsoft.Resources/deployments/aks","message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-deployment-operations for usage details.","details":[{"code":"BadRequest","target":"/subscriptions/.../aks-upgrade-rg/providers/Microsoft.ContainerService/managedClusters/testcluster-2","message":"{\r\n  \"code\": \"NotAllAgentPoolOrchestratorVersionSpecifiedAndUnchanged\",\r\n  \"details\": null,\r\n  \"message\": \"Using managed cluster api, all Agent pools' OrchestratorVersion must be all specified or all unspecified. If all specified, they must be stay unchanged or the same with control plane. For agent pool specific change, please use per agent pool operations: https://aka.ms/agent-pool-rest-api\",\r\n  \"subcode\": \"\"\r\n}"}]}}

To Reproduce
Here is a minimal, reproducible example.

  1. Create a Bicep file with the following content:
aks.bicep
param kubernetesVersion string = '1.29.4'

var clusterName = 'testcluster'
var location = resourceGroup().location
var dnsPrefix = '${clusterName}-dns'
var agentVMSize = 'Standard_DS2_v2'
var vmAvailabilityZones = []

resource aks 'Microsoft.ContainerService/managedClusters@2024-02-01' = {
  name: clusterName
  location: location
  identity: {
    type: 'SystemAssigned'
  }
  sku: {
    name: 'Base'
    tier: 'Standard'
  }
  properties: {
    dnsPrefix: dnsPrefix
    autoUpgradeProfile: {
      upgradeChannel: 'patch'
    }
    kubernetesVersion: kubernetesVersion
    agentPoolProfiles: [
      {
        name: 'runtime'
        mode: 'System'
        type: 'VirtualMachineScaleSets'
        vmSize: agentVMSize
        osType: 'Linux'
        osSKU: 'AzureLinux'
        orchestratorVersion: kubernetesVersion
        count: 1
        availabilityZones: vmAvailabilityZones
        osDiskSizeGB: 0
        enableFIPS: true
      }
      {
        name: 'user'
        mode: 'User'
        type: 'VirtualMachineScaleSets'
        vmSize: agentVMSize
        osType: 'Linux'
        osSKU: 'AzureLinux'
        orchestratorVersion: kubernetesVersion
        count: 3
        availabilityZones: vmAvailabilityZones
        osDiskSizeGB: 0
        enableFIPS: true
      }
    ]
  }
}

resource extraNodePool 'Microsoft.ContainerService/managedClusters/agentPools@2024-03-02-preview' = {
  name: 'extra'
  parent: aks
  properties: {
    mode: 'User'
    type: 'VirtualMachineScaleSets'
    vmSize: agentVMSize
    osType: 'Linux'
    osSKU: 'AzureLinux'
    orchestratorVersion: kubernetesVersion
    count: 3
    availabilityZones: vmAvailabilityZones
    osDiskSizeGB: 0
    enableFIPS: true
  }
}
  2. Deploy the Bicep file: az deployment group create --resource-group $RESOURCE_GROUP --subscription $SUBSCRIPTION_ID --template-file aks.bicep. The deployment succeeds at version 1.29.4.
  3. After a while, auto-upgrade kicks in and updates the control plane and nodes to 1.29.11.
  4. Now change kubernetesVersion to 1.30.1 and deploy again. This should trigger an upgrade of both the control plane and the node pools, but it fails.
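The version bump in the last repro step can also be applied without editing the file, by overriding the parameter on the command line. This reuses the same az deployment group create invocation shown above; the --parameters name=value override is standard az behavior, not specific to this issue:

```shell
# Redeploy the same template, overriding the kubernetesVersion param default.
# $RESOURCE_GROUP and $SUBSCRIPTION_ID are the same placeholders used above.
az deployment group create \
  --resource-group "$RESOURCE_GROUP" \
  --subscription "$SUBSCRIPTION_ID" \
  --template-file aks.bicep \
  --parameters kubernetesVersion=1.30.1
```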

Expected behavior
The upgrade should have gone through.

Additionally, if the Bicep template declares the node pool inline in the cluster's agentPoolProfiles instead of as a separate agentPools resource, the same upgrade passes!

aks.bicep
param kubernetesVersion string = '1.31.1'

var clusterName = 'testcluster'
var location = resourceGroup().location
var dnsPrefix = '${clusterName}-dns'
var agentVMSize = 'Standard_DS2_v2'
var vmAvailabilityZones = []

resource aks 'Microsoft.ContainerService/managedClusters@2024-02-01' = {
  name: clusterName
  location: location
  identity: {
    type: 'SystemAssigned'
  }
  sku: {
    name: 'Base'
    tier: 'Standard'
  }
  properties: {
    dnsPrefix: dnsPrefix
    autoUpgradeProfile: {
      upgradeChannel: 'patch'
    }
    kubernetesVersion: kubernetesVersion
    agentPoolProfiles: [
      {
        name: 'runtime'
        mode: 'System'
        type: 'VirtualMachineScaleSets'
        vmSize: agentVMSize
        osType: 'Linux'
        osSKU: 'AzureLinux'
        orchestratorVersion: kubernetesVersion
        count: 1
        availabilityZones: vmAvailabilityZones
        osDiskSizeGB: 0
        enableFIPS: true
      }
      {
        name: 'user'
        mode: 'User'
        type: 'VirtualMachineScaleSets'
        vmSize: agentVMSize
        osType: 'Linux'
        osSKU: 'AzureLinux'
        orchestratorVersion: kubernetesVersion
        count: 3
        availabilityZones: vmAvailabilityZones
        osDiskSizeGB: 0
        enableFIPS: true
      }
      {
        name: 'extra'
        mode: 'User'
        type: 'VirtualMachineScaleSets'
        vmSize: agentVMSize
        osType: 'Linux'
        osSKU: 'AzureLinux'
        orchestratorVersion: kubernetesVersion
        count: 3
        availabilityZones: vmAvailabilityZones
        osDiskSizeGB: 0
        enableFIPS: true
      }
    ]
  }
}

According to the Bicep docs, there should be no difference between declaring a child resource inside its parent and declaring it separately with a parent reference, yet the separate declaration causes the upgrade to fail: https://learn.microsoft.com/en-us/azure/azure-resource-manager/bicep/child-resource-name-type
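One possible workaround, untested and inferred only from the error's "all specified or all unspecified" rule and its pointer to per-agent-pool operations: omit orchestratorVersion from every entry in the cluster's inline agentPoolProfiles (so the managed-cluster PUT leaves all pool versions unspecified), and drive each pool's version through its own agentPools child resource instead. Reusing the variables from the templates above, a standalone pool would then be the only place its version is pinned:

```bicep
// Untested sketch: pool versions are managed only through the
// per-agent-pool API; the cluster resource's agentPoolProfiles
// entries would omit orchestratorVersion entirely.
resource extraNodePool 'Microsoft.ContainerService/managedClusters/agentPools@2024-03-02-preview' = {
  name: 'extra'
  parent: aks
  properties: {
    mode: 'User'
    type: 'VirtualMachineScaleSets'
    vmSize: agentVMSize
    osType: 'Linux'
    osSKU: 'AzureLinux'
    orchestratorVersion: kubernetesVersion
    count: 3
    availabilityZones: vmAvailabilityZones
    osDiskSizeGB: 0
    enableFIPS: true
  }
}
```

Whether ARM orders the control-plane upgrade before the pool upgrade here would still need verifying; the parent reference only guarantees the pool deploys after the cluster resource.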

Screenshots
Error message from portal:

Environment (please complete the following information):

  • CLI version: 2.67.0
  • Kubernetes version: 1.30.2



shubham1172 commented Jan 23, 2025

FYI, when auto-upgrade is off, I still see the same error.

aks.bicep
param kubernetesVersion string = '1.30.2'

var clusterName = 'testcluster'
var location = resourceGroup().location
var dnsPrefix = '${clusterName}-dns'
var agentVMSize = 'Standard_DS2_v2'
var vmAvailabilityZones = []

resource aks 'Microsoft.ContainerService/managedClusters@2024-02-01' = {
  name: clusterName
  location: location
  identity: {
    type: 'SystemAssigned'
  }
  sku: {
    name: 'Base'
    tier: 'Standard'
  }
  properties: {
    dnsPrefix: dnsPrefix
    kubernetesVersion: kubernetesVersion
    agentPoolProfiles: [
      {
        name: 'runtime'
        mode: 'System'
        type: 'VirtualMachineScaleSets'
        vmSize: agentVMSize
        osType: 'Linux'
        osSKU: 'AzureLinux'
        orchestratorVersion: kubernetesVersion
        count: 1
        availabilityZones: vmAvailabilityZones
        osDiskSizeGB: 0
        enableFIPS: true
      }
    ]
  }
}

resource userNodePool 'Microsoft.ContainerService/managedClusters/agentPools@2024-03-02-preview' = {
  name: 'user'
  parent: aks
  properties: {
    mode: 'User'
    type: 'VirtualMachineScaleSets'
    vmSize: agentVMSize
    osType: 'Linux'
    osSKU: 'AzureLinux'
    orchestratorVersion: kubernetesVersion
    count: 3
    availabilityZones: vmAvailabilityZones
    osDiskSizeGB: 0
    enableFIPS: true
  }
}

@PixelRobots (Collaborator) commented

A contributor commented:

@kaarthis, @sdesai345 would you be able to assist?

shubham1172 (Member Author) commented

@PixelRobots I don't see an example of a Bicep template that can both create and update clusters; do you have an example of that?

@kaarthis / @sdesai345 ping.
