The Scale Up phase can fail with 'NodeCreationFailure'. The documentation mentions 4 possible causes, but there is a 5th that is worth mentioning.
Each node must bootstrap within 15 minutes
If any node takes more than 15 minutes to bootstrap and join the cluster, the upgrade will time out. This is the total runtime for bootstrapping a new node, measured from when a new node is required to when it joins the cluster.
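As a rough illustration of the check described above, here is a minimal Python sketch. The function name, the timestamps, and the idea of comparing two `datetime` values are assumptions for illustration only; this is not how EKS measures the window internally.

```python
from datetime import datetime, timedelta

# Assumed limit, taken from the 15-minute window described in this issue.
BOOTSTRAP_LIMIT = timedelta(minutes=15)


def exceeded_bootstrap_window(requested_at, joined_at, limit=BOOTSTRAP_LIMIT):
    """Return True if the node took longer than `limit` to join the cluster.

    `requested_at` is when the new node was required and `joined_at` is
    when it joined the cluster (both hypothetical inputs).
    """
    return (joined_at - requested_at) > limit


# Hypothetical timestamps: the node is healthy but joined after 17.5 minutes.
requested = datetime(2024, 1, 1, 12, 0, 0)
joined = datetime(2024, 1, 1, 12, 17, 30)
print(exceeded_bootstrap_window(requested, joined))
```

A node in this situation is healthy and present in the cluster, yet the update would still be reported as failed because the total bootstrap time crossed the limit.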
The documentation does currently mention UserData as a possible cause, but in the case we were investigating UserData was not broken. The node did join the cluster and was healthy; a UserData script simply nudged node creation beyond the 15-minute window.
An alternative would be to extend the UserData root cause to include runtime, but the 15-minute window is important and as such should be a separate item.
FastLaunch has been proposed as a possible solution; other options include moving items around within the UserData. Specifically, anything that does not require a reboot and will not interfere with the kubelet could be moved after the script that connects the node to the cluster.
However, improving UserData performance is a bigger discussion and could bloat the document. A suitable link to an existing document could work, though.
https://github.com/awsdocs/amazon-eks-user-guide/blob/mainline/latest/ug/nodes/managed-node-update-behavior.adoc
For reference: [Case 173581560000418] Windows EKS Nodegroup update is not working