Error Logs When ENI Allocation Fails Due to Insufficient Subnet IPs #3172

YeongJJo · 2025-01-08T05:59:17Z

What would you like to be added:

Although this issue can be identified through pod events, I believe it would also be useful to surface it as an error in the logs of pods such as aws-node.

For example:
Error: ENI allocation failed for worker node XXX - all IPs are allocated and the subnet has insufficient available IPs.

Why is this needed:

Description:
When the subnet where a worker node is located has fewer available IPs than the maximum number of IPs an ENI can use, no additional ENI is allocated and no error is logged.

Detailed Information:

In the test environment, there are two worker nodes located in different subnets (for convenience, let's call them A and B).
EC2 Type for Nodes A and B: Both node groups use an EC2 instance type that allows up to 30 IP addresses per ENI.
Resource Availability: Both nodes A and B have ample CPU and memory resources available.
Subnet for Node A: The subnet where node A is located has hundreds of available IPs, providing plenty of room. Therefore, additional ENIs have been allocated to node A, resulting in 2 ENIs and a total of 60 IPs (private IPs + secondary IPs) being used.

Now, for node B:

Pending Pod Requests: There are still some pod creation requests pending.
ENI Usage on Node B: The primary ENI on node B has all 30 IPs in use.
Subnet for Node B: The subnet where node B is located has only about 20 available IPs left.

Observed Behavior

Since both nodes have ample CPU and memory resources, node scaling does not occur.
In this case, no additional ENIs are allocated to node B.
Pod creation continuously fails with the following error event:
plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container
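
The arithmetic behind the scenario above can be sketched as follows. This is only an illustration of the condition as described in this issue, not the actual allocation logic in ipamd. The helper `subnetShortfall` is hypothetical, and the value 300 for node A's subnet is an assumed stand-in for "hundreds of available IPs".

```go
package main

import "fmt"

// subnetShortfall returns how many more free subnet IPs would be needed to
// fill one additional ENI. A result of 0 means the subnet has enough room.
// Hypothetical helper for illustration; not part of amazon-vpc-cni-k8s.
func subnetShortfall(ipsPerENI, freeSubnetIPs int) int {
	if freeSubnetIPs >= ipsPerENI {
		return 0
	}
	return ipsPerENI - freeSubnetIPs
}

func main() {
	const ipsPerENI = 30 // per-ENI IP limit stated in the issue

	// Free-IP counts: 20 for node B comes from the issue; 300 for node A is
	// an assumed value standing in for "hundreds of available IPs".
	subnets := []struct {
		node string
		free int
	}{{"A", 300}, {"B", 20}}

	for _, s := range subnets {
		fmt.Printf("node %s: free=%d shortfall=%d\n",
			s.node, s.free, subnetShortfall(ipsPerENI, s.free))
	}
}
```

With the numbers from the issue, node B's subnet is 10 IPs short of filling a second ENI, which is the condition the requested log message would report.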

Additional Information:

However, there are no error or warning logs in the aws-node pod.
Although it is possible to identify this issue through pod events, I believe it would also be efficient to detect errors through logs in pods like aws-node.
For example:
Error: The ENI of worker node XXX has all IPs allocated and the subnet has insufficient available IPs.


dshehbaj · 2025-01-16T20:50:11Z

Hi @YeongJJo

I noticed that we do print an error log when ENI allocation fails or when there are not enough IPv4 addresses/prefixes available.

amazon-vpc-cni-k8s/pkg/ipamd/datastore/data_store.go

Line 734 in 94c4a15

ds.log.Errorf("DataStore has no available IP/Prefix addresses")

amazon-vpc-cni-k8s/pkg/ipamd/ipamd.go

Line 876 in 94c4a15

    
           log.Errorf("Unable to attach IPs/Prefixes for the ENI, subnet doesn't seem to have enough IPs/Prefixes. Consider using new subnet or carve a reserved range using create-subnet-cidr-reservation")

I just want to confirm that these are the logs you mean, and whether you see them printed when you run into the scenario described above.

yash97 · 2025-01-17T19:08:37Z

To add to this, the aws-node logs are written to the directory /var/log/aws-routed-eni. kubectl logs won't show all of them. This directory is mounted from the host, so you can access it from the worker node.

YeongJJo added enhancement feature request labels Jan 8, 2025

dshehbaj self-assigned this Jan 8, 2025
