Skip to content

feat: fallback for cumulus with no vlan overlap support#1295

Open
Frostman wants to merge 1 commit intomasterfrom
dev/frostman/dhcp-cumulus
Open

feat: fallback for cumulus with no vlan overlap support#1295
Frostman wants to merge 1 commit intomasterfrom
dev/frostman/dhcp-cumulus

Conversation

@Frostman
Copy link
Member

@Frostman Frostman commented Feb 7, 2026

Due to lack of the VRF marking on DHCP relayed packets we can't reliable distinguish between VPCs with overlapping VLANs.

@Frostman Frostman requested a review from a team as a code owner February 7, 2026 17:15
@Frostman Frostman added ci:-vlab Disable VLAB tests ci:-upgrade Disable VLAB upgrade tests labels Feb 7, 2026
@Frostman Frostman marked this pull request as draft February 7, 2026 17:15
@github-actions
Copy link

github-actions bot commented Feb 7, 2026

🚀 Temp artifacts published: v0-b30be88ed 🚀

1 similar comment
@github-actions
Copy link

github-actions bot commented Feb 7, 2026

🚀 Temp artifacts published: v0-b30be88ed 🚀

@Frostman Frostman removed the ci:-vlab Disable VLAB tests label Feb 9, 2026
@Frostman Frostman force-pushed the dev/frostman/dhcp-cumulus branch from b30be88 to b5da3de Compare February 9, 2026 17:47
@github-actions
Copy link

github-actions bot commented Feb 9, 2026

🚀 Temp artifacts published: v0-b5da3defd 🚀

@Frostman Frostman force-pushed the dev/frostman/dhcp-cumulus branch from b5da3de to 2dc4610 Compare February 9, 2026 18:46
@github-actions
Copy link

github-actions bot commented Feb 9, 2026

🚀 Temp artifacts published: v0-2dc461090 🚀

@Frostman Frostman force-pushed the dev/frostman/dhcp-cumulus branch from 2dc4610 to 5372367 Compare February 9, 2026 20:38
@github-actions
Copy link

github-actions bot commented Feb 9, 2026

🚀 Temp artifacts published: v0-5372367fe 🚀

Due to lack of the VRF marking on DHCP relayed packets we can't reliable
distinguish between VPCs with overlapping VLANs.

Signed-off-by: Sergei Lukianov <me@slukjanov.name>
@Frostman Frostman removed the ci:-upgrade Disable VLAB upgrade tests label Feb 10, 2026
@Frostman Frostman force-pushed the dev/frostman/dhcp-cumulus branch from 5372367 to 95ffd90 Compare February 10, 2026 05:14
@Frostman Frostman marked this pull request as ready for review February 10, 2026 05:15
@github-actions
Copy link

🚀 Temp artifacts published: v0-95ffd900b 🚀

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements a fallback mechanism for Cumulus DHCP relay agents that don't support VRF marking on relayed packets. Since Cumulus doesn't include VRF information in DHCP relay packets, the system cannot reliably distinguish between VPCs with overlapping VLAN IDs. The solution normalizes VRF and CircuitID values to lowercase for case-insensitive matching and adds special handling for Cumulus relays.

Changes:

  • Normalized VRF and CircuitID format from PascalCase (e.g., "VrfVvpc-1", "Vlan1000") to lowercase (e.g., "vpc-1", "vlan1000")
  • Added Cumulus-specific fallback logic that matches DHCP requests by CircuitID only when RemoteID starts with "cumulus:"
  • Enhanced logging to include RemoteID information for better debugging of relay agent behavior

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
pkg/dhcp/ipam.go Added lowercase normalization to subnetKey() function and unused vlanPrefix constant
pkg/dhcp/handlers.go Added RemoteID extraction, Cumulus fallback logic with duplicate detection, VRF prefix trimming, and lowercase normalization
pkg/ctrl/vpc_ctrl.go Changed VRF and CircuitID generation from PascalCase to lowercase format

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


const (
defaultVRF = "default"
vlanPrefix = "vlan"
Copy link

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constant vlanPrefix is defined here but never used in this file or elsewhere in the codebase. It appears to have been introduced in this PR but not actually utilized. Consider removing this unused constant or adding a TODO comment if it's intended for future use.

Suggested change
vlanPrefix = "vlan"

Copilot uses AI. Check for mistakes.
@@ -178,7 +178,7 @@ func (r *VPCReconciler) updateDHCPSubnets(ctx context.Context, vpc *vpcapi.VPC)
vrf := ""
Copy link

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment still references the old format "VrfVvpc-1" but the code has been changed to use lowercase VPC names without the "VrfV" prefix. This comment should be updated to reflect the new format (e.g., "vpc-1" instead of "VrfVvpc-1") to match the actual implementation.

Suggested change
vrf := ""
vrf := ""
// For VNI-based VPCs, use the lowercase VPC name as the VRF name (e.g., "vpc-1");
// for L3Flat VPCs, use the "default" VRF.

Copilot uses AI. Check for mistakes.
LeaseTimeSeconds: leaseTime,
VRF: vrf,
CircuitID: fmt.Sprintf("Vlan%d", subnet.VLAN), // TODO move to utils
CircuitID: fmt.Sprintf("vlan%d", subnet.VLAN), // TODO move to utils
Copy link

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment still references the old format "Vlan1000" but the code has been changed to use lowercase "vlan" prefix. This comment should be updated to reflect the new format (e.g., "vlan1000" instead of "Vlan1000") to match the actual implementation in line 205.

Suggested change
CircuitID: fmt.Sprintf("vlan%d", subnet.VLAN), // TODO move to utils
CircuitID: fmt.Sprintf("vlan%d", subnet.VLAN), // TODO move to utils (format: "vlan1000")

Copilot uses AI. Check for mistakes.
Comment on lines +100 to +117
// Cumulus doesn't support vrf-select so fallback to just checking the circuit ID (vlan interface name)
if strings.HasPrefix(remoteID, cumulusRemoteIDPrefix) && circuitID != "" {
for _, some := range s.subnets {
if some.Spec.CircuitID == circuitID {
if ok && subnet != nil {
slog.Warn("Duplicate subnet found for Cumulus dhcp-relay, ignoring request",
append([]any{"subnet", some.Name, "other", subnet.Name}, reqSummary(req, vrf, circuitID, remoteID)...)...)

subnet = nil
ok = false

break
}

subnet = some
ok = true
}
}
Copy link

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new Cumulus fallback logic that matches subnets by CircuitID only (when RemoteID has "cumulus:" prefix) lacks test coverage. Given that the codebase has comprehensive test coverage for the DHCP package (as seen in ipam_test.go), this new behavior should be tested to ensure it correctly handles: 1) successful CircuitID-only matching, 2) duplicate CircuitID detection and warning, and 3) proper fallback to standard VRF+CircuitID matching for non-Cumulus relays.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments