MPLS TTL behaviour might be wrong #109

Open
Yi-Tseng opened this issue Oct 12, 2020 · 5 comments

@Yi-Tseng
Collaborator

Currently, we set the MPLS TTL to a default value (64); however, we should copy the TTL from the IP header.
We also need to copy the TTL back to the IP header when we pop the MPLS label.

The original RFC (RFC 3031) does not give much detail on how to handle the TTL for IP packets:
https://tools.ietf.org/html/rfc3031#section-3.23

But there are some rules in RFC 3032 (MPLS Label Stack Encoding):
https://tools.ietf.org/html/rfc3032#section-2.4.3

2.4.3. IP-dependent rules

   We define the "IP TTL" field to be the value of the IPv4 TTL field,
   or the value of the IPv6 Hop Limit field, whichever is applicable.

   When an IP packet is first labeled, the TTL field of the label stack
   entry MUST BE set to the value of the IP TTL field.  (If the IP TTL
   field needs to be decremented, as part of the IP processing, it is
   assumed that this has already been done.)

   When a label is popped, and the resulting label stack is empty, then
   the value of the IP TTL field SHOULD BE replaced with the outgoing
   TTL value, as defined above.  In IPv4 this also requires modification
   of the IP header checksum.

   It is recognized that there may be situations where a network
   administration prefers to decrement the IPv4 TTL by one as it
   traverses an MPLS domain, instead of decrementing the IPv4 TTL by the
   number of LSP hops within the domain.
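
The quoted text notes that writing a new TTL into an IPv4 header also requires updating the header checksum. As a rough illustration, here is a minimal Python sketch that applies the incremental update from RFC 1624 (eqn. 3) instead of recomputing the whole checksum. The function names are hypothetical and it assumes a 20-byte header without options; it is not taken from this project's pipeline.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """Full one's-complement checksum; the checksum field must be zeroed."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def incremental_update(old_csum: int, old_word: int, new_word: int) -> int:
    """RFC 1624 eqn. 3: HC' = ~(~HC + ~m + m') in one's-complement arithmetic."""
    total = (~old_csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def set_ttl(header: bytes, new_ttl: int) -> bytes:
    """Write new_ttl into an IPv4 header and patch the checksum in place."""
    hdr = bytearray(header)
    # TTL (byte 8) and protocol (byte 9) share one 16-bit checksum word.
    old_word = (hdr[8] << 8) | hdr[9]
    new_word = (new_ttl << 8) | hdr[9]
    old_csum = (hdr[10] << 8) | hdr[11]
    hdr[8] = new_ttl
    hdr[10:12] = incremental_update(old_csum, old_word, new_word).to_bytes(2, "big")
    return bytes(hdr)
```

Only the 16-bit word that contains the TTL changes, so the update touches three values rather than the whole header.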

Also, there are some explanations on these websites:
https://www.ciscopress.com/articles/article.asp?p=680824&seqNum=4
http://wiki.kemot-net.com/mpls-ttl-behavior

These say that on push, swap, and pop we need to set the outgoing TTL to the TTL of the previous header minus 1.
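
To make the rules above concrete, here is a minimal Python sketch of the per-operation TTL handling as I read RFC 3032 (sections 2.4.1 and 2.4.3). All names are hypothetical, it models a single label only, and it assumes the IP TTL has already been decremented by IP processing before the first label is pushed.

```python
class TtlExpired(Exception):
    """The outgoing TTL is 0, so the packet must not be forwarded."""


def outgoing_ttl(incoming_ttl: int) -> int:
    """Outgoing TTL is incoming TTL minus 1; a result of 0 means drop."""
    out = max(incoming_ttl - 1, 0)
    if out == 0:
        raise TtlExpired(f"incoming TTL {incoming_ttl} expired")
    return out


def push_first_label(ip_ttl: int) -> int:
    """First labeling: the label TTL is a copy of the IP TTL."""
    return ip_ttl


def swap_label(label_ttl: int) -> int:
    """Swap: the new label carries the outgoing TTL."""
    return outgoing_ttl(label_ttl)


def pop_last_label(label_ttl: int) -> int:
    """Pop leaving an empty stack: the outgoing TTL becomes the new IP TTL."""
    return outgoing_ttl(label_ttl)


# Example: IP TTL 63 after ingress IP processing, one swap, then a pop.
label_ttl = push_first_label(63)
label_ttl = swap_label(label_ttl)
ip_ttl = pop_last_label(label_ttl)
```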

@ccascone
Member

ccascone commented Oct 22, 2020

Following the RFC means having the final IP TTL decremented by the number of hops inside the fabric (e.g., ttl = ttl - 2 for packets going from one leaf to the other in a 2x2).

But considering that we use MPLS only inside the fabric (i.e., we don't peer with external MPLS routers) and that Trellis abstracts the whole fabric as one big IP router, do we need to follow the RFC? Or should we just make sure that the IP TTL is decremented by one independently of the number of hops inside the fabric?
cc @charlesmcchan @pierventre

@charlesmcchan
Collaborator

charlesmcchan commented Oct 23, 2020

Shouldn't it be -3 instead of -2?
Our case should be the same as Figure 3-4 on this page.

I believe SR does both COPY_OUT and COPY_IN at the first and last hop respectively.

@ccascone
Member

Yes, it should be -3...
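
The -3 can be checked with a hop-by-hop trace. This is only a sketch under assumptions taken from this thread: a leaf-spine-leaf path in a 2x2 fabric, the ingress leaf decrementing the IP TTL and copying it into the label on push, the spine popping the label (penultimate hop popping) and copying the decremented TTL back into the IP header, and the egress leaf doing plain IP routing. The function name is hypothetical.

```python
from typing import List, Optional, Tuple

Step = Tuple[str, int, Optional[int]]  # (hop, IP TTL, MPLS TTL or None)


def trace_leaf_spine_leaf(ip_ttl_in: int) -> List[Step]:
    """Trace the TTL values on a leaf-spine-leaf path with PHP at the spine."""
    steps: List[Step] = []

    # Ingress leaf: IP routing decrements the TTL, then push copies it out.
    ip_ttl = ip_ttl_in - 1
    mpls_ttl = ip_ttl
    steps.append(("ingress leaf: route + push", ip_ttl, mpls_ttl))

    # Spine: pop the label and copy the decremented label TTL into the IP header.
    ip_ttl = mpls_ttl - 1
    steps.append(("spine: pop + copy to IP", ip_ttl, None))

    # Egress leaf: ordinary IP routing, one more decrement.
    ip_ttl -= 1
    steps.append(("egress leaf: route", ip_ttl, None))
    return steps


for hop, ip_ttl, mpls_ttl in trace_leaf_spine_leaf(64):
    print(f"{hop}: ip_ttl={ip_ttl} mpls_ttl={mpls_ttl}")
```

With an incoming TTL of 64, the packet leaves the fabric with an IP TTL of 61, i.e. one decrement per switch on the path.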

I just realized that since we do penultimate hop popping (i.e., spine pops MPLS), without copying the TTL between MPLS and IP, we cannot prevent loops inside the fabric...

I still think that decrementing the IP TTL by the number of hops inside the fabric is wrong. IMO the fabric should behave like one big router between the access devices and the Internet. The fact that we use MPLS tunnels internally is an implementation detail, and the IP TTL should not be affected by the number of switches inside the fabric. Instead, the IP TTL should be frozen when inside the tunnel.

However, using penultimate hop popping doesn't leave us any other choice if we want protection against loops. We should change segmentrouting to support ultimate hop popping (i.e., dest leaf pops MPLS) to be able to detect tunnels inside the fabric.

@charlesmcchan
Collaborator

What SR does today is completely legit as described in RFC3443 section 3.1.

I find it hard to justify the benefit of making such changes, given the amount of work that needs to be done.
We need a stronger reason to prioritize this.

@ccascone
Member

I agree that we don't have strong reasons to make this change. I just wanted to voice my concern.

I will make the change to fix the TTL behavior such that we comply with RFC3443 section 3.1, but most importantly with segmentrouting flow objectives.

@ccascone ccascone self-assigned this Oct 23, 2020
