In a traditional MPLS VPN cloud, the core is understood to be
BGP-free and holds no VPN-specific information. A core MPLS router only has
reachability information for the PE and P routers within the same domain.
So how does a traceroute triggered from a CE list all the
nodes within the MPLS core?
In the above MPLS VPN topology, CE1 and CE2 belong to a VRF,
with PE1 and PE2 as the Provider Edge nodes. P1, P2 and P3, being core nodes, don’t
have any reachability information about CE1 or CE2. But when a traceroute is
triggered from CE1 to CE2, and TTL propagation is not disabled on the PE routers,
the trace output on CE1 lists all routers along the path within the MPLS cloud.
CE1#traceroute 192.168.6.6 source 192.168.1.1 numeric
Type escape sequence to abort.
Tracing the route to 192.168.6.6
VRF info: (vrf in name/id, vrf out name/id)
  1 192.168.12.2 0 msec 0 msec 0 msec
  2 10.1.23.3 [MPLS: Labels 19/22 Exp 0] 1 msec 1 msec 1 msec
  3 10.1.34.4 [MPLS: Labels 19/22 Exp 0] 1 msec 1 msec 1 msec
  4 192.168.56.5 [MPLS: Label 22 Exp 0] 0 msec 0 msec 0 msec
  5 192.168.56.6 2 msec * 2 msec
CE1#
So the P routers, even though they don’t have reachability
information for the VPN prefixes, were still able to send ICMP error messages
back to the CE node. (Remember, the ICMP “TTL expired” error message plays a key
role in traceroute.)
So how does it work?
By default, an LSR that receives a packet with TTL=1 on the
top label drops the packet and sends an ICMP error message to the source
of the packet. But when the packet is VPN traffic (with more than one label), the
LSR performs the following steps:
- Buffer the label stack from the incoming packet (the packet received with TTL=1).
- Generate an ICMP error message with its own address as the source and the source address of the received packet as the destination.
- Append all labels from the buffered label stack (step 1) except the top one, each with TTL=255.
- Take the top label from the buffered label stack and perform a local LFIB lookup to get the label to swap to and the associated next hop.
- Push the new label onto the top of the stack with TTL=255 and send the packet on.
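The five steps above can be sketched in a few lines of Python. This is only an illustrative model, not router code: the `Label` and `Packet` classes, the `ttl_expired_reply` function, and the LFIB entry used in the example (label 19 swapped to 19, next hop 10.1.34.4) are assumptions made up for the sketch, loosely based on the trace output shown earlier.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Label:
    value: int
    ttl: int


@dataclass
class Packet:
    src: str
    dst: str
    labels: List[Label] = field(default_factory=list)  # index 0 is the top label
    payload: str = "data"


def ttl_expired_reply(
    pkt: Packet,
    lsr_address: str,
    lfib: Dict[int, Tuple[int, str]],
) -> Tuple[Packet, str]:
    """Model of an LSR handling a multi-label packet whose top label has TTL=1.

    lfib maps an incoming top label to (outgoing label, next hop).
    Returns the labelled ICMP reply and the next hop it is sent to.
    """
    # Step 1: buffer the label stack of the incoming packet.
    buffered = list(pkt.labels)

    # Step 2: ICMP error sourced from this LSR, destined to the original source.
    reply = Packet(src=lsr_address, dst=pkt.src, payload="ICMP TTL expired")

    # Step 3: re-use every label except the top one, with TTL=255.
    inner = [Label(lbl.value, 255) for lbl in buffered[1:]]

    # Step 4: LFIB lookup on the buffered top label.
    out_label, next_hop = lfib[buffered[0].value]

    # Step 5: put the swapped label on top with TTL=255 and send.
    reply.labels = [Label(out_label, 255)] + inner
    return reply, next_hop


if __name__ == "__main__":
    # Hypothetical LFIB entry on P1; values are assumptions for illustration.
    p1_lfib = {19: (19, "10.1.34.4")}
    probe = Packet("192.168.1.1", "192.168.6.6", [Label(19, 1), Label(22, 1)])
    reply, next_hop = ttl_expired_reply(probe, "10.1.23.3", p1_lfib)
    print(reply, next_hop)
```

Note that the reply keeps the VPN label untouched and is forwarded along the original LSP, which is why the LSR never needs a route to the CE source address.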
With this approach, the ICMP error message travels
from the transit LSR to the egress LSR, and then back through the ingress LSR to
the actual source in the VRF.
A simple example makes this clearer.
In the above topology, when a trace is performed from CE1 (192.168.1.1) to CE2 (192.168.6.6), the
first packet, with TTL=1, reaches the ingress PE, which drops the packet and sends
the ICMP error message directly.
The second packet, with TTL=2, reaches the ingress PE, which pushes
the label stack <19><22> (transport label 19, VPN label 22). While doing
so, it sets the TTL of both labels to 1.
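The TTL handling at the ingress PE can be modelled as below. This is a minimal sketch under assumptions: the function name `impose_labels` and its `propagate_ttl` flag are hypothetical, and the value 255 used when propagation is disabled reflects the commonly documented behaviour rather than anything shown in the trace above.

```python
from typing import List, Tuple


def impose_labels(
    ip_ttl: int,
    label_values: List[int],
    propagate_ttl: bool = True,
) -> List[Tuple[int, int]]:
    """Return (label, ttl) pairs, top label first, as pushed by the ingress PE."""
    # The PE decrements the IP TTL like any other routed hop.
    ttl_after_decrement = ip_ttl - 1
    if ttl_after_decrement <= 0:
        # The first probe (TTL=1) dies here and is answered by the PE itself.
        raise ValueError("TTL expired at the ingress PE; no labels are pushed")

    # With propagation, the decremented IP TTL is copied into every label;
    # without it (assumed behaviour), the labels start at 255 and the core
    # hops stay hidden from the trace.
    label_ttl = ttl_after_decrement if propagate_ttl else 255
    return [(value, label_ttl) for value in label_values]


if __name__ == "__main__":
    print(impose_labels(2, [19, 22]))
    print(impose_labels(2, [19, 22], propagate_ttl=False))
```

With the second probe (IP TTL=2) and propagation left enabled, both labels leave the PE with TTL=1, so the packet expires at the first P router.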
P1, on receiving it, drops the packet and generates an ICMP reply
message with destination 192.168.1.1. The reply packet uses the
same label stack for forwarding: P1 pushes the VPN label 22, swaps the
transport label as per its local forwarding table, and sends the packet towards
the remote PE router.
PE2, on receiving it, performs an IP lookup in the VRF
table and forwards the reply back through the core towards 192.168.1.1. The same
procedure continues for each subsequent probe until the trace reaches 192.168.6.6.
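To visualise the detour the ICMP reply takes, here is a small sketch that lists the nodes a reply visits. The function `icmp_reply_path` and the node list are hypothetical, and the model assumes the return path is simply the forward path reversed, which a real network does not guarantee.

```python
from typing import List


def icmp_reply_path(path: List[str], expired_at: str) -> List[str]:
    """Nodes visited by the ICMP reply, starting at the node where TTL expired.

    path is the forward path from source CE to destination CE,
    for example ["CE1", "PE1", "P1", "P2", "PE2", "CE2"].
    """
    i = path.index(expired_at)
    egress_idx = len(path) - 2  # position of the egress PE

    # PEs (and the CEs) have a route to the source, so they reply directly.
    if i <= 1 or i >= egress_idx:
        return list(reversed(path[: i + 1]))

    # A P router has no VRF route: the reply first follows the LSP to the
    # egress PE, which does the VRF lookup and sends it back through the core.
    forward = path[i : egress_idx + 1]
    back = list(reversed(path[:egress_idx]))
    return forward + back


if __name__ == "__main__":
    nodes = ["CE1", "PE1", "P1", "P2", "PE2", "CE2"]
    print(" -> ".join(icmp_reply_path(nodes, "P1")))
```

For a TTL expiry at P1 in this hypothetical path, the reply goes forward to PE2 first and only then returns via P2, P1 and PE1 to CE1.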