In this post we will take a look at IP FRR and micro-loops. If you don’t already have some basic familiarity with IP FRR and micro-loops, I would highly recommend going through the below post series by Russ, as he introduces the various concepts in a very clear way. This post builds upon many of the concepts introduced in those posts.
Why do I care about MicroLoops?
As you may already know (otherwise look at https://packetpushers.net/microloop/ ), link-state protocols are susceptible to transient forwarding loops for a brief period of time, because each node on the network makes its forwarding decisions independently. The duration of a transient loop depends on how long the routers involved take, relative to each other, to update their FIBs.
However long the transient loop exists, it causes some collateral damage. A looping packet amplifies traffic and consumes bandwidth until its TTL expires or it escapes as a result of FIB convergence. This can transiently cause congestion even on a well-provisioned link. The congestion reduces the bandwidth available to other traffic (which wouldn’t have been affected otherwise) and causes delay and congestive packet loss on the links. The duration of the delay is equal to the duration of the micro-loop.
In the below fig. 1, when the link between R3 and R1 fails, R3 will send a link-state update telling everyone that the R3-R1 link has failed. It will then compute the next best path, which is towards R4, and update its FIB accordingly. The new best path will be R5->R6->R4->R2->R1, but R5 is likely to update its FIB after R3. Let’s say that at interval T=0 R5 still sends packets to R3, which results in a forwarding loop R5->R3->R5->R3 until R5 updates its FIB to point towards R6. However, it’s possible that R6 updates its FIB after R5, which means that at interval T=1 a forwarding loop will appear between R5 and R6 (R5->R6->R5). The forwarding loop can keep working its way along the new path like this, hop by hop, towards R1.
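The sequence above can be replayed with a toy model. The sketch below is only an illustration: the next-hop tables and the FIB update order (R3, then R5, then R6) are assumptions chosen to match the description, not values read from fig. 1. It follows a packet from R5 towards R1 after each router updates its FIB and reports whether the packet revisits a router.

```python
def trace(fibs, src, dst):
    """Follow next hops from src toward dst; report whether a node is revisited."""
    path, seen, node = [src], {src}, src
    while node != dst:
        node = fibs[node]
        path.append(node)
        if node in seen:
            return path, True      # packet came back to a router it already visited
        seen.add(node)
    return path, False

# Hypothetical next hops toward R1, before and after the R3-R1 failure.
old_fib = {"R2": "R1", "R4": "R2", "R6": "R5", "R5": "R3", "R3": "R1"}
new_fib = {"R2": "R1", "R4": "R2", "R6": "R4", "R5": "R6", "R3": "R5"}

fibs = dict(old_fib)
results = []
for router in ["R3", "R5", "R6"]:      # assumed order in which FIBs get updated
    fibs[router] = new_fib[router]
    path, looped = trace(fibs, "R5", "R1")
    results.append((router, path, looped))
    print(f"after {router} updates: {'->'.join(path)} loop={looped}")
```

With these assumed tables the loop first sits between R5 and R3, then moves to R5 and R6, and disappears once R6 has converged.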
Micro-loops can form in any part of the network, not just close to where the failure occurred; they can also happen upstream of the failure (we will look at an example later in the post). The duration of a micro-loop is proportional to the time taken to propagate the topology change through the network plus the time taken by each router to calculate the new shortest path and update its FIB.
Now, normally you may not care about micro-loops and the collateral damage caused by them, but people running real-time applications like VoIP or IPTV, or folks in the financial industry, may care.
So how do we solve this problem? One option is to speed up the whole convergence process to almost zero, but that’s not happening due to the fundamental limits of the speed of light and memory update latency. Another way to solve this is by using MPLS FRR or IP FRR. We have already seen various MPLS FRR mechanisms in my earlier post, MPLS TE Design part-3 https://packetpushers.net/mpls-te-design-part-3/. We will focus on IP FRR in this post.
Intro to LFA
Traditionally MPLS FRR was the only option available to achieve sub-50ms recovery time. You might have already got a hint from my previous posts on MPLS TE design that deploying MPLS FRR isn’t that straightforward. IP FRR provides similar fast recovery in LDP based networks, but is a lot simpler to implement.
The idea behind LFA is similar to MPLS FRR, i.e. it pre-installs the backup next hop into the forwarding plane. LFAs don’t introduce any protocol extensions and can be implemented on a per-router basis, which makes them a very attractive option.
So how does it work?
Normally link-state IGPs only calculate the best path, or multiple equal-cost best paths, for a destination. Equal-cost multiple paths (ECMPs) are a valid FRR mechanism, but they aren’t available in every topology. Where ECMPs don’t exist, other paths can be used as backup paths as long as they don’t cause a forwarding loop.
To avoid forwarding loops, a router needs to run some additional calculations to verify that a candidate backup route doesn’t create a forwarding loop. A route that doesn’t cause a forwarding loop is called a Loop-Free Alternate (LFA) path. The router calculates these loop-free alternate paths in advance and programs them in the FIB.
RFC 5286 offers a method for calculating LFAs based on route inequalities.
Inequality #1 (Link Protection)
Distance_opt(N, D) < Distance_opt(N, S) + Distance_opt(S, D)
Distance_opt(X, Y) is used to indicate the shortest distance from X to Y.
S is used to indicate the calculating router.
N is a neighbor of S.
D is the destination under consideration.
The path is loop-free because N’s best path to D does not go through the local router S, so traffic sent to the backup next hop is not sent back to S.
In the below fig. 2, R3 (N) can be a backup for R1 (S) because it satisfies Inequality #1:
2 < (1 + 2)
In fig. 3, if we change the metrics, the inequality is no longer satisfied, and we can clearly see that R3’s path goes back through R1 (S), so there is a forwarding loop:
3 < 3
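To make the check concrete, here is a minimal sketch that computes the three distances with Dijkstra and then tests Inequality #1. The two small topologies (including the node names R2 and D and all link costs) are hypothetical; they were picked only so that the distances reproduce the 2 < (1 + 2) and 3 < 3 cases above, and are not a copy of fig. 2 or fig. 3.

```python
import heapq

def shortest_distances(graph, source):
    """Plain Dijkstra: shortest distance from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue                      # stale heap entry
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

def is_link_protecting_lfa(graph, s, n, d):
    """Inequality #1: Distance_opt(N, D) < Distance_opt(N, S) + Distance_opt(S, D)."""
    from_n = shortest_distances(graph, n)
    from_s = shortest_distances(graph, s)
    return from_n[d] < from_n[s] + from_s[d]

def make_graph(links):
    """Build a symmetric adjacency map from (a, b, cost) tuples."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph

# Hypothetical topologies: S = R1, N = R3, destination D reached by R1 via R2.
fig2_like = make_graph([("R1", "R2", 1), ("R2", "D", 1), ("R1", "R3", 1), ("R3", "D", 2)])
fig3_like = make_graph([("R1", "R2", 1), ("R2", "D", 1), ("R1", "R3", 1), ("R3", "D", 5)])

print(is_link_protecting_lfa(fig2_like, "R1", "R3", "D"))  # 2 < 1 + 2
print(is_link_protecting_lfa(fig3_like, "R1", "R3", "D"))  # 3 < 1 + 2
```

In the second topology R3’s own link to D is so expensive that its shortest path runs back through R1, which is exactly the situation the inequality rejects.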
Inequality #2 (Downstream Path Criterion)
D(N,D) < D(S,D)
Okay, so this is a more restrictive condition than Inequality #1, and it helps avoid micro-loops in more extensive failure situations such as a node failure.
In fig. 4, based on Inequality #1, R1’s LFA will be R2, as it satisfies the inequality:
15 < (7 + 16)
And R2 can pick R1 as its LFA because it satisfies Inequality#1 condition.
16 < (7+ 15)
This is okay if a link failure happens between R2-R3 or R1-R3. But if a node failure (R3) happens, then we will have a micro-loop between R1 and R2, as they will both forward to each other. That’s where Inequality #2 comes into the picture.
In fig. 5, based on Inequality #2, R1’s LFA will be R2 because it satisfies the downstream path criterion:
15 < 16
But R1 doesn’t qualify as an LFA for R2 because 16 < 15 isn’t true. So R2 will choose R4 as its LFA (12 < 15).
This condition helps in avoiding any kind of micro loops, but since it’s more restrictive, it can also reduce the LFA coverage in a given network.
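The difference between the two conditions can be checked with the numbers used above: a distance of 16 from R1 to the destination, 15 from R2, 12 from R4, and 7 between R1 and R2. The function names below are my own, and the sketch takes the distances as given rather than computing them from a topology.

```python
def link_protecting(d_nd, d_ns, d_sd):
    """Inequality #1: D(N,D) < D(N,S) + D(S,D)."""
    return d_nd < d_ns + d_sd

def downstream(d_nd, d_sd):
    """Inequality #2: D(N,D) < D(S,D)."""
    return d_nd < d_sd

dist_to_dest = {"R1": 16, "R2": 15, "R4": 12}   # distances quoted in the text
d_r1_r2 = 7                                      # distance between R1 and R2

# Inequality #1 lets R1 and R2 pick each other, which loops if R3 itself fails.
r1_uses_r2 = link_protecting(dist_to_dest["R2"], d_r1_r2, dist_to_dest["R1"])
r2_uses_r1 = link_protecting(dist_to_dest["R1"], d_r1_r2, dist_to_dest["R2"])
print(r1_uses_r2, r2_uses_r1)

# Inequality #2 only allows the neighbor that is strictly closer to the destination.
print(downstream(dist_to_dest["R2"], dist_to_dest["R1"]))  # R1 -> R2
print(downstream(dist_to_dest["R1"], dist_to_dest["R2"]))  # R2 -> R1
print(downstream(dist_to_dest["R4"], dist_to_dest["R2"]))  # R2 -> R4
```

Because the downstream criterion only ever moves traffic strictly closer to the destination, two routers can never select each other as alternates.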
Inequality #3 (Node Protection)
D(N,D) < D(N,E) + D(E,D)
“N’s path to D must not go through E.” In other words, the distance from node N to the prefix via the primary next hop E is strictly greater than the optimum distance from node N to the prefix.
This condition tries to provide node protection. Basically, it ensures that N’s path doesn’t pass through E, where E is the primary next hop of S. If you look at fig. 6, N as an LFA for R1 (S) provides node protection, as it satisfies Inequality #3, i.e. (2 < 5 + 1).
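As a small sketch, the node-protection test is one more comparison on precomputed distances. The first call uses the fig. 6 values quoted above (2, 5 and 1); the second call uses a made-up distance of 6 to show the failing case where N’s shortest path is exactly the one through E.

```python
def node_protecting(d_nd, d_ne, d_ed):
    """Inequality #3: D(N,D) < D(N,E) + D(E,D), i.e. N does not route via E."""
    return d_nd < d_ne + d_ed

# Values quoted for fig. 6: D(N,D) = 2, D(N,E) = 5, D(E,D) = 1.
print(node_protecting(2, 5, 1))

# Hypothetical counter-example: N's best path to D is the one through E (5 + 1 = 6).
print(node_protecting(6, 5, 1))
```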
Note: Juniper doesn’t use inequalities mentioned in RFC5286 for calculating LFA paths and instead uses the concept of Track-list to find LFA paths.
Per-Link and Per-Prefix LFA
Per-Link: Per-link LFA just looks at the primary link and, based on Inequality #1, determines whether there is a backup path for that link. This backup path will carry all the destinations that used the primary link. Per-link LFA is less CPU intensive than per-prefix, as rSPF is run per neighbor, and it will also try to provide node protection if possible, but no guarantees :).
Per-link LFAs are bad from a capacity planning perspective. Let’s assume that we have per-link LFA in the below fig. 7 and the R1-R4 link is acting as a backup for R1-R3. When the R1-R3 link fails, all the prefixes destined to R5 and R6 will fall back on the R1-R4 and R4-R3 links, which could create congestion on these links.
Depending on the topology, sometimes per-link LFA isn’t possible. In the below fig. 8, you can see that Inequality #1 isn’t met for prefix 20.x.x.x at R2, and isn’t met for prefix 10.x.x.x at R6. If the link between R1 and R4 goes down and R1 chooses R2 as its backup LFA, then R2 will send packets for prefix 20.x.x.x back to R1. If R1 chooses R6 as its backup LFA, then R6 will send packets for prefix 10.x.x.x back to R1.
Per-Prefix: Per-prefix LFA determines the backup path for each prefix, which means you can have different backup paths for different prefixes. In general, per-prefix coverage will be better than per-link, but it’s also more CPU intensive.
If we look at fig. 9 for per-prefix, we now have an LFA for both prefixes: Inequality #1 is met by R2 for 10.x.x.x, and by R6 for 20.x.x.x.
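The contrast between the two modes can be sketched as follows. All distances here are hypothetical; they were chosen only so that R2 is loop-free for 10.x.x.x and R6 for 20.x.x.x, as described above, and are not read from fig. 8 or fig. 9. A per-link alternate has to pass Inequality #1 for every prefix behind the protected link, while a per-prefix alternate only has to pass it for one.

```python
def per_prefix_lfas(neighbors, prefixes, d_ns, d_sp, d_np):
    """For each prefix, list the neighbors that satisfy Inequality #1."""
    return {
        p: [n for n in neighbors if d_np[(n, p)] < d_ns[n] + d_sp[p]]
        for p in prefixes
    }

def per_link_lfas(neighbors, prefixes, d_ns, d_sp, d_np):
    """List the neighbors that satisfy Inequality #1 for every prefix at once."""
    return [
        n for n in neighbors
        if all(d_np[(n, p)] < d_ns[n] + d_sp[p] for p in prefixes)
    ]

# Hypothetical distances as seen from S = R1 with candidate neighbors R2 and R6.
neighbors = ["R2", "R6"]
prefixes = ["10.x.x.x", "20.x.x.x"]
d_ns = {"R2": 1, "R6": 1}                     # neighbor -> S
d_sp = {"10.x.x.x": 2, "20.x.x.x": 2}         # S -> prefix
d_np = {("R2", "10.x.x.x"): 2, ("R2", "20.x.x.x"): 3,
        ("R6", "10.x.x.x"): 3, ("R6", "20.x.x.x"): 2}

print(per_link_lfas(neighbors, prefixes, d_ns, d_sp, d_np))
print(per_prefix_lfas(neighbors, prefixes, d_ns, d_sp, d_np))
```

With these numbers no single neighbor can back up the whole link, but each prefix still gets its own alternate.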
But even per-prefix LFA won’t be possible in every topology. For instance, rings are the worst case for LFAs. In the below fig. 10, if the R1-R2 link fails, R1 can’t send traffic to R4 as a backup, because R4’s best path toward prefix 10.x.x.x is through R1. We will look at Remote LFA to fix this problem later.
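The ring problem can be reproduced with a few lines. This is a sketch under assumptions: a four-node ring R1-R2-R3-R4-R1 with made-up link costs, and the prefix attached to R2; none of this is taken from fig. 10. Distances on a ring are simply the cheaper of the two directions.

```python
def ring_distance(costs, a, b):
    """Shortest distance on a ring; costs[i] is the link from node i to node (i + 1) % n."""
    n = len(costs)
    clockwise = 0
    i = a
    while i != b:
        clockwise += costs[i]
        i = (i + 1) % n
    return min(clockwise, sum(costs) - clockwise)

# Hypothetical ring: index 0..3 = R1..R4, link costs R1-R2=1, R2-R3=1, R3-R4=2, R4-R1=1.
costs = [1, 1, 2, 1]
S, D, N = 0, 1, 3          # S = R1, prefix behind R2, candidate alternate R4

d_nd = ring_distance(costs, N, D)
d_ns = ring_distance(costs, N, S)
d_sd = ring_distance(costs, S, D)
is_lfa = d_nd < d_ns + d_sd   # Inequality #1
print(d_nd, d_ns, d_sd, is_lfa)
```

Here R4’s shortest path to the prefix is the one through R1, so the inequality cannot hold and R1 has no loop-free alternate.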
LFA Support for LDP
So far we have talked about how backup paths can be calculated and pre-programmed in the FIB to achieve FRR. But what if the network is LDP switched? In that case the protecting node also has to obtain a label for the FEC, which it will use when sending on the backup path. This can be achieved easily if LDP operates in liberal label retention mode with downstream unsolicited label advertisement.
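Why the retention mode matters can be shown with a toy label table. This is a conceptual sketch only: the class, the FEC, the neighbor names and the label values are all hypothetical, and it does not model real LDP messages. With liberal retention the router keeps the label advertised by every neighbor, so the label for the LFA neighbor is already there when the backup entry is programmed; with conservative retention only the current next hop’s label is kept.

```python
class LabelTable:
    """Toy model of label retention: FEC -> {neighbor: label}."""

    def __init__(self, liberal=True):
        self.liberal = liberal
        self.bindings = {}

    def receive(self, fec, neighbor, label, is_next_hop):
        # Conservative retention drops bindings from neighbors that are not the next hop.
        if self.liberal or is_next_hop:
            self.bindings.setdefault(fec, {})[neighbor] = label

    def label_for(self, fec, neighbor):
        return self.bindings.get(fec, {}).get(neighbor)

fec = "10.0.0.0/24"                       # hypothetical FEC
for table in (LabelTable(liberal=True), LabelTable(liberal=False)):
    table.receive(fec, "CR1", 100, is_next_hop=True)    # primary next hop
    table.receive(fec, "CR2", 200, is_next_hop=False)   # LFA neighbor
    print(table.liberal, table.label_for(fec, "CR1"), table.label_for(fec, "CR2"))
```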
Applicability of LFA
LFAs provide better coverage in meshed networks and the worst coverage in ring topologies. NSP POPs are the sweet spot for LFAs. LFA can act as a complementary technology to RSVP-TE. For instance, an NSP can use an RSVP-TE mesh in the core and use LFAs in the POP to provide FRR.
Let’s look at a few sample POP scenarios.
In the below fig. 12, we have a triangle topology within the POP. Let’s analyze LFA coverage for Prefix-1 and Prefix-2 when the PE1-CR1 link fails. In this topology, CR2 can act as a valid LFA if the PE1-CR1 link fails. Under normal conditions PE1 will have an ECMP path through CR1 and CR2 for Prefix-1. CR2 will provide both link and node protection.
For Prefix-2, CR2 acts as a valid LFA as it meets both Inequality #1 (3 < 11 + 12) and #3 (3 < 5 + 2).
Case: Square Topology
Now let’s look at the case of a square topology like in the below fig. 13, where C<A. In case of a PE1-CR1 failure, PE2 acts as a valid LFA and provides link and node protection for Prefix-2, as it meets Inequality #1 (13 < 10 + 12) and #3 (13 < 15 + 2). But the problem is that for the downstream traffic from CR1 towards PE1, there is no valid LFA.
Now if we make C>A, then CR1 will have CR2 as a valid LFA for downstream traffic towards PE1.
RFC 6571 https://tools.ietf.org/html/rfc6571 presents a good analysis for various topologies. Please go through this for more details.
Continued in Part 2