This is a continuation of Part 2.
Why Fast Reroute?
Many NSPs, like ACME, carry traffic with tight SLAs. For instance, below is the ITU delay recommendation for voice.
| One Way Delay | Characterization of Quality |
| --- | --- |
| 0–150 ms | Acceptable for most applications |
| 150–400 ms | May impact some applications |

ITU-T G.114 delay recommendations
Fast recovery during a failure is essential to maintaining SLAs in a multi-service network like ACME's.
One option for providing fast recovery after a link failure is protection at layer 1. This is usually achieved with SONET APS, in which a standby link is ready to take over if the active link fails. But this approach is expensive and covers only link failures.
MPLS FRR can provide similar recovery guarantees while protecting against both link and node failures. One thing to keep in mind is that MPLS TE Fast Reroute is a temporary recovery mechanism: the protected TE LSPs are quickly and locally rerouted onto backup tunnels using a local protection technique, but the path followed by the rerouted flows may no longer be optimal. Once the head end learns of the failure, it calculates an optimal path and sets up a new LSP in a make-before-break fashion.
In some networks, there might be an interest in MPLS TE just for its fast recovery property.
An NSP running an overprovisioned network may not care about bandwidth optimization or QoS/SLA guarantees, since there is very little chance of congestion in any situation. There are two strategies for deploying MPLS-TE when the operator's sole objective is Fast Reroute:
- With a full mesh of unconstrained TE LSPs
- With one-hop unconstrained TE LSPs
Mechanisms for deploying FRR
Path protection supports the configuration of primary and secondary physical paths for an LSP to protect against link and node failures. This process is known as global repair. The primary path is the preferred path, while the secondary path is used as an alternative route only when the primary path fails. There are two types of secondary paths:
- Hot standby
- Cold standby
A hot standby secondary path is precomputed and pre-established, while a cold standby secondary path is precomputed but not pre-established. More than one secondary path can be associated with an LSP, with a configurable order of preference. The main advantage of this approach is the ability to specify disjoint paths across the backbone, if the network's physical topology allows it.
If a link or node in the primary path fails, the LSR immediately upstream of the outage notifies the ingress LSR of the failure with RSVP-TE. Upon receipt of the outage notification, the ingress LSR reroutes traffic from the failed primary path to the secondary path. The use of hot standby secondary paths improves recovery time by eliminating the call-setup delay that is required to establish a new physical path for the LSP. Because resources are reserved even when the primary path is active, hot standby secondary paths waste more resources than cold standby secondary paths. Nevertheless, the restoration time for hot standby secondary paths is much faster. When the primary path is reestablished, the ingress LSR automatically switches traffic from the secondary path back to the primary path.
In Fig. 16, the primary path traverses the top part of the network, and there is a secondary LSP to the same tail-end router, which can be active or standby. It's important that the secondary LSP traverse a diverse path; otherwise both LSPs will fail in the same event. Also, with an active (hot standby) secondary LSP, resources are double-booked. For example, if the BLUE LSP is 100 Mb/s, an active secondary RED LSP will book another 100 Mb/s between CR-2 — P1 and P5 — CR-3.
The switchover time from the primary to the standby LSP is driven by how long it takes RSVP error messages to reach the head end, and those messages can also be lost on the way. To detect tunnel failure faster, one option is to run BFD over the LSP, which decreases the failure-detection time.
All in all, you should look at path protection when:
- A limited number of TE LSPs need to be protected in a large network; path protection is simple to deploy compared to other FRR schemes.
- You want better control over the fate of traffic in the event of failure (you can control which path the secondary LSP takes).
The disadvantages of path protection are:
- Double booking of resources, which can be a roadblock, especially in full-mesh TE networks.
- Slower recovery than facility or one-to-one backup, which may be an issue for sensitive traffic like voice. The reason is that the failure notification must reach the head end before traffic can be switched to the secondary LSP.
One to One Backups (1:1)
In one-to-one backup, a separate detour LSP is established for each protected LSP. A single label (or single-level label stack) is used throughout the detour. In Fig. 17 below there are three LSPs from R1, terminating at R8, R5, and R10. R2 (the Point of Local Repair, or PLR) provides protection by creating three detour LSPs. The detours for the LSPs to R5 and R10 merge at R3 (the Merge Point), where they rejoin the main LSPs.
Though it's desirable to merge a detour LSP back into its main LSP whenever feasible, to minimize the number of LSPs in the network, it's not mandatory. Since each detour LSP is dedicated to one LSP, all it needs to do is follow the shortest path to the egress node. For instance, the detour for the R1-to-R8 LSP doesn't need to merge back at R3, because the shortest path from R2 to R8 is R2-R6-R7-R8. If the shortest path to the egress node intersects the main LSP's path, the detour merges back into the main LSP, as with the R1-to-R5 and R1-to-R10 LSPs. When a detour LSP intersects its main LSP at a Merge Point with the same outgoing interface, it is merged; the Merge Point is responsible for mapping both the main LSP's inbound label and the detour LSP's inbound label to the same outbound label/action.
Below is another example of one-to-one backup LSPs, this time for the node-failure case.
The key thing to keep in mind when considering one-to-one backup for FRR is scalability: it scales poorly in large environments due to its 1:1 nature.
Note: Juniper refers to one-to-one backup as "fast reroute" in its documentation, while Cisco uses "FRR" for facility backup, which can cause confusion when a Cisco engineer is talking to a Juniper engineer. Also, Cisco supports only facility backup, not one-to-one backup.
Facility Backup (N:1)
Facility backup creates a single bypass tunnel to back up a set of main LSPs traversing between the PLR and a common node downstream of the potential failure. This backup technique uses the MPLS label stack. In Fig. 19 below, R2 (the PLR) has constructed a bypass LSP that protects against failure of the R2-R3 link. This single bypass LSP protects all three LSPs crossing the R2-R3 link. Protecting N LSPs crossing a link with a single bypass LSP greatly improves scalability.
When a failure occurs along a main LSP, the PLR uses label stacking to redirect traffic onto the appropriate bypass tunnel. The idea is very simple: R2 knows what label R3 expects, because that is what R3 told R2 in its Resv message. If R2 can deliver that same label to R3 over another path during a failure, R3 will perform the same swap action as before. To achieve this, the PLR (R2) uses a two-level label stack as it redirects traffic into the bypass tunnel: the bypass tunnel label on top and the main LSP label (the one the Merge Point, R3 in our case, expects) at the bottom. R6 swaps the top label and sends the packet to R7. R7 performs PHP, exposing the bottom label (the one R3 originally expected), and sends it to R3. R3 thus receives the packet with the same label it expected from R2 on the main LSP before the link failed.
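The label operations above can be sketched as a small simulation. The topology follows the Fig. 19 description, but the label values and LFIB entries below are illustrative assumptions, not taken from any real configuration.

```python
# Illustrative simulation of facility-backup label stacking (assumed labels).
# Main LSP crosses link R2-R3; the bypass around that link is R2 -> R6 -> R7 -> R3.

# Per-router label forwarding tables: incoming label -> (action, outgoing label, next hop).
LFIB = {
    "R2": {100: ("swap", 200, "R3")},   # main LSP: R3 advertised label 200 in its Resv
    "R3": {200: ("swap", 300, "R4")},   # R3 expects 200, regardless of arriving interface
    "R6": {500: ("swap", 501, "R7")},   # bypass LSP label assigned by R6
    "R7": {501: ("php", None, "R3")},   # penultimate hop pops the bypass label
}

def forward(node, stack):
    """Apply one router's LFIB action to the packet's label stack (top = last item)."""
    action, out_label, next_hop = LFIB[node][stack[-1]]
    if action == "swap":
        stack[-1] = out_label
    elif action == "php":
        stack.pop()                      # expose the label underneath
    return next_hop, stack

# Link R2-R3 fails: R2 pushes the bypass label on top of the label R3 expects.
stack = [200]                            # bottom label: what R3 expects, from R3's Resv
stack.append(500)                        # top label: bypass tunnel toward R6
node = "R6"
while node != "R3":
    node, stack = forward(node, stack)

print(node, stack)                       # R3 receives label 200, exactly as on the main LSP
```

The packet arrives at R3 carrying label 200, so R3's forwarding state is untouched by the reroute; this is the whole point of the two-level stack.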
Fig. 20 below illustrates node protection with facility backup, which is a little more complicated than link protection. In this example, when node R3 fails, two bypass LSPs from R2 (the PLR) are needed: one from R2 to R7 (the node after R3 on the R1-to-R8 LSP), and another to R4 (the node after R3 on the R1-to-R5 and R1-to-R10 LSPs).
The idea is the same as in the link-failure case: if R2 can deliver the labels R7 and R4 were expecting from R3, they will perform the same actions. But unlike before, where R2 knew what label R3 expected, R2 does not know what labels R7 and R4 expect, because they signaled those only to R3 in their Resv messages. If all the labels are recorded as the Resv messages travel back from the egress node to the ingress node, then a PLR can learn, by examining the Resv message, what label the next-next-hop expects.
An extended RECORD_ROUTE object (RRO) allows R2 to learn this label when the "label recording desired" flag is set in the SESSION_ATTRIBUTE object. Each LSR along the path inserts its label value into the RRO, followed by its corresponding IP address information. The RRO is organized last-in-first-out (as a stack), so the most recent LSR to write its route information becomes the top-level entry. In simple words, R2 can examine the RRO in the Resv message and learn the incoming labels used by all downstream nodes for this LSP. Problem solved.
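The PLR's RRO lookup can be sketched as follows. The node names follow Fig. 20, but the label values and the simplified flat representation of the RRO are assumptions for illustration (a real RRO is a TLV-encoded list of sub-objects).

```python
# Sketch: a PLR scanning an RRO (carried in a Resv message) to find the label
# expected by the node after its next hop (the NNHOP). The RRO is stack-ordered:
# the node closest to the PLR appears first. Label values are made up.

rro = [  # (node address, label advertised by that node), nearest node first
    ("R3", 200),
    ("R4", 310),   # NNHOP label needed when protecting node R3
    ("R5", 3),     # implicit-null at the egress
]

def nnhop_label(rro, next_hop):
    """Return (node, label) of the node recorded immediately after next_hop."""
    for i, (node, _label) in enumerate(rro):
        if node == next_hop and i + 1 < len(rro):
            return rro[i + 1]
    return None

print(nnhop_label(rro, "R3"))   # ('R4', 310): the inner label for the node-protection bypass
```

During node protection of R3, the PLR would push this NNHOP label (310) as the bottom of the stack instead of R3's label.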
All in all, you should look at deploying facility backup (Juniper: bypass LSP, Cisco: FRR) when:
- Multivendor backbone: in a mixed Juniper/Cisco environment, facility backup is the only local-protection method supported by both vendors.
- Scalability: bypass LSPs are more scalable than one-to-one backup due to their N:1 nature; the number of backup tunnels is a function of the number of network elements to protect, not the number of LSPs.
- Bandwidth guarantees: facility backup can provide bandwidth, propagation-delay, and jitter guarantees in the case of link/SRLG/node failure, and the required backup capacity can be drastically reduced thanks to bandwidth sharing between backup tunnels protecting independent resources.
In Juniper's bypass LSP implementation, once you enable node/link protection at the head end, backup LSPs are formed automatically. In Cisco's implementation, you use the auto-tunnel backup feature to have node/link protection configured automatically. Setting up backup LSPs manually is a tedious task in a large network.
For an NSP like ACME that runs LDP over RSVP (LDPoRSVP), make sure the MTU accounts for at least four labels, which is the stack depth with facility-backup FRR during a failure.
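The overhead is simple to quantify: each MPLS shim header is 4 bytes, so a four-label stack adds 16 bytes on top of the IP payload. A quick sanity check (the 1500-byte payload is just an example, not a figure from the article):

```python
LABEL_SIZE = 4          # bytes per MPLS shim header
labels = 4              # stack depth per the LDPoRSVP + facility-backup guideline above
ip_mtu = 1500           # example customer payload size

required_mpls_mtu = ip_mtu + labels * LABEL_SIZE
print(required_mpls_mtu)   # 1516: core links must support at least this MPLS MTU
```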
Scalability Analysis of Path Protection, Facility Backup and One to One Backup
Let's evaluate the number of backup tunnels required with global path protection, Fast Reroute facility backup, and one-to-one backup. The formulas below are taken from "Network Recovery: Protection and Restoration of Optical, SONET-SDH, IP, and MPLS". Assumptions are:
- D: network diameter (average number of hops between a head-end LSR and a tail-end LSR). Let's assume this number to be 5 in the case of ACME.
- C: degree of connectivity (average number of neighbors)
- L: total number of links to be protected with Fast Reroute
- N: total number of nodes (LSRs)
- T: total number of protected TE LSPs in the MPLS network
- Bu: number of backup tunnels required
- K: number of classes of recovery (in our case there are two, one for voice and another for data). K = 2.
- S: average number of splits (where bandwidth protection is required and backup bandwidth is very scarce, more than one backup tunnel per protected link/node may be needed if a single backup tunnel with enough bandwidth cannot be found). We assume there is enough backup bandwidth available for both voice and data backup tunnels, so S = 1.
- M: number of meshes in the network (there may be multiple meshes of TE LSPs serving different purposes, e.g., one mesh for voice traffic and one for data traffic). In our case M = 2, one for voice and another for data.
→ L ≤ N x C (because some links may not be protected by Fast Reroute)
→ T = M x N x (N – 1) (assuming a full mesh TE deployment)
Let us now compute the total number of required backup tunnels Bu with global path protection, Fast Reroute one-to-one, and facility backup.
- Number of backup tunnels Bu with global path protection:
Bu = T = M x N x (N-1)
- Number of backup tunnels Bu with facility backup:
- For link protection, then Bu = L x K x S
- For link and node protection, if both links and nodes are protected with Fast Reroute:
Bu = (L + N x C x (C-1)) x K x S
But in our case we will modify the formula and count only node protection, since node protection implicitly covers link protection as well.
So Bu = (N x C x (C-1)) x K x S
- Number of backup tunnels Bu with one-to-one backup (without merging):
Bu = M x N x (N-1) x D
Let’s now fill the variables of the formula according to ACME network
■ D (diameter) = 5 (assumed value for ACME)
■ C (degree of connectivity) = 4 (Assume 4 is avg. number of neighbors)
■ M (number of meshes) = 2 (one mesh for voice and one mesh for data trafﬁc)
■ K = 2 (two classes of recovery: one for voice with bandwidth protection and one for data without bandwidth protection)
■ S = 1 (In our case we would assume there is enough backup bandwidth available for both Voice and Data backup tunnels. So S=1 in our case.)
■ All links must be protected by Fast Reroute: L = N x C
■ N = 100 (50 core locations with 2 core routers per location)
Let us now compare Bu for global path protection, Fast Reroute one-to-one, and facility backup, using the previous formulas:
- Global path protection: Bu = T = M x N x (N-1) = 2 x 100 x 99 = 19,800
- Local protection, facility backup (node protection): Bu = (N x C x (C-1)) x K x S = (100 x 4 x 3) x 2 x 1 = 2,400
- Local protection, facility backup (link protection): Bu = L x K x S = (100 x 4) x 2 x 1 = 800
- Local protection, node + link protection: 2,400 + 800 = 3,200
- Local protection, one-to-one backup: Bu = M x N x (N-1) x D = 2 x 100 x 99 x 5 = 99,000
These results clearly show that both global path protection and Fast Reroute one-to-one backup scale poorly in large environments.
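The comparison can be reproduced with a short script using the formulas and ACME values from this section:

```python
# Backup-tunnel counts for ACME, per the formulas above.
N = 100   # nodes (50 locations x 2 core routers)
C = 4     # average degree of connectivity
D = 5     # network diameter in hops
M = 2     # TE meshes (voice, data)
K = 2     # classes of recovery
S = 1     # average number of splits

T = M * N * (N - 1)                        # protected LSPs in a full mesh
L = N * C                                  # links protected by Fast Reroute

path_protection = T                        # one secondary per protected LSP
facility_link = L * K * S                  # one bypass per link, class, and split
facility_node = N * C * (C - 1) * K * S    # bypass per (node, in/out neighbor pair)
one_to_one = M * N * (N - 1) * D           # a detour per LSP per hop (no merging)

print(path_protection)                 # 19800
print(facility_link)                   # 800
print(facility_node)                   # 2400
print(facility_node + facility_link)   # 3200
print(one_to_one)                      # 99000
```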
What is ACME doing?
ACME decided on link protection with bandwidth guarantees for the voice TE mesh and plain link protection for the data TE mesh, for the following reasons:
- Global path protection: because of its slower recovery time compared to facility and one-to-one backup, ACME decided not to deploy global path protection.
- One-to-one backup: poor scalability and ACME's multivendor backbone were the two biggest reasons for not choosing one-to-one backup for FRR.
- Facility protection: an analysis of ACME's network showed that node failures were very rare in the backbone but link failures were common, so they enabled link protection for both the data and voice TE meshes. In addition, they wanted guaranteed bandwidth for voice traffic in the event of failure, so they enabled bandwidth-protected link protection for the voice TE mesh.
Last but not least, make sure to identify paths that share fate. For instance, two different IP paths may follow the same optical path. A router has no knowledge of the optical layer and could use the second IP path as a backup for the first; a fiber cut would then take down both the primary and the backup path. This is why such links need to be identified and configured as an SRLG (Shared Risk Link Group).
Case for Offline Path Calculation
A lot of operators favor an offline traffic-engineering approach because it gives them more control over where paths are placed. In this approach, all traffic-routing decisions are made by centralized servers. These servers have a global view of the network and its resource availability, and they can react to changes in traffic loads. They are also very helpful in overcoming the restrictions of a single IGP area and can calculate optimal paths for inter-area and inter-AS scenarios. One of the biggest advantages of offline path calculation is that it can perform complex calculations and modeling before and after a traffic-engineered path is applied to the live network. This can help in modeling situations like the impact of one or multiple link failures, or the latency induced by bypass LSPs.
In essence, an offline tool has a global view of the network, can employ more sophisticated algorithms than CSPF, and can weigh a variety of factors against a particular network goal, such as minimizing maximum link utilization, achieving protection against single and multiple failures, or minimizing LSP churn, and produce an optimal placement of LSPs accordingly. A router running CSPF, by contrast, takes into account only the LSPs it originates. From an operational perspective, offline path computation reduces the chance of making major mistakes, though it can also work the other way and increase the blast radius when the tool itself errs.
Let's take a look at an example. In Fig. 22 below, each link is 100 Mb/s and there are three LSPs from R2, to R8, R9, and R10. In the top part of Fig. 22, if a failure happens between R6 and R7, the R2–R8 LSP doesn't have enough bandwidth left. In the bottom part, if we move the R2–R9 and R2–R10 LSPs to the bottom link, the R2–R8 LSP has enough bandwidth even in the case of that link failure.
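The kind of "what-if" check an offline tool runs for Fig. 22 can be sketched numerically. The 40 Mb/s per-LSP reservations and the initial placement below are assumptions for illustration; the article specifies only the 100 Mb/s link capacity.

```python
# Sketch of an offline failure check: after a link failure, does the surviving
# path have enough headroom for the rerouted LSP? Bandwidth figures are assumed.
LINK_CAPACITY = 100                                 # Mb/s per link, per the figure
BW = {"R2-R8": 40, "R2-R9": 40, "R2-R10": 40}       # assumed reservations

def headroom(path_load):
    """Free capacity left on a path carrying path_load Mb/s."""
    return LINK_CAPACITY - path_load

# Before re-placement (assumed): the backup path for R2-R8 already carries the
# R2-R9 and R2-R10 LSPs, so a failure of R6-R7 cannot be absorbed.
before = headroom(BW["R2-R9"] + BW["R2-R10"])       # 100 - 80 = 20 Mb/s free
print(before >= BW["R2-R8"])                        # False: rerouted R2-R8 won't fit

# After the offline tool moves R2-R9 and R2-R10 onto the other path,
# the backup path is empty and the rerouted R2-R8 fits.
after = headroom(0)
print(after >= BW["R2-R8"])                         # True
```

A real offline tool runs this check for every LSP against every modeled failure, which is exactly the computation a single router doing CSPF cannot do for LSPs it does not originate.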
We have looked at different MPLS-TE design aspects of a large-scale network. By no means does this article cover everything, but I hope it gives readers an idea of the various options and aspects to keep in mind while designing and deploying MPLS-TE.
References
- MPLS-Enabled Applications: Emerging Developments and New Technologies, Third Edition
- Deploying IP and MPLS QoS for Multiservice Networks
- MPLS: Next Steps
- Network Recovery: Protection and Restoration of Optical, SONET-SDH, IP, and MPLS.
- QoS Performance Analysis in Deployment of DiffServ-aware MPLS Traffic Engineering
- Best Practices in Network Planning and Traffic Engineering
- Deploying Diffserv in Backbone Networks for Tight SLA Control
- A Practical Approach for Providing QoS in the Internet Backbone traffic Engineering