In this post we will be exploring the shortcomings of MVPN (Draft Rosen/RFC 6037), with a focus on how NG-MVPN technologies address these limitations.
The base specification for BGP/MPLS VPNs, RFC 4364, only addresses unicast. The first proposal for multicast support in BGP/MPLS VPNs is commonly known as Draft Rosen (now RFC 6037). Most multicast implementations are based on the draft and predate RFC 6037. Draft Rosen is not fully consistent with standard MBGP unicast VPN and has the following limitations:
- Limited options for Transport: Draft Rosen only defines GRE or IP-in-IP tunneling for multicast traffic and uses PIM to build trees. Most implementations based on the draft preferred GRE encapsulation over IP-in-IP. The draft leaves no room for other tunneling technologies such as P2MP RSVP-TE, P2MP mLDP and MP2MP mLDP.
- Control Plane Scalability: In Draft Rosen, each CE router maintains PIM neighbor relationships with its PEs, and each PE maintains PIM adjacencies with the other PEs that are part of the same MVPN. PE adjacencies have to be maintained per MVPN, per PE, which creates control plane scalability issues. For example, consider a PE supporting 100 MVPN services distributed across 100 PEs. Each PE has to maintain 9900 (99×100) PIM adjacencies, in addition to the adjacencies to its directly connected CEs. To keep 9900 PIM adjacencies alive, the PE has to handle approximately 330 PIM hello packets per second (9900 / 30, using the default 30s PIM hello timer), a significant load on the PE’s control plane. The numbers get worse as the number of MVPN services or PEs increases. This isn’t a problem with unicast VPNs, where a PE only has to maintain a single BGP session with each other PE, or just a single session with the RR, regardless of the number of VPNs present on the PE.
- Availability: Draft Rosen does not specify any protection mechanisms like FRR or custom traffic engineered trees.
- Operational Consistency: Draft Rosen specifies the use of different protocols for Unicast and Multicast (PIM for Control and GRE for Data). Maintaining multiple protocols for unicast and multicast increases operational cost and complexity. Ideally we would leverage a unified control (MP-BGP) and data plane (MPLS) for unicast and multicast.
- Maintaining State in the Backbone: Draft Rosen requires P routers to run PIM, which means each P router has to maintain multicast state for at least as many trees as there are MVPNs, because there is at least one Default MDT per VPN. The P routers are burdened further if Data MDTs are also used. Compare this with unicast VPNs, where P routers don’t maintain any per-VPN state.
- Lack of Aggregation: Draft Rosen does not specify any capabilities to aggregate multiple MVPNs into a single P-Tree. In an ideal situation one would want to carry traffic from multiple VPNs into a single multicast tree.
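The adjacency arithmetic from the Control Plane Scalability point can be reproduced with a few lines of Python. This is only a sketch: the function name and parameters are my own, and it assumes every MVPN is present on every PE and that each adjacency costs one hello per hello interval.

```python
def rosen_pim_load(num_pes: int, num_mvpns: int, hello_interval_s: float = 30.0):
    """Return (pe_pe_adjacencies, hellos_per_second) for a single PE.

    Assumption: each MVPN spans all PEs, so the PE has one PIM adjacency
    per remote PE per MVPN. PE-CE adjacencies are not counted.
    """
    adjacencies = (num_pes - 1) * num_mvpns
    hellos_per_second = adjacencies / hello_interval_s
    return adjacencies, hellos_per_second


adjacencies, hello_rate = rosen_pim_load(num_pes=100, num_mvpns=100)
print(adjacencies, hello_rate)  # 9900 330.0
```

Doubling either the number of PEs or the number of MVPNs roughly doubles both figures, which is the scaling problem described above.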
The limitations above can also be read as the wish list for NG MVPN. In this post we will give an overview of NG MVPN and look at how BGP can be leveraged for C-MCAST signaling. We won’t go into the various P-Tunnels such as P2MP RSVP-TE and mLDP, as they are already covered in detail elsewhere. I am also assuming that the reader knows basic multicast concepts and is familiar with Draft Rosen.
P = Provider
C = Customer
Introduction to NG-MVPN
The Next Generation multicast VPN framework uses a BGP control plane and offers a variety of options for data plane encapsulation. BGP is responsible for signaling both unicast and multicast information between PEs, removing the need to run PIM in the SP core. The elegance of a single control plane protocol is that it provides operational simplicity and ultimately lower OPEX. One important thing to keep in mind is that the NG MVPN framework was not intended to replace Draft Rosen, but to offer more choices and flexibility in addition to what Draft Rosen originally provided.
PMSI: P-Multicast Service Interface
NG-MVPN introduces the concept of the PMSI (P-Multicast Service Interface) to separate the “service” from the “transport” mechanism. A PMSI is a conceptual “overlay” on the P-network that refers to a “service”. This “overlay” can take packets from one PE belonging to a particular MVPN and deliver them to some or all of the other PEs belonging to that same MVPN. A PMSI always has the scope of a single MVPN, and a single MVPN can have one or more PMSIs. There are three types of PMSI:
- Multidirectional Inclusive PMSI (MI-PMSI) (Any PE ==> All PEs)
  - In an MI-PMSI, traffic sent by any ingress PE is received by all other PEs in a given MVPN instance.
- Unidirectional Inclusive PMSI (UI-PMSI) (Particular PE ==> All PEs)
  - In a UI-PMSI, traffic sent by one particular PE is received by all other PEs in a given MVPN instance.
- Selective PMSI (S-PMSI) (Particular PE ==> Subset of PEs)
  - In an S-PMSI, traffic sent by one particular PE is delivered to a subset of the PEs in a given MVPN instance.
Default MDT and Data MDT from Draft Rosen are examples of MI-PMSI and S-PMSI respectively. We will use the term “I-PMSI” when we are not distinguishing between “MI-PMSIs” and “UI-PMSIs”.
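To make the three delivery models concrete, here is a small sketch that returns the set of PEs a packet reaches on each PMSI type. The function, its parameters, and the PE names are hypothetical; it only models the definitions above, not any real implementation.

```python
def pmsi_receivers(kind, sender, members, root=None, leaves=()):
    """Return the PEs that receive a packet sent by `sender` on a PMSI.

    kind    -- "MI-PMSI", "UI-PMSI" or "S-PMSI"
    members -- all PEs in the MVPN instance
    root    -- the one PE allowed to send (UI-PMSI and S-PMSI only)
    leaves  -- the subset of PEs attached to an S-PMSI
    """
    members = set(members)
    if sender not in members:
        raise ValueError("sender is not part of this MVPN")
    if kind == "MI-PMSI":
        # Any PE may send; every other PE receives.
        return members - {sender}
    if sender != root:
        raise ValueError(f"only the root PE may send on a {kind}")
    if kind == "UI-PMSI":
        # One particular PE sends; every other PE receives.
        return members - {sender}
    if kind == "S-PMSI":
        # One particular PE sends; only the selected subset receives.
        return (set(leaves) & members) - {sender}
    raise ValueError(f"unknown PMSI type: {kind}")


pes = {"PE1", "PE2", "PE3", "PE4"}
print(sorted(pmsi_receivers("MI-PMSI", "PE2", pes)))
print(sorted(pmsi_receivers("S-PMSI", "PE1", pes, root="PE1", leaves={"PE2", "PE3"})))
```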
A PMSI can be instantiated by a number of different “transport” mechanisms. We will refer to these transport mechanisms as “P-tunnels”. A number of different tunnel setup techniques can be used to create the P-tunnels that instantiate the PMSIs, such as PIM (SSM, SM, or BiDir), P2MP RSVP-TE, mLDP (P2MP or MP2MP), or Ingress Replication. A P-Tunnel can carry a single PMSI or multiple PMSIs. Carrying multiple PMSIs in a single P-Tunnel provides better aggregation but may be sub-optimal, since a PE attached to the tunnel can receive traffic for a PMSI it has no use for.
As I mentioned earlier NG MVPN introduces a BGP control plane in the provider network for handling multicast. MVPN BGP is responsible for three major functions:
- Auto-discovery: This is the process of finding all of the PEs participating in a given MVPN instance.
- P-Tunnel Signaling: This provides a way for a PE to tell other PEs which method it is going to use for transporting C-Multicast traffic. Options include P2MP RSVP-TE, P2MP mLDP, MP2MP mLDP, mGRE or Ingress Replication.
- C-MCAST Route Signaling: This is a way of exchanging C-Multicast control plane state like C-Join, C-Prunes and C-Register messages between relevant PEs.
To provide the above functionality, seven new BGP route types have been introduced.
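For reference, the seven route types can be written down as a small enumeration. The numbering and names follow my reading of RFC 6514; the Python identifiers and the two helper sets are my own and purely illustrative.

```python
from enum import IntEnum


class McastVpnRouteType(IntEnum):
    """MCAST-VPN BGP route types (numbering as I understand RFC 6514)."""
    INTRA_AS_I_PMSI_AD = 1   # auto-discovery within one AS
    INTER_AS_I_PMSI_AD = 2   # auto-discovery across ASes
    S_PMSI_AD = 3            # announces a selective P-tunnel
    LEAF_AD = 4              # reply to "Leaf Information Required"
    SOURCE_ACTIVE_AD = 5     # announces an active (C-S, C-G)
    SHARED_TREE_JOIN = 6     # C-multicast (*, C-G) join
    SOURCE_TREE_JOIN = 7     # C-multicast (C-S, C-G) join


# Illustrative groupings (hypothetical helper names).
# Types 1-3 carry the PMSI Tunnel attribute; types 6-7 carry C-multicast state.
CARRIES_PMSI_TUNNEL_ATTR = {
    McastVpnRouteType.INTRA_AS_I_PMSI_AD,
    McastVpnRouteType.INTER_AS_I_PMSI_AD,
    McastVpnRouteType.S_PMSI_AD,
}
C_MULTICAST_ROUTES = {
    McastVpnRouteType.SHARED_TREE_JOIN,
    McastVpnRouteType.SOURCE_TREE_JOIN,
}

for route_type in McastVpnRouteType:
    print(int(route_type), route_type.name)
```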
BGP route types 1, 2, and 3 carry the PMSI Tunnel attribute. Its fields are:
Flags (L = Leaf Information Required): If the Leaf Information Required flag is set to 1, the PE/ASBR receiving the route must originate a new Leaf A-D route (Type 4).
Tunnel Type: This identifies the tunneling technology used to establish the PMSI tunnel. It also determines the syntax and semantics of the Tunnel Identifier field.
|Tunnel Type|Tunnel Type Name|
|---|---|
|1|RSVP-TE P2MP LSP|
|2|mLDP P2MP LSP|
|3|PIM-SSM Tree (P-S, P-G) with GRE Transport|
|4|PIM-SM Tree (*, P-G) with GRE Transport|
|5|PIM-BiDir Tree with GRE Transport|
|6|Ingress Replication, Set of Point-to-Point LSPs signaled via LDP or RSVP|
|7|mLDP MP2MP LSP|
Tunnel Identifier: Before a P-tunnel can be constructed to instantiate a PMSI, the PE must be able to create a unique identifier for the tunnel. The syntax of this identifier depends on the tunnel technology used. For instance, a P2MP RSVP-TE identifier would be <P2MP ID, Tunnel ID, Extended Tunnel ID> and a PIM identifier would look like <Sender IP, P-Multicast Address>.
MPLS Label: It’s usually zero except for aggregate trees. The concepts of Aggregation will be covered in future posts.
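The fields described above can be packed into bytes as sketched below. The layout (1-octet Flags, 1-octet Tunnel Type, 3-octet MPLS Label with the label in the high-order 20 bits, then a variable-length Tunnel Identifier) is my understanding of RFC 6514, so treat it as an assumption and check the RFC before relying on it. The class, function names and the example addresses are hypothetical.

```python
import ipaddress
from typing import NamedTuple

LEAF_INFO_REQUIRED = 0x01  # assumed position of the L flag (low-order bit)
TUNNEL_TYPE_PIM_SSM = 3    # from the Tunnel Type table above


class PmsiTunnelAttr(NamedTuple):
    flags: int
    tunnel_type: int
    mpls_label: int    # 20-bit label value, 0 when no label is carried
    tunnel_id: bytes   # syntax depends on tunnel_type


def encode(attr: PmsiTunnelAttr) -> bytes:
    if not 0 <= attr.mpls_label < (1 << 20):
        raise ValueError("MPLS label must fit in 20 bits")
    # The label occupies the high-order 20 bits of the 3-octet field.
    label_field = (attr.mpls_label << 4).to_bytes(3, "big")
    return bytes([attr.flags, attr.tunnel_type]) + label_field + attr.tunnel_id


def decode(data: bytes) -> PmsiTunnelAttr:
    if len(data) < 5:
        raise ValueError("PMSI Tunnel attribute is at least 5 octets")
    label = int.from_bytes(data[2:5], "big") >> 4
    return PmsiTunnelAttr(data[0], data[1], label, data[5:])


# PIM-SSM identifier <Sender IP, P-Multicast Address>; addresses are made up.
pim_id = (ipaddress.IPv4Address("192.0.2.1").packed
          + ipaddress.IPv4Address("232.1.1.1").packed)
attr = PmsiTunnelAttr(flags=0, tunnel_type=TUNNEL_TYPE_PIM_SSM,
                      mpls_label=0, tunnel_id=pim_id)
wire = encode(attr)
print(wire.hex(), decode(wire) == attr)
```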
At this point we’ve laid out why NG MVPN is needed along with the different BGP Route Types and various options for setting up P-Tunnels. Now let’s take a look at different scenarios and see how these new BGP routes fit in. We will be examining PIM-SSM and PIM-SM as the PE-CE protocol and from a provider perspective we’ll look at MI-PMSI (Default MDT) and S-PMSI (Data MDT) for tunneling C-Multicast traffic along with BGP for C-Route exchange.
1) PIM SSM as PE-CE Protocol
So in this scenario, our customer is using PIM-SSM and the provider is using BGP for C-Route exchange. The first thing the PEs need to do is discover other relevant PEs which are part of same MVPN instance. This can be achieved by Auto-Discovery.
Auto-Discovery and MI-PMSI P-Tunnel
Each PE advertises a BGP Intra-AS I-PMSI AD route. These routes are limited in scope to a single AS and are received by all other relevant PEs in the same MVPN instance. BGP route type 1 also contains the PMSI Tunnel Attribute, which tells what tunnel type a PE will be using for encapsulating C-MCAST traffic.
In Fig.1, every PE advertises a route type 1, which is imported by all other PEs with matching Route Targets. This allows every PE to learn about every other PE participating in the MVPN instance, including the type of tunnel each will use to encapsulate C-MCAST traffic. At this point, depending on the tunnel type, signaling takes place to set up the P-Tunnel between PEs. In this example, I am using P2MP mLDP. The goal of the P-Tunnel setup here is to provide a “service” equivalent to the Default MDT (i.e. any PE can send C-MCAST traffic to all other relevant PEs).
Now let’s assume that the PEs on the receiver side receive a (C-S, C-G) Join. When the receiver PEs get the PIM Join, they look up C-S in the routing table and find that PE1 is the next hop toward the source. Both PE2 and PE3 generate a BGP Route Type 7 (Source Tree Join) carrying the (C-S, C-G) details, with a Route Target that identifies PE1.
The Type 7 routes generated by the two PEs are identical (except for the Originator ID) and are sent to the RR. As you might expect, the RR reflects only the best path. In our example it chooses the route from PE3 and reflects only that route to PE1. Since the route matches its Route Target, PE1 accepts it and sends a PIM Join to CE1. At this point the source site knows there are interested receivers across the provider network, and the MI-PMSI (Default MDT) tunnel is up. The source site starts sending multicast traffic to the provider. PE1 receives the C-MCAST data and sends it into the P-Tunnel, which is P2MP mLDP in our case.
In Fig.3, the P-Tunnel makes sure that the C-MCAST traffic is received by all relevant PEs in the same MVPN. You may have noticed that PE4, which doesn’t have any interested receivers, is also getting the multicast feed, which is obviously not optimal. So how can we fix this? I’m glad you asked. That’s where S-PMSI (Data MDT) is going to help.
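The Type 7 exchange can be modelled in a few lines. This is a toy sketch, not BGP: the dataclass, the `rt-import:` string format, the addresses, and the "first route received wins" tie-break at the RR are all my own assumptions standing in for the real Route Target and best-path machinery.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceTreeJoin:
    c_source: str
    c_group: str
    route_target: str  # identifies the upstream PE that should import it
    originator: str    # the receiver PE that generated the route


def build_join(local_pe, c_source, c_group, upstream_pe_for):
    """Receiver PE: look up the PE toward C-S and target the join at it."""
    upstream = upstream_pe_for[c_source]
    return SourceTreeJoin(c_source, c_group, f"rt-import:{upstream}", local_pe)


def reflect(routes):
    """RR: routes that differ only in originator collapse to one best path."""
    best = {}
    for route in routes:
        key = (route.c_source, route.c_group, route.route_target)
        best.setdefault(key, route)  # placeholder tie-break: first received
    return list(best.values())


def accepts(pe, route):
    """A PE imports the join only if the Route Target points at it."""
    return route.route_target == f"rt-import:{pe}"


unicast = {"10.1.1.1": "PE1"}  # hypothetical C-S reachable via PE1
joins = [build_join(pe, "10.1.1.1", "232.1.1.1", unicast) for pe in ("PE3", "PE2")]
reflected = reflect(joins)
print(len(reflected), accepts("PE1", reflected[0]), accepts("PE4", reflected[0]))
```

PE1 sees a single join no matter how many receiver PEs sent one, which is enough for it to send the PIM Join toward CE1.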
With an S-PMSI, only the receiver PEs that are interested in a particular (C-S, C-G) receive the C-Multicast traffic from the source. To achieve this, the source PE has to inform the receiver PEs that it is going to use a new P-Tunnel for the (C-S, C-G), and the interested receiver PEs need to join the new P-Tunnel. Once that happens, the source PE switches the C-Multicast data over to the new P-Tunnel.
The MI-PMSI (Default MDT) to S-PMSI (Data MDT) switchover can be configured with various criteria such as traffic thresholds (similar to Draft Rosen Data MDT creation). Let’s assume in our case the switchover from MI-PMSI to S-PMSI is configured to occur on the first packet received. PE1 sends a BGP Route Type 3 (S-PMSI) advertisement to all the PEs, announcing itself as the root of a new P-Tunnel that will transport (C-S, C-G). After receiving this update, PEs with interested receivers join the new P-Tunnel (P2MP mLDP). PE1 then switches the C-MCAST traffic from MI-PMSI (Default MDT) to S-PMSI (Data MDT), which means that only PEs with interested receivers get the traffic.
In Fig.4, PE1 first advertises BGP route type 3 (S-PMSI) to all the relevant PEs. PE2 and PE3 then signal to join the selective P-Tree rooted at PE1, since they have interested receivers. PE4 does not join the selective P-Tree as it doesn’t have any interested receivers for (C-S, C-G). Once the tree is set up, PE1 switches the C-Multicast traffic from MI-PMSI to S-PMSI after a certain interval.
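The effect of the switchover can be sketched as follows. The rate threshold, function names and PE names are hypothetical; a threshold of zero stands in for "switch on the first packet".

```python
def choose_pmsi(observed_kbps: float, threshold_kbps: float = 0.0) -> str:
    """Pick the PMSI for a flow; a threshold of 0 switches on the first packet."""
    return "S-PMSI" if observed_kbps > threshold_kbps else "MI-PMSI"


def pes_receiving(pmsi: str, all_pes, root, interested):
    """Return the PEs that receive the flow on the chosen PMSI."""
    if pmsi == "MI-PMSI":
        return set(all_pes) - {root}      # everyone in the MVPN
    return set(interested) - {root}       # only PEs that joined the selective tree


all_pes = {"PE1", "PE2", "PE3", "PE4"}
interested = {"PE2", "PE3"}

before = pes_receiving(choose_pmsi(0.0), all_pes, "PE1", interested)
after = pes_receiving(choose_pmsi(1.0), all_pes, "PE1", interested)
print(sorted(before), sorted(after))
```

PE4 appears in the first set and drops out of the second, which is exactly the waste the S-PMSI removes.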
One thing worth mentioning here is that mLDP (P-Tunnel) signaling is initiated by the downstream PEs towards the upstream PE, which is analogous to how multicast signaling works. As soon as the interested receiver PEs learn about the Root PE, they initiate mLDP signaling towards it. In contrast, P2MP RSVP-TE signaling is initiated by the Root PE towards the leaf PEs. So if the provider wants to offer the S-PMSI service using a P2MP RSVP-TE tunnel, there is a problem. Take a look back at Fig.4 and assume that the P-Tunnel is “P2MP RSVP-TE”. After PE1 advertises a Type 3 (S-PMSI) route to every PE informing them that it is the Root PE, it still does not know which receiver PEs are interested, so it is unable to signal P2MP RSVP-TE towards them. This problem is solved with the BGP Type 4 Leaf A-D route. In Fig.6, PE1 advertises BGP route Type 3 with the “Leaf Info Required” bit set. When the receiver PEs get the route, the interested PEs (PE2 and PE3) respond with a BGP Route Type 4 Leaf A-D route. Once PE1 has discovered the leaves, it signals a P2MP RSVP-TE LSP to them.
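The leaf discovery handshake can be sketched like this. It is a toy model with hypothetical names and addresses; it only captures the rule that a Leaf A-D route is sent in response to an S-PMSI advertisement that has the flag set, and only by PEs with receivers for the flow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SPmsiAd:
    root: str
    flow: tuple               # (C-S, C-G)
    leaf_info_required: bool


def leaf_ad_responders(adv: SPmsiAd, receivers_by_pe: dict) -> list:
    """Return the PEs that answer the S-PMSI A-D route with a Type 4 Leaf A-D."""
    if not adv.leaf_info_required:
        # mLDP-style case: leaves join the tree themselves, no Type 4 needed.
        return []
    return sorted(
        pe for pe, flows in receivers_by_pe.items()
        if adv.flow in flows and pe != adv.root
    )


flow = ("10.1.1.1", "232.1.1.1")  # made-up (C-S, C-G)
state = {"PE2": {flow}, "PE3": {flow}, "PE4": set()}

rsvp_adv = SPmsiAd(root="PE1", flow=flow, leaf_info_required=True)
leaves = leaf_ad_responders(rsvp_adv, state)
print(leaves)  # the root can now signal a P2MP RSVP-TE LSP to these PEs
```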
2) PIM-SM Mode as PE-CE Protocol
The PIM-SM process is more complicated than PIM-SSM because the source and receivers initially meet at the RP and then switch over to the SPT. The placement of the C-RP matters: if the C-Receivers and the C-RP are at different customer sites, things get a little more complicated. There are two ways to handle PIM-SM:
a) PE has the knowledge about the active C-Sources: There are two ways a PE can know about C-Sources:
- One of the PE routers acts as a fully functional customer RP(C-RP) for that MVPN.
- An MSDP session is established between the PE router and the customer RP (C-RP) to convey information about multicast sources.
When a PE learns of a new multicast source within that MVPN using either of the above methods, it constructs a Source Active A-D route. This route is sent to all other PEs that have one or more sites of that MVPN connected to them. The downstream PEs with (*, C-G) receivers construct (C-S, C-G) Source Tree Join routes towards the upstream PE. From a provider perspective this approach is simple, but from a customer perspective both methods share a problem: the SP is interfering with the customer domain.
b) Signaling (*, C-G) join state via BGP: If a PE has downstream (*, C-G) receivers, it sends a Shared Tree Join route to the upstream PE en route to the C-RP. We will cover option b as it’s more complicated and probably more interesting. In Fig.7, the C-Source is located behind PE1 and the C-RP is located at a different site than the source. The C-Receivers are behind PE2 and PE3.
Auto-Discovery and MI-PMSI P-Tunnel:
The Auto-Discovery and MI-PMSI setup process is similar to what we saw for PIM-SSM. After this step every PE knows about the other relevant PEs and the MI-PMSI (Default MDT) tunnel is set up. At this point any C-Multicast control plane traffic such as RP advertisements (Auto-RP, BSR) flows over the MI-PMSI. This allows the PEs to learn about the RP.
When the PEs receive a (C-*, C-G) join, they generate a Shared Tree Join BGP route towards the PE where the RP is located. In Fig.8, when PE2 and PE3 receive a PIM join for (*, C-G), they perform a lookup for the RP address, which is behind PE4. PE2 and PE3 generate a BGP Shared Tree Join (Route Type 6) with the Route Target set to PE4. PE4 receives the route and forwards the PIM Join (*, C-G) to the RP. At this point the RP is aware that there are interested receivers reachable via PE4.
In Fig.9, let’s examine the steps involved once the source begins forwarding multicast traffic. (1) The source C-S starts sending traffic to the group C-G. (2) The first hop router CE1 informs the RP of the source’s existence by sending a (C-S, C-G) Register to the RP. (3) Once the RP receives the PIM Register message, it generates a PIM Join (C-S, C-G) towards CE1. (4) PE4 generates a Source Tree Join BGP route towards PE1 after receiving the PIM Join from the RP. (5) When PE1 receives the Source Tree Join route, it sends a PIM Join (C-S, C-G) to CE1.
(6) PE1 also generates a Source Active A-D route as a result of receiving the Source Tree Join C-multicast route for (C-S, C-G) from PE4, and this route is propagated to all the PEs of the MVPN. Type 5 BGP Source Active A-D routes are only applicable to PIM-SM. A PE only generates a Source Active A-D route when it (PE1) creates (C-S, C-G) state as a result of receiving a C-multicast route for (C-S, C-G) from some other PE (PE4). It also requires that C-G is an ASM group; when both conditions hold, the PE (PE1) that creates the state MUST originate a Source Active A-D route.
In Fig.10, PE2 and PE3 generate a Source Tree Join (C-S, C-G) targeted at PE1 in response to the Type 5 Source Active A-D route. PE2 and PE3 generate the Source Tree Joins because they see a “match” to the Source Active A-D route. A (C-S, C-G) Source Active A-D route matches when the PE has a (C-*, C-G) entry for the same C-G and has originated a Shared Tree Join C-multicast route for that C-G.
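The matching rule is simple enough to write as a predicate. The function and argument names are hypothetical and the addresses are made up; the logic only restates the two conditions above.

```python
def should_send_source_tree_join(sa_route, star_g_entries, shared_joins_originated):
    """Decide whether a PE reacts to a Source Active A-D route.

    sa_route                -- (C-S, C-G) from the Type 5 route
    star_g_entries          -- set of C-G values with local (C-*, C-G) state
    shared_joins_originated -- set of C-G values for which this PE sent a Type 6
    """
    _c_source, c_group = sa_route
    return c_group in star_g_entries and c_group in shared_joins_originated


sa = ("10.1.1.1", "239.1.1.1")  # made-up (C-S, C-G) for an ASM group

# PE2 has receivers for the group and originated a Shared Tree Join for it.
print(should_send_source_tree_join(sa, {"239.1.1.1"}, {"239.1.1.1"}))  # True
# A PE with no (C-*, C-G) state for this group ignores the route.
print(should_send_source_tree_join(sa, set(), set()))                  # False
```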
At this point PE2 and PE3 will start receiving traffic directly from PE1.
As you can see, the procedure for handling PIM-SM is definitely more complicated compared to PIM-SSM. Note that there may be a possibility to have (C-S,C-G) packets being sent at the same time on the PMSI by both the local PE connected to the Source and the PE connected to the site that contains C-RP. This would result in transient unnecessary traffic on the provider backbone. However, no duplicates will reach the customer hosts subscribed to C-G as the downstream PEs will drop the packet.
The NG MVPN framework builds on Draft Rosen and decouples the control and data planes. We saw how a provider can use BGP for C-Route exchange along with various options for building P-Tunnels. We also looked at the BGP route types used for C-Route exchange and explored some scenarios using PIM-SSM and PIM-SM as the PE-CE protocol. In future posts we will look at the concept of aggregation and compare the various P-Tunnel technologies.