I recently completed a design and lab scenario that uses Cisco DMVPN as a backup to a primary MPLS WAN (I’m still planning the implementation).
This design, combined with the requirements for more than two hubs and the use of BGP in a single DMVPN cloud, was difficult to find neatly packaged on the Internet. That’s because most published DMVPN implementations use DMVPN as the primary connection with a single-hub or dual-hub design, use EIGRP as the routing protocol, split the topology across multiple DMVPN clouds, or all of the above.
If you’re seeking a globally scalable primary or backup WAN connectivity solution, I hope this post serves as a good starting point to mold this design into your own environment.
Customer: Global enterprise with 20 locations across three regions (North America, Latin America, and Europe).
Network Requirements: Use global/regional MPLS backbone as primary WAN connectivity method, with DMVPN backup. DMVPN spokes should have a regional primary hub with secondary hubs also based on location. In this scenario, it will look like this:
| Region | Primary Hub Location | Backup Hub Location |
| ------ | -------------------- | ------------------- |
Exceptions: Some smaller sites with a handful of users don’t warrant an MPLS circuit, but these locations still need corporate network connectivity and redundancy.
If you’ve found your way here, you’re probably familiar with Ivan Pepelnjak’s blog, which is rife with delicious knowledge on DMVPN, BGP, MPLS, SDN, <insert more letters here>, etc. I strongly recommend his articles on DMVPN (and other topics) like this one on scaling BGP-based DMVPN networks, or this one on the differences between Phase 2 and Phase 3 DMVPN.
These, coupled with some Cisco configuration guides, other blog posts (namely this one by Dan Williams), and my trusty GNS3 and VIRL instances, led me to this particular design.
This design employs Phase 3 DMVPN. As you’ll see in one of the above articles, Phase 3 DMVPN differs from Phase 2 in that the RIB (routing table) isn’t exclusively used for spoke-to-spoke connectivity.
In Phase 3, NHRP redirect messages from the hub prompt a spoke to install shortcut entries in its NHRP cache, and those cache entries, not the RIB, determine how packets are actually forwarded to another spoke. In essence, Phase 3 allows spoke-to-spoke communication based on NHRP forwarding rather than the RIB, which means a much cleaner routing configuration.
We will also employ BGP as the routing protocol in the DMVPN cloud. Why? First, because BGP scales in the WAN far beyond any IGP. Second, even though EIGRP is commonly used for DMVPN, it would mean supporting a third routing protocol (BGP for MPLS, OSPF for LAN, EIGRP for DMVPN), and that’s just nonsense.
BGP also gives us expanded flexibility in how we can influence routing decisions in the WAN. In our DMVPN cloud, we will run eBGP between the hubs and spokes, and eBGP between the hubs. To avoid configuring a static neighbor statement on the hubs for every spoke, we will use BGP dynamic neighbors.
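On the hub side, dynamic neighbors mean we configure a listen range on the tunnel subnet instead of one neighbor statement per spoke. Here is a minimal sketch of that technique; the peer-group name and the spoke ASNs other than Denver’s 65100 are assumptions for illustration (IOS allows up to five alternate ASNs per listen range):

```
! Hub-side dynamic BGP neighbors -- sketch only; the peer-group name
! and most of the spoke ASNs here are assumed for illustration
router bgp 65101
 bgp listen range 10.254.1.0/24 peer-group SPOKES
 bgp listen limit 100
 neighbor SPOKES peer-group
 neighbor SPOKES remote-as 65100 alternate-as 65104 65105 65106
```

Any spoke sourcing a BGP session from 10.254.1.0/24 with one of the listed ASNs is accepted automatically, so adding a spoke requires no hub-side change.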
We will run all of this Phase 3 DMVPN and BGP magic inside of a single DMVPN cloud, with three hubs. Impossible? Naw.
Meat & Potatoes
Here is a visual representation of the lab:
The top sites represent regional headquarters, i.e. data centers, i.e. DMVPN hubs. The bottom sites represent spokes. There are corner cases where some sites connect via DMVPN only, but that doesn’t really matter for the overall design. In those cases, you’d just remove the MPLS routers and those routes would no longer be in the IGP.
You may have already noticed a slight problem. You might say, “Um, your core switch is going to see both MPLS and DMVPN routes as equal cost when you redistribute into the IGP.” Very true, and we don’t want that in this scenario. Remember, this is a failover design. It’s an easy problem to fix, though. In our case, OSPF is our IGP. When we redistribute from BGP into OSPF on the DMVPN router, we’ll simply set the seed metric higher than the seed metric on the MPLS router.
Before I dive into the configurations and failover testing, the info below will help you know what’s what in this lab topology:
Site ASN & IP Assignments
| Site | Site ID | MPLS ASN | DMVPN ASN | Site Supernet |
| ---- | ------- | -------- | --------- | ------------- |
Show & Tell
Let’s look at the sw-atlanta node first. This node is considered the “core” of the network at this site, so it only participates in OSPF and knows nothing of BGP. That makes it a good place to see which routes are actually being used. When we’re looking at routes, know that I’ve configured each site so that the core’s link to the MPLS router is 10.x.0.6 and its link to the DMVPN router is 10.x.0.2, where x is the site identifier noted in the table above.
```
sw-atlanta#show ip route ospf
[truncated]
O E1  10.2.0.0/16 [110/52] via 10.1.0.6, 00:04:31, FastEthernet0/1
O E1  10.3.0.0/16 [110/52] via 10.1.0.6, 00:04:31, FastEthernet0/1
O E1  10.4.0.0/16 [110/52] via 10.1.0.6, 00:04:31, FastEthernet0/1
O E1  10.5.0.0/16 [110/52] via 10.1.0.6, 00:04:31, FastEthernet0/1
O E1  10.6.0.0/16 [110/52] via 10.1.0.6, 00:04:31, FastEthernet0/1
O     10.247.1.0/30 [110/2] via 10.1.0.6, 00:04:43, FastEthernet0/1
O E1  10.247.1.4/30 [110/52] via 10.1.0.6, 00:04:01, FastEthernet0/1
O E1  10.247.1.8/30 [110/52] via 10.1.0.6, 00:04:31, FastEthernet0/1
O E1  10.247.1.12/30 [110/52] via 10.1.0.6, 00:04:01, FastEthernet0/1
O E1  10.247.1.16/30 [110/52] via 10.1.0.6, 00:04:01, FastEthernet0/1
O E1  10.247.1.20/30 [110/52] via 10.1.0.6, 00:04:31, FastEthernet0/1
```
So we see all remote site summary and point-to-point routes pointing towards the MPLS router. Here’s what happens if I take Bolivia’s MPLS circuit down:
```
sw-atlanta#show ip route ospf
[truncated]
O E1  10.2.0.0/16 [110/52] via 10.1.0.6, 00:10:05, FastEthernet0/1
O E1  10.3.0.0/16 [110/52] via 10.1.0.6, 00:10:05, FastEthernet0/1
O E1  10.4.0.0/16 [110/52] via 10.1.0.6, 00:10:05, FastEthernet0/1
O E1  10.5.0.0/16 [110/52] via 10.1.0.6, 00:10:05, FastEthernet0/1
O E1  10.6.0.0/16 [110/101] via 10.1.0.2, 00:00:04, FastEthernet0/0
O     10.247.1.0/30 [110/2] via 10.1.0.6, 00:10:17, FastEthernet0/1
O E1  10.247.1.4/30 [110/52] via 10.1.0.6, 00:09:35, FastEthernet0/1
O E1  10.247.1.8/30 [110/52] via 10.1.0.6, 00:10:05, FastEthernet0/1
O E1  10.247.1.12/30 [110/52] via 10.1.0.6, 00:09:35, FastEthernet0/1
O E1  10.247.1.16/30 [110/52] via 10.1.0.6, 00:09:35, FastEthernet0/1
O E1  10.247.1.20/30 [110/101] via 10.1.0.2, 00:00:04, FastEthernet0/0
```
Notice the 10.6.0.0/16 and 10.247.1.20/30 routes are now pointing towards the DMVPN router.
Let’s dig into the DMVPN configuration.
Here is the Atlanta DMVPN router tunnel interface configuration:
```
interface Tunnel100
 ip address 10.254.1.1 255.255.255.0
 no ip redirects
 ip mtu 1440
 ip nhrp authentication brospf
 ip nhrp map multicast dynamic
 ip nhrp map multicast 192.0.2.6
 ip nhrp map 10.254.1.2 192.0.2.6
 ip nhrp map multicast 192.0.2.10
 ip nhrp map 10.254.1.3 192.0.2.10
 ip nhrp network-id 1
 ip nhrp holdtime 600
 ip nhrp nhs 10.254.1.2
 ip nhrp nhs 10.254.1.3
 ip nhrp registration no-unique
 ip nhrp shortcut
 ip nhrp redirect
 ip tcp adjust-mss 1360
 tunnel source FastEthernet0/0
 tunnel mode gre multipoint
 tunnel key 1234
end
```
There are two primary differences between this configuration and a Phase 2 DMVPN hub configuration:
- ip nhrp shortcut & ip nhrp redirect commands
- NHRP mappings and NHS statements for the other hubs
#1 is what allows the Phase 3 magically fast spoke-to-spoke communication. If you want more information on how exactly that works, I suggest INE’s blog post on Phase 3 DMVPN.
#2 allows us to keep everything in a single DMVPN cloud, and provide hub redundancy in the event a hub MPLS connection goes out.
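For comparison, a spoke’s tunnel interface looks almost identical, except that only hubs send NHRP redirects, so a spoke needs ip nhrp shortcut but not ip nhrp redirect. Below is a sketch of what the Denver spoke might look like, modeled on the Atlanta hub config above; Denver’s tunnel address (10.254.1.4) is an assumption for illustration, while the hub tunnel/NBMA pairings (10.254.1.1–3 to 192.0.2.2/.6/.10) come from the lab’s hub mappings:

```
! Hypothetical Denver spoke tunnel -- spoke tunnel IP assumed for illustration
interface Tunnel100
 ip address 10.254.1.4 255.255.255.0
 no ip redirects
 ip mtu 1440
 ip nhrp authentication brospf
 ip nhrp map 10.254.1.1 192.0.2.2
 ip nhrp map multicast 192.0.2.2
 ip nhrp map 10.254.1.2 192.0.2.6
 ip nhrp map multicast 192.0.2.6
 ip nhrp map 10.254.1.3 192.0.2.10
 ip nhrp map multicast 192.0.2.10
 ip nhrp network-id 1
 ip nhrp holdtime 600
 ip nhrp nhs 10.254.1.1
 ip nhrp nhs 10.254.1.2
 ip nhrp nhs 10.254.1.3
 ip nhrp registration no-unique
 ip nhrp shortcut
 ip tcp adjust-mss 1360
 tunnel source FastEthernet0/0
 tunnel mode gre multipoint
 tunnel key 1234
end
```

Registering with all three hubs as NHS is what lets any hub take over when another hub’s MPLS connection fails.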
The great thing about using BGP in this scenario is that we can specify which hub we want to use, per spoke, per region. Here’s how we do it:
```
denver-dmvpn#show run | s router bgp
router bgp 65100
 bgp log-neighbor-changes
 aggregate-address 10.4.0.0 255.255.0.0 summary-only
 redistribute ospf 1 metric 20 match internal external 1 external 2 route-map advertise-local-only
 neighbor 10.254.1.1 remote-as 65101
 neighbor 10.254.1.1 weight 100
 neighbor 10.254.1.1 route-map block-local in
 neighbor 10.254.1.2 remote-as 65102
 neighbor 10.254.1.2 weight 50
 neighbor 10.254.1.2 route-map block-local in
 neighbor 10.254.1.3 remote-as 65103
 neighbor 10.254.1.3 route-map block-local in
```
Here we have an eBGP neighborship with each hub, and we simply set the weight for each neighbor in a cascading preference (weight is Cisco-local, evaluated before any other attribute in best-path selection, and higher wins). In Denver, we want the primary hub to be Atlanta. But if Atlanta goes down, it should fail over to Berlin. And if Berlin is also down (and the apocalypse hasn’t begun), it should fail over to Panama.
Let’s test out the Phase 3 bread and butter. What I expect to see in a spoke-to-spoke communication test is the first packet to traverse the hub, then the NHRP shortcut will take place and every subsequent packet will go directly to the spoke. Observe:
Newcastle to Bolivia first flow:
```
newcastle-sw#traceroute 10.6.1.1 source Loopback1
  1 10.5.0.2 32 msec 56 msec 52 msec
  2 10.254.1.1 100 msec 72 msec 68 msec
  3 10.254.1.6 76 msec
    10.6.0.1 80 msec 68 msec
```
As you can see, we transit the hub on the first run. This is expected.
Newcastle to Bolivia second flow:
```
newcastle-sw#traceroute 10.6.1.1 source Loopback1
  1 10.5.0.2 28 msec 56 msec 56 msec
  2 10.254.1.6 52 msec 52 msec 52 msec
  3 10.6.0.1 72 msec 72 msec 80 msec
```
This time, we go straight to the other spoke. Pretty awesome! A show dmvpn on the Newcastle DMVPN router shows us a dynamically established tunnel:
```
newcastle-dmvpn#show dmvpn
[truncated]
Interface: Tunnel100, IPv4 NHRP Details
Type:Spoke, NHRP Peers:4,

 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 192.0.2.2       10.254.1.1         UP 00:33:08     S
     1 192.0.2.6       10.254.1.2         UP 00:33:08     S
     1 192.0.2.10      10.254.1.3         UP 00:33:08     S
     1 192.0.2.22      10.254.1.6         UP 00:03:34     D
```
A big key to the success of this design is loop prevention. Without loop prevention, for example, Newcastle would advertise its prefixes out both MPLS and DMVPN, get redistributed into Berlin’s OSPF network, and then get redistributed back out to BGP. These routes would then make it back into Newcastle’s BGP table, with a next hop of Berlin. Bad juju. To fix this, we simply use route maps to filter which prefixes are allowed in and out:
What’s great is that we can use a single ACL to define the site’s networks, and then call that ACL in both the inbound and outbound redistribution route-maps. Here is an example of this on the Atlanta DMVPN router:
```
ip access-list standard local-routes
 permit 10.1.0.0 0.0.255.255
 permit 10.247.1.0 0.0.0.3
!
route-map advertise-local-only permit 10
 match ip address local-routes
!
route-map block-local deny 10
 match ip address local-routes
route-map block-local permit 20
!
redistribute ospf 1 metric 20 match internal external 1 external 2 route-map advertise-local-only
!
redistribute bgp 65101 metric 100 metric-type 1 subnets route-map block-local
```
The configuration on the MPLS router is almost identical:
```
ip access-list standard local-routes
 permit 10.1.0.0 0.0.255.255
 permit 10.247.1.0 0.0.0.3
!
route-map advertise-local-only permit 10
 match ip address local-routes
!
route-map block-local deny 10
 match ip address local-routes
route-map block-local permit 20
!
redistribute ospf 1 metric 10 match internal external 1 external 2 route-map advertise-local-only
!
redistribute bgp 65001 metric 50 metric-type 1 subnets route-map block-local
```
Notice that we drop the metric on the BGP route redistribution into OSPF from the MPLS router, to ensure the IGP sees the MPLS routes as preferred.
There are some caveats to this design for our real topology. For example, we have sites that are a) MPLS only, b) DMVPN only, or c) MPLS and DMVPN, but converged on one router.
The first two aren’t a huge deal, until you talk about an MPLS only site in North America with an IPsec VPN tunnel backup (private cloud) to Atlanta. What happens when the MPLS circuit for that site goes down, and you want to access it from a site in Europe? Makes one think, and is probably worth its own blog post.
The corner case of DMVPN and MPLS on one router is interesting. Remember, we’re running BGP in both MPLS and DMVPN, and IOS doesn’t allow more than one BGP process (and therefore more than one local ASN) on a router at once. In short, you can use the local-as neighbor statement to specify an alternate local ASN for a specific neighbor. It’s intended as a migration tool, but it works here. Interestingly, BGP will modify the AS path to add the alternate local ASN to the path as the last ASN transited. For redistribution into OSPF and path preference, we match the route tag (the alternate ASN) and set the metric to what we want. Another subject worthy of its own post.
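A sketch of what that might look like on a hypothetical converged router; the ASNs and neighbor addresses here are illustrative, and the metric values follow the 100-for-DMVPN, 50-for-MPLS convention used earlier:

```
! Hypothetical converged MPLS+DMVPN router -- ASNs and addresses illustrative
router bgp 65001                       ! MPLS ASN is the real local AS
 neighbor 172.16.0.1 remote-as 64512   ! MPLS PE neighbor
 neighbor 10.254.1.1 remote-as 65101   ! DMVPN hub (Atlanta)
 neighbor 10.254.1.1 local-as 65100    ! present the DMVPN ASN to the hub
!
! Routes redistributed from BGP into OSPF are auto-tagged with the first
! ASN in the AS path; with local-as that is 65100 for DMVPN-learned routes,
! so we can match the tag and give those routes a worse metric
route-map bgp-to-ospf permit 10
 match tag 65100
 set metric 100
route-map bgp-to-ospf permit 20
 set metric 50
!
router ospf 1
 redistribute bgp 65001 metric-type 1 subnets route-map bgp-to-ospf
```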
Cisco’s DMVPN is a fascinating WAN technology that provides great flexibility in connecting your remote offices. This design could easily be adapted to a DMVPN-only design, i.e. not using MPLS at all as a primary connection. Or, you could use MPLS and run DMVPN over it as an overlay network. This is essentially the “secret sauce” of Cisco’s IWAN.
Thanks for reading. If you made it all the way through, feel free to leave comments for any questions you might have, or you can hit me up on the Twitters at @serialmeh.