Below is a very generic WAN diagram. It consists of L3 MPLS links in blue and point-to-point links in red. Not all these links actually exist (unless you’re made of money), but bear with me.
My cohorts and I recently installed a pair of Riverbed Steelheads between Site A and our remote data center, as you can see in the above drawing. The performance improvement was so large my boss actually had users telling him “things seem faster.” I’ve never had that happen before, have you? Anyway, with a response like this you can imagine how eager my boss was to deploy WAN acceleration to all our sites.
Standard practice dictates that each site have another Steelhead installed locally. In the above drawing, that would require buying and installing two more devices – one each for sites B and C. There is but one problem.
Riverbed’s Steelhead 7050 is the single most expensive item I’ve ever successfully held in my hands. (FYI, do not try to lift a Nexus 7010 by yourself.)
Here’s where Distance X and Distance Y in the above drawing come into play. Site C is a long way from Site A and the data center. Hairpinning all traffic from Site C to the data center through Site A wouldn’t make much sense, and bosses understand that (hopefully). However, for all bosses, there exist some distances X and Y such that you will be told:
“Hey, network guy/gal, I’m not buying another Steelhead for Site B. You’ll have to use the one in Site A.”
This is the opportunity you’ve been looking for, right? To prove you can do more with less? Right?
You can fill in your own details, but in our case Site B was just a few miles from Site A and the two were connected via fiber in the street. This didn’t sound like much of a challenge until I started mocking up some pictures on a napkin. The napkin looked like the drawing below.
Since a Steelhead (or a WAAS/SilverPeak/Certeon) only has a limited number of interfaces, if you want to deploy the appliance inline you must do so at some point of aggregation. For us it was between the core and edge switches. Given that all our edge connectivity was kept separate from the core switches, traffic between Site B (labeled Remote Site above) and the data center would simply bounce through Site A (labeled HQ above) and never touch the Steelhead. It was around this time I started whispering under my breath “Curse you, Distance X!”
WCCP redirection was off the table for other reasons, and we didn’t want to disrupt optimization for existing clients. We needed a way to get traffic to bounce through the Steelhead to the core switches, then back through in the “correct” (LAN-to-WAN) direction. And vice versa for the return traffic.
…
…
What do you do in your test lab when you need to simulate the presence of several switches, but you only have one? Multiple VRFs!
We ended up turning the L3 links between the core and edge switches into 802.1Q trunks that carried two VLANs. On the edge switches, one VLAN replaced the existing link, and the other was placed in a new VRF (we called it PASSTHRU). The core switches had no knowledge of a new VRF and happily forwarded packets from subinterface A to subinterface B. Packets were effectively bouncing through the Steelhead to the core, then back through in the correct direction! Logically, the new configuration looks like the drawing below. Notice the different VRFs on the edge switches.
Clients in Remote Site B can now make use of the existing Steelhead in Site A. Penny-pinching worked – this time.
In Riverbed’s case, you will see a lot of connections being passed through with reason code FROM_WAN. These are the SYN packets going from the WAN side to the LAN side, before being bounced back to the other VRF. I assume other vendors would give you similar notices.
This probably isn’t screaming “elegant solution” to you, so I think it’s useful to list the pros and cons of such a solution.
Pros
- Saved a lot of money
- Caused no outage (as long as you’re careful about renumbering the two core switches’ interfaces)
- Quick to implement (1 night vs. waiting for a new Steelhead to arrive)
- Does not require WCCP redirection (remember, WCCP is just like HSRP and EIGRP – proprietary)
- Supports enabling WAN optimization for other edge services with a quick outage (e.g. remote access)
Cons
- Additional load on the Steelhead (can your box handle it?)
- Additional load on the edge switch – multiple OSPF databases in our case
- “Unnecessary” load on the core switches – how much traffic is Site B sending to the data center?
- You won’t find this in any Riverbed Deployment Guide
I’m very interested in hearing other people’s thoughts on this implementation. If you have any more pros/cons to add to the list, please share.
Note: These drawings were all done using Dia, so you’ll have to forgive the odd Steelhead stencil. If you know of a way to import .vss files into Dia, please let me know.
Old Core Switch Config (Cisco Nexus 7010)
interface port-channel2
  description to Edge Switch (via Steelhead)
  no shutdown
  ip address 1.2.3.3/29
  ip ospf message-digest-key 1 md5 3 ABC123
  ip ospf network point-to-point
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
Old Edge Switch Config (Cisco 3750G)
interface Port-channel1
 description to Core Switch (via Steelhead)
 no switchport
 ip address 1.2.3.4 255.255.255.248
 ip ospf message-digest-key 1 md5 ABC123
 ip ospf network point-to-point
 ip pim sparse-mode
interface Gi1/0/1
 description to MetroE Router
 ip address 1.1.1.1 255.255.255.254
router ospf 1
 no passive-interface Port-channel1
New Core Switch Config (Cisco Nexus 7010)
interface port-channel2
  description to Edge Switch (via Steelhead)
interface port-channel2.3002
  description Edge Switch's default VRF
  encapsulation dot1q 3002
  no shutdown
  ip address 1.2.3.3/29
  ! be sure to change the VLAN of the Steelhead's in-path address
  ip ospf message-digest-key 1 md5 3 ABC123
  ip ospf network point-to-point
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
interface port-channel2.3003
  description Edge Switch's PASSTHRU vrf
  encapsulation dot1q 3003
  no shutdown
  ip address 10.10.10.10/31
  ip ospf message-digest-key 1 md5 3 ABC123
  ip ospf network point-to-point
  ip router ospf 1 area 0.0.0.0
  ip pim sparse-mode
New Edge Switch Config (Cisco 3750G)
vlan 3002
 name default_to_core
vlan 3003
 name PASSTHRU_to_core
!
ip vrf PASSTHRU
!
interface Port-channel1
 description to Core Switch (via Steelhead)
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport trunk allowed vlan 3002,3003
 switchport nonegotiate
 spanning-tree portfast trunk
interface Vlan3002
 description Optimized connection
 ip address 1.2.3.4 255.255.255.248
 ip ospf message-digest-key 1 md5 ABC123
 ip ospf network point-to-point
 ip pim sparse-mode
interface Vlan3003
 description PASSTHRU connection
 ip vrf forwarding PASSTHRU
 ip address 10.10.10.11 255.255.255.254
 ip ospf message-digest-key 1 md5 ABC123
 ip ospf network point-to-point
 ip pim sparse-mode
interface Gi1/0/1
 ! be sure to move this interface to the PASSTHRU vrf
 ip vrf forwarding PASSTHRU
 ip address 1.1.1.1 255.255.255.254
router ospf 1
 no passive-interface Vlan3002
router ospf 2 vrf PASSTHRU
 router-id A.B.C.D
 ! you must manually specify a new OSPF router ID for this process to start
 no passive-interface Vlan3003
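Checking the New Edge Switch Config (Cisco 3750G)
If you want to sanity-check the change afterwards, something like the following on the edge switch should be enough. Treat it as a sketch rather than a record of what we actually ran: these are standard IOS exec commands, and the VRF name, VLAN IDs, and addresses are simply the ones from the configs above. I've left the output out because it will look different on every network.

```
! both VLANs (3002 and 3003) should be allowed and active on the trunk to the core
show interfaces trunk
! Vlan3003 and Gi1/0/1 should be listed under the PASSTHRU vrf
show ip vrf
! expect the core switch as a neighbor in both OSPF processes (1 and 2)
show ip ospf neighbor
! the PASSTHRU table should learn data center routes via Vlan3003
show ip route vrf PASSTHRU
! reach the core's 3003 subinterface from inside the PASSTHRU vrf
ping vrf PASSTHRU 10.10.10.10
```

If the second OSPF adjacency never comes up, the first things I would look at are the router ID on process 2 and the allowed VLAN list on the trunk.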


