
Cisco 6500 Sort Of Gets Multichassis LACP Without VSS in 12.2SXJ Train

Ethan Banks

Markku Leiniö linked in one of his articles to a new mLACP feature for the Cisco Catalyst 6500, which naturally caught my eye. To create an etherchannel spanning multiple physical 6500s (desirable to maximize link utilization and for redundancy), you previously needed to use the Virtual Switching System supervisor engine to create a VSS super-chassis. In testing, I didn’t love VSS. It was fussy to set up, with a finicky chassis loading order; dual supervisors in a chassis were not supported; and the code was unstable in the revision Cisco recommended to us at the time. Now, that was in 2009. I have heard good things since then from people who’ve used VSS in their production environments. However, there’s a risk endemic to any technology where responsibility for a given task floats between two or more devices: split-brain.

Split-brain is a situation where the devices sharing the responsibility in question lose touch with one another. When a peer can’t see his mate, he assumes the worst and asserts himself as the responsible party. Thus, you’ve got split-brain: two devices both believing they should be performing a given task. In a split-brain VSS super-chassis, Ivan Pepelnjak points out that at the very least you’re going to have to cope with one member of the super-chassis reloading once VSS detects the split-brain. You want to explain that to your boss? And probably his boss? Neither do I, which is why I never recommended VSS in the “five nines is not enough” environment I used to work in. Sure, in a perfect world this is a non-issue, because redundancy and failover will work as advertised, and your upstream devices will never know the difference. When you find that perfect world, please let me know. Failover situations are never as simple as “this device lost power.” The situation that leads to a failover always seems to be ugly, and someone always gets hurt.

But I digress. The interesting thing to me is that Cisco is no longer saying “buy VSS” to allow an etherchannel to span multiple 6500s. Well, kinda. I mean, that *is* what Cisco’s saying in a certain sense, but there are some really big catches. The biggest catch in my mind is illustrated in Cisco’s diagram below – do you see my complaint? It’s in that word “standby”.

  • First, mLACP is released with the 12.2SXJ code train. SXJ is a new train, so as always, consider carefully where you run this code. SXJ is far from proven in my mind, although the SafeHarbor program has given it a “recommended” rating, if you find that reassuring.
  • Second, although mLACP allows you to split port-channel members across two switches, only one of the links will be forwarding. The other will sit in an LACP standby state.
  • Third, there are a number of hardware restrictions. Only the Sup720 or Sup720-10G is supported, and the PFC must be newer than the PFC3A, as the PFC3A does not support mLACP. VSS supervisors do not support mLACP (which makes sense), and 6500 chassis containing the WiSM do not support it either.
  • Fourth, only a single uplink from the server to each switch is supported, so you can’t take a quad NIC and split it into two uplinks to each 6500 with mLACP. Effectively, then, all you can do is dual-home a server.
  • Fifth, this does not appear to be an interswitch technology, at least not by intention. This is purely for access-layer host uplinking. If you wanted to dual-home a switch, you could accomplish the same thing with rapid spanning-tree, presuming a well-designed STP domain where you understand your root bridge placement.
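
For the curious, the configuration roughly follows the interchassis redundancy group model Cisco uses for mLACP elsewhere in IOS. Treat the sketch below as exactly that – a sketch. I have not labbed this on 12.2SXJ, and the group number, peer address, MAC, and interface names are my own placeholders, not tested 6500 syntax:

```
! Hedged sketch, adapted from Cisco's mLACP documentation. Group number,
! peer IP, system MAC, and interfaces are placeholders; 12.2SXJ syntax
! may differ. Mirror this on the peer 6500 with a unique node-id.
redundancy
 interchassis group 10
  member ip 10.1.1.2          ! the peer 6500 in the redundancy group
  mlacp node-id 1             ! unique per chassis within the group
  mlacp system-mac 0000.0000.aaaa
  mlacp system-priority 100   ! both peers present one LACP system to the server
!
interface Port-channel1
 switchport
 switchport mode access
 mlacp interchassis group 10  ! ties the port-channel to the mLACP group
!
interface GigabitEthernet1/1
 switchport
 channel-group 1 mode active  ! LACP toward the server NIC
```

The idea is that both chassis advertise the same LACP system ID to the server, so the server's bonding driver believes it is talking to a single switch – which is also why only one side forwards while the other sits in standby.
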

There’s more to mLACP, but those are the key points that tell you what Cisco is really giving you here: a dual-homed server with an active and a standby LACP link. mLACP is therefore a niche offering that gives a company that is committed to the 6500 at the access layer physical-layer redundancy using LACP. I can see where it fits, but it’s less than I was hoping for. In my mind, the greatest design shortcoming is the single attachment permitted per server. I should mention that Cisco says you could provision more than one link per switch, in that the CLI won’t stop you, but only a single link per switch is tested and supported. While redundancy is important, throughput matters as well, and these days I am seeing far more servers that can fill a 1Gbps pipe than not. If I’m stuck with active/standby, then I wish mLACP were at minimum a 2×2 solution instead of 1×2. 4×2 would be even better, as a lot of the guys I support are racking systems with dual quad NICs.

What do you think? Does mLACP solve any issues for you? Or are you going to pass on this one?

More Reading

mLACP for Server Access

About Ethan Banks: Founder & CEO of Packet Pushers, a podcast network for IT people. (NH)NUG organizer. Recovering CCIE #20655. Co-host of the Heavy Networking, Tech Bytes & N Is For Networking pods. Your network is not a democracy. Rig every election.