The Sad State of Data Center Networking

Something about next-generation Data Center networking has been bothering me lately. For a while now, there has been this nagging sensation somewhere in the back of my mind telling me that it's just not adding up. While I was at Network Field Day 3, I was able to connect some of the dots and form a picture of what it is that's been scratching away in my mind.

1. The Inconsistent Network

[Figure: The Inconsistent Network]

So, the first thing is that Data Center Networking is fragmented, even with next-generation technologies. We have two very broad movements evolving with respect to DC networking: the advent of the DC "overlay" (vCNI, Nicira's STT, VXLAN, NV-GRE) and the advent of the "fabric" (FabricPath, Q-Fabric, VCS, ProgrammableFlow, etc.). Sadly, these two things are not integrated at all. Virtual machines talk with each other through the overlays, but to get out to the network they transit an 802.1q trunk into the fabric and ultimately over to their default gateway. Worse yet, "fabric" vendors are developing features in the fabric that integrate with VMware APIs so they can track or otherwise do nifty things with VMs in the fabric. In other words, the state of affairs is such that vendors are accepting the "Inconsistent Network" as a fact of life and they are developing features around it.
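To make the split concrete, here is a minimal sketch (my own simplification in Python, not any vendor's code) of what a VTEP does when it wraps a tenant frame in VXLAN. Everything inside the outer UDP/IP header is opaque to the physical fabric, which is exactly why the overlay and the fabric end up behaving like two separate networks.

```python
import struct

VXLAN_PORT = 4789  # IANA-assigned UDP port for VXLAN

def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prepend an 8-byte VXLAN header (RFC 7348) to an inner Ethernet frame.

    Header layout: flags (0x08 = "VNI present"), 3 reserved bytes,
    24-bit VNI, 1 reserved byte.  The result rides as the payload of an
    ordinary UDP/IP packet between VTEPs, so the physical fabric only
    ever sees outer IP/UDP -- it has no visibility into the VNI or the
    tenant frame inside.
    """
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    header = struct.pack("!BBBB", 0x08, 0, 0, 0) + struct.pack("!I", vni << 8)
    return header + inner_frame

# Example: a made-up inner Ethernet frame for tenant segment 5001
packet = vxlan_encap(5001, b"\xde\xad\xbe\xef" * 16)
print(len(packet), "bytes of UDP payload destined to port", VXLAN_PORT)
```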

This state of affairs is nonsense. What we need is a consistent end-to-end approach to DC networking. Why buy a snazzy high-powered fabric that interfaces with a different network (the vSwitch and its overlays) via VLANs and talks to the hypervisor via APIs? Why not have the routers participate in the overlays and ditch the VMware API integration with the DC fabric? It just so happens that Cisco told us during Network Field Day 3 that soon you will be able to terminate VXLAN on an ASR on something like an SVI/IRB interface. This gives us something like the next diagram:

[Figure: More Consistent Network (but not totally consistent)]

In this case, we have greater overall consistency in the Data Center. Overlays are mapped to VRFs and are contiguous from the WAN edge to the hypervisor. The underlying fabric has no reason (as far as I can tell) to integrate with the hypervisor through any APIs. vMotion within the Data Center has no VLAN dependencies. It sure would be great if certain network vendors supported termination of, at the very least, vCNI (and, as a bonus, VXLAN) into SVI/IRB logical interfaces. Juniper's Data Center architecture has a Q-Fabric setup with MXs sitting at the top. This seems like an obvious fit for that design. *cough*
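For illustration only, here is roughly what the mapping state on that overlay-terminating router might look like, reduced to a toy Python table. The VRF names, segment IDs, and route distinguishers are invented; the point is simply that the tenant's overlay segment ID, not an 802.1q VLAN, is what selects the Layer-3 context end to end.

```python
# Hypothetical control-plane state for an overlay-terminating gateway (the
# router/VRF box at the WAN edge of the diagram).  All names and IDs below
# are made up for illustration.

OVERLAY_TO_VRF = {
    # overlay segment ID : (VRF name, route distinguisher)
    5001: ("TENANT-A", "65000:5001"),
    5002: ("TENANT-B", "65000:5002"),
}

def l3_context_for(segment_id: int) -> str:
    """Pick the VRF a decapsulated tenant packet is routed in."""
    vrf, _rd = OVERLAY_TO_VRF[segment_id]
    return vrf

print(l3_context_for(5001))   # -> TENANT-A
```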

Note: Oh, hey, you SDN companies… know what would be really cool? An SDN-controlled switch that could terminate vCNI, VXLAN, NV-GRE, and STT and translate that to Q-in-Q VLANs for uplink into a router or MPLS PE router. 65k virtual networks is quite a lot, actually. We're not all Yahoo. We don't all have 10 million VMs and 200k tenants. Might I also suggest possibly mapping DC virtual networks to MPLS VPNs directly on these edge switches for those large enterprises and service providers that use MPLS? Don't let this suggestion get in the way of the Q-in-Q thing, though; one step at a time…
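Here is a sketch of the translation such a hypothetical edge switch could perform, assuming a simple static scheme that splits a 24-bit overlay segment ID into an 802.1ad outer/inner tag pair. A real controller would presumably allocate and track these mappings itself rather than derive them arithmetically.

```python
# Hypothetical VNI <-> Q-in-Q translation for an overlay-terminating edge switch.
# Assumption: a static split of the 24-bit overlay segment ID into two 12-bit
# VLAN IDs (outer 802.1ad S-tag, inner 802.1Q C-tag).  A real controller would
# manage these mappings itself and avoid the reserved VLAN IDs 0 and 4095.

def vni_to_qinq(vni: int) -> tuple[int, int]:
    """Map a 24-bit overlay segment ID to an (outer S-VLAN, inner C-VLAN) pair."""
    if not 0 <= vni < 2**24:
        raise ValueError("segment ID must fit in 24 bits")
    s_vlan = (vni >> 12) & 0xFFF   # high 12 bits -> outer tag on the fabric uplink
    c_vlan = vni & 0xFFF           # low 12 bits  -> inner tag
    return s_vlan, c_vlan

def qinq_to_vni(s_vlan: int, c_vlan: int) -> int:
    """Reverse mapping on the way back into the overlay."""
    return (s_vlan << 12) | c_vlan

# Example: overlay segment 5001 rides the uplink as S-VLAN 1 / C-VLAN 905
print(vni_to_qinq(5001))                  # (1, 905)
print(qinq_to_vni(*vni_to_qinq(5001)))    # 5001
```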

I see this, perhaps, as one of the least painful ways to move towards some kind of coherent network *from where we are now.* I am a believer in the overlay. I think this is the right way to go if we wish to have network nodes in the hosts (i.e., the vSwitches).

Note: Another option is to connect the virtual machines directly to the fabric via something like EVB and push the virtualization down to the edge of the fabric. Another intriguing possibility is the use of SR-IOV NICs in the server that can be controlled by the external switch or perhaps by an SDN controller. Regardless, if we are not going to do overlays in the host, then eliminate the vSwitch, as it is useless in that case.

2. “Single System Image” Network Systems and the “Price is Right” Fail Horn

Most people don’t know or realize this, but the new thing in DC networking is “Single System Image.” (Google that.) Well, SSI-like functionality anyway. It seems every vendor is shooting for the DC core to be as self-contained and maintenance-free as possible. This isn’t a bad thing, but the message being delivered by some vendors is muddled.

Suppose you are a vendor selling an SSI-like network system. I can think of several: Juniper’s Q-Fabric, NEC’s ProgrammableFlow, and Brocade’s VCS, just to name a few. There are two things you should be prepared to explain fully and clearly at the drop of a hat (in addition to the magic of your solution):

A. If you are selling the collapse of the traditional 3-Tier architecture, be prepared to explain how your SSI system will be physically structured. If you do not understand why this is important, then figure it out quickly. Hire someone to help you understand if you need to. Large Data Centers need to follow some kind of maintainable plan. Cabling your Data Center all willy-nilly is a ridiculous suggestion. Structure will be required. As it turns out, this is not a new concept. There are topologies like Clos and hypercube out there that people might consider using (there is a quick sizing sketch after point B below). Simpler structures are also possible, like, well, a 2-tier or 3-tier architecture. Wrap your head around that for a minute. As a network vendor “liberating us from the Tyranny of the 3-Tier architecture,” you are in a position where you must understand these things and be able to explain them in relation to your product.

B. Be prepared to explain how your system interfaces to the rest of the world. Does it support traditional protocols for interaction with other networks? If not, then how have others integrated your solution into their networks? Be prepared to explain both Layer-2 and Layer-3 in this regard. We’re not asking if you use traditional protocols *within* your SSI network system; we are asking about integrating with *external* networks.
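To illustrate point A, here is a back-of-the-envelope sizing sketch for a plain two-tier leaf-and-spine (folded Clos) fabric. The port counts and oversubscription ratio are my own assumed numbers, but even this toy model shows that “no more tiers” still means a precise, repeatable cabling plan.

```python
# Back-of-the-envelope sizing for a two-tier leaf/spine (folded Clos) fabric.
# Illustrative only; port counts and oversubscription ratio are assumptions.

def leaf_spine(leaf_ports: int, spine_ports: int, oversub: float = 3.0):
    """Return (hosts, leaves, spines, fabric_links) for a leaf/spine design.

    Each leaf splits its ports into downlinks (to hosts) and uplinks (to
    spines) according to the oversubscription ratio; every leaf connects
    to every spine.
    """
    uplinks_per_leaf = round(leaf_ports / (1 + oversub))   # e.g. 48 ports, 3:1 -> 12 uplinks
    downlinks_per_leaf = leaf_ports - uplinks_per_leaf
    spines = uplinks_per_leaf            # one uplink from each leaf to each spine
    leaves = spine_ports                 # each spine port terminates one leaf
    hosts = leaves * downlinks_per_leaf
    fabric_links = leaves * uplinks_per_leaf
    return hosts, leaves, spines, fabric_links

# Example: 48-port leaves, 64-port spines, 3:1 oversubscription
hosts, leaves, spines, links = leaf_spine(48, 64, oversub=3.0)
print(f"{hosts} hosts across {leaves} leaves, {spines} spines, {links} fabric cables")
# -> 2304 hosts across 64 leaves, 12 spines, 768 fabric cables
```

Hundreds of fabric cables, each with exactly one correct endpoint, is the opposite of willy-nilly; a hypercube or a 3-stage Clos only makes the bookkeeping stricter.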

The inability to explain either A or B will result in the “Price is Right” fail horn being played loudly for all to hear.

Note: Don’t be surprised if people ask how link-local protocols such as LLDP and the various OAMs operate, or how ARP works, with respect to your system. It’s bound to get asked.

3. Everyone Else Doesn’t Know Networking Better Than the Network Guys

Considering #1 and #2 above, it should be apparent that, in spite of claims to the contrary, many of the new “next-gen network” companies really do not know networking any better than the traditional networking companies. In fact, many of the challenges that the overlay folks are facing are identical to the challenges the MPLS community has faced (and overcome repeatedly). The following things are not new:

  • Overlays ← MPLS- and GRE-interconnected VRFs, VPLS instances, and pseudowires. PVCs.
  • Port profiles ← 802.1x and MAC-based port configuration.
  • Centralized control mechanisms ← ATM, Frame Relay, virtual-chassis (some, anyway), and a multitude of tools exhibiting similar functionality, such as OER. (I’ll concede, none do this at the data-plane level across multiple tuples, though…)

Frankly, what we’ve seen out of the recent DC “innovations,” for those of us who have been alive and breathing in this field for the last 17 or more years, are partial thoughts and incomplete ideas. Some of us are excited by some of the ideas (no, really, I am quite excited about SDN/OpenFlow and the whole idea of overlays), but we wonder how you will overcome known challenges with these approaches and how you will participate in internetworking. However, getting through to someone who is willing to talk about this, and not just rant endlessly about the revolution and the end of the “mainframe era” of networking, is extremely difficult.

A recent blog post from an SDN advocate proclaimed that network guys just don’t get “software.” No. Many of us do get it. Networking has never been about just hardware or speeds and feeds. There is a rich software feature set that controls the network, and many of those features do not depend on hardware at all for their functioning. So a lot of us get software as it pertains to networking. The reverse problem, on the other hand (others not understanding networking), is quite apparent:

  • It’s confusing and disconcerting when you proclaim that you are ending the tyranny of the 3-tier architecture but offer no explanation as to how the network will be structured, or, worse, claim that it can be cabled willy-nilly. I have news for you: I can cable a set of Ethernet switches or IP routers any which way from Sunday and they too will still pass packets.
  • When you claim that networking needs a VMware and the best you produce is a closed Layer-2 overlay system that still interfaces with the rest of the world over 802.1q trunks through a vSwitch, you have failed at networking. Networking is about interconnecting, not being an island unto yourself.
  • When you build what is essentially an SSI-like network system and it has no commonly known features at all for interfacing with the rest of the world, and you can’t explain how it will connect to the rest of the world, you have failed at networking.  “VLANs” as an answer, by the way, is fail-horn worthy.
  • Whenever you claim that you are doing something new that traditional networking has never done, without knowing that it has, in fact, been done before, you look foolish.

At some point all of the hype about “never think about the network again” and “the network just disappears” will fade, and reality will set back in. Networking might be easier in some ways, but there are many new and remaining challenges. We will still have to carefully plan our networks, and we need to understand how we will do so with your product if you intend for us to use it.

Conclusion

I think there is a reluctance among many companies to dive head-first into any new Data Center solution, in part because right now it’s a messy pile of discombobulated garbage. We need something coherent, not a pile of poorly interconnected acronyms. Also, your revolution is bound by physics, by the need for maintainable and repeatable processes, and by the need to connect with other networks to achieve “inter-networking.” So let’s put down the pipe, roll up our sleeves, and see what some of these interesting ideas might do for us.

Cloud Toad
CloudToad is adrift on the great sea of network serenity. - CCIE #15672 (RS, SP) JNCIE-M #721 Twitter: @cloudtoad LinkedIn: http://www.linkedin.com/in/derickwinkworth - Derick's opinions are his own and do not reflect those of the company he works for.
  • Mike

    I agree 100%. So many times while reading about all of this “new hotness” I would say, “Oh, so it’s just like XYZ” or “How is this that different from ABC?” Maybe I’m just old.

    I am excited about SDN/OpenFlow like you are though.

  • Chiradeep Vittal

    To get out of the network, you could use virtual machines hosted on a hypervisor (while waiting for the hardware vendors to provide overlay-friendly interfaces). There are a couple of problems there, of course: scaling up and scaling out these edge VMs. Scale-up is a solvable problem, up to a certain extent. Scale-out is also possible for a multi-tenant DC (edge appliance VM per tenant).

    • Brad Hedlund

      Yep. You can also deploy a physical x86 appliance to encap/decap north-south traffic and bridge between virtual and non-virtual servers. This too can scale horizontally on a per-overlay-segment-ID basis.

      • Brad Hedlund

        Oh, and it’s also possible to have physical or virtual firewall or LB devices providing the gateway functionality. The FW/LB vendors could run Open vSwitch on the southbound interfaces.

          • Derick Winkworth (http://twitter.com/cloudtoad)

          yeah, this is true, I was thinking that right after I posted it.  It doesn’t have to be a router or VRF on the left, just any device sitting on the edge of the overlay network that interfaces to the rest of the world…

          • Jon Hudson

            But I don’t think this plays well with VMware. Hardware license costs are no joke. I can justify them by saying each VM either makes or saves my department money.

            If I am burning HW cycles to do encap/decap (not easy for a multipurpose CPU) do I bill the network group for services rendered?

            In smaller, single-budget places, no big deal. It’s probably cheaper to run a software FW than a physical one.

            Could also be interesting to have a nice CNA do some HW offload

  • Jason

    If it were consistent and nicely structured, you could not sell yet more API kits and abstraction layers to manage your clouds to make it simple again.

    • Florian Heigl

      Yep, like comparing badly integrated (HP Software / Datacenter Orchestration) vs. well integrated (Cisco UCS).
      The former is “hack this here, hack that interface there” and makes them so much more money than something that *works*.

  • Bombardo_Ruiz

    I think the vendors see the writing on the wall. The software point-products that talk to VMware APIs will be the only value-adds soon :) At least for Ethernet switching…

  • Massimo

    Derick, I am not deep enough on these topics to take a strong opinion on this (yet). Most of the things you say seem to make sense. 

    Having said this, I also have to note that, as I was going through your article, I could spot a lot of similarity to what I used to hear from mainframe and Unix people when talking about x86 virtualization. Clearly the mainframe solved in 1980 many of the issues we are still battling today with x86 virtualization… but that isn’t really playing in their favor anyway, and we probably don’t even need to leverage that knowledge (probably). Perhaps the use cases and the requirements are different. I still remember those slides where Unix people would show they could have 64 cores in a partition while with x86 virtualization you could have only 4 cores. Technically correct… but who cares, when 95% of people are using 1 or 2 cores anyway in any given VM?

    Not sure if you follow me, but perhaps we need to abstract a bit from what we have seen so far. Perhaps what we learned yesterday in networking isn’t useful, isn’t applicable, isn’t required in tomorrow’s networking? I am just thinking out loud here… perhaps this is total nonsense.

    I certainly wouldn’t say that traditional networking vendors “don’t get it” while those new SDN software vendors “get it”. This is more of an innovator’s dilemma type of problem.

    I bet that at IBM we could have built a much better x86 virtualization platform (than VMware’s) back in 1990, leveraging our know-how. The real question is: why didn’t we? I think we know the answer.

    Can we make this server/network parallel? I don’t know… maybe it’s way off… but…

    Massimo Re Ferre’ (VMware). 

    • Jon Hudson

      Awesome!

  • Jon Hudson

    Derick!!!

    Man, that was very well said.

    Here’s (I think) the problem:

    To really solve the problem you describe, companies have to make some pretty huge bets. In a “burn the ships” sort of way.

    Many companies are either small enough that a wrong bet will end them, so they are a wee bit paranoid, or big enough that they have large established groups/divisions/etc. so invested in the “old” way that it’s very hard to build anything REALLY disruptive.

    But GREAT points. I think we all need to step up.

    I will say this, though: this is just the beginning; it will be 7-8 years before all the dust settles from this party.

  • Khanammar888

    Great article, very insightful, Derick.