Network, Interrupted

Dear Cisco and Juniper:

It’s been a good run, Cisco.  Thank you for the CCIE.  Thank you, Juniper, for the JNCIE.  I learned a lot about networking because of you.  But you are irrelevant now.

Right now, I can buy a 64-core server with 768GB of RAM from HP for a mere $57k.  That includes 8TB of storage, by the way.  Since Google has graciously contributed Receive Packet Steering and Receive Flow Steering (RPS/RFS) to the Linux kernel, multi-core packet processing is available to network applications.  Combine this with the offloading of SSL and deep-packet inspection to GPU(s) and you have a platform that is far less expensive and potentially far more scalable than dedicated vendor silicon.  That is, what little vendor silicon is left in network gear these days, since Broadcom and the like seem to be in every vendor’s equipment.  Tell me, what happens when we put a Broadcom chip right on the server bus and OpenFlow becomes a device driver?
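For anyone who has not touched RPS: it is switched on per receive queue by writing a hexadecimal CPU bitmask to /sys/class/net/<dev>/queues/rx-<n>/rps_cpus. Below is a minimal Python sketch of building that mask for a box like the one above. The interface name eth0 and the choice to spread across all 64 cores are assumptions for illustration only, and the helper names are hypothetical, not part of any kernel tooling.

```python
def rps_cpu_mask(cpus):
    """Return the comma-separated 32-bit hex bitmask for a set of CPU indices."""
    cpus = set(cpus)
    if not cpus:
        return "0"
    if min(cpus) < 0:
        raise ValueError("CPU indices must be non-negative")

    # Set one bit per CPU that may process received packets.
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu

    # The kernel reads the bitmap as 32-bit words, most significant word first.
    words = []
    while mask:
        words.append(f"{mask & 0xFFFFFFFF:08x}")
        mask >>= 32
    return ",".join(reversed(words))


def rps_sysfs_path(device, rx_queue):
    """Path of the per-queue RPS control file (device name is an assumption)."""
    return f"/sys/class/net/{device}/queues/rx-{rx_queue}/rps_cpus"


if __name__ == "__main__":
    # Hypothetical setup: interface eth0, queue 0, all 64 cores.
    print(rps_sysfs_path("eth0", 0), "<-", rps_cpu_mask(range(64)))
```

The sketch only prints what would be written. Actually applying it means writing the mask string into that file as root, once per receive queue.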

As if that weren’t enough, 6WIND is optimizing packet processing on x86 platforms.  Did I mention Advantech’s products?

Cisco, Juniper… listen to me: your days are numbered.

No wonder Google rolls their own.  I’m about to roll my own too.  My company delivers IP applications and services to thousands of customers’ private networks.  We need lots of customizable NAT.  Not in throughput.  Not even in concurrent sessions.  We need lots of *configured* NAT.  Thousands of rules.  50k right now, as a matter of fact, in just one spot of our network.  Yet Cisco’s ASR platform supports just 16k configured static NATs.  Their solution?  Buy more ASRs.  You’ll need an RP2 w/16GB of RAM.  Probably an ESP-40.  A pair for redundancy.  So we are talking six of those.  Juniper is no better.  An SRX-5800 (fully loaded, this is basically a super-computer) only supports 8k of configurable static NATs.  A Cisco 7206VXR supports 16k.  Nobody wants to jumble their rule base up trying to spread these out across three or four kinds of NAT.  Simple 1:1 bidirectional mappings are the way to go.  And you only support 8k of them.

You know what happened in the server world when companies started using VMs?  They found, in the end, that they had far more VMs than they ever had physical servers.  It’s easy to launch another VM.  Network virtualization is no different.  With the virtualization of network functionality comes an explosion in the number of configured virtual elements.  Scale isn’t just about throughput, concurrent flows, or the size of route-tables.  It’s about the ability to support tens of thousands or even hundreds of thousands of configured elements.  This is going to happen even in companies smaller than Google.  You should have seen this coming when the carriers started using software to manage MPLS.  You should have seen this coming when Google started rolling their own.  You just should have seen this coming.

So here is a far less expensive option for my NAT problem:  A big server running Linux and iptables in containers.  It could be FreeBSD running PF in jails too.  It will probably be more scalable, with some tuning.  I’ll admit that ALG support is awful with both iptables and PF, but how hard would that be to fix?  Thanks to NetPDL we have a way to describe protocol data units in XML.  A general-purpose proxy could be built that could read in NetPDL descriptions.  Redirect traffic on specific ports to this proxy, and in turn this proxy will offload DPI to a GPU.  Adding additional protocol support is as simple as writing an XML description in NetPDL and sending it to the proxy process.  That, or just build that functionality right into PF or iptables.  Guess what, there are already libraries (and source code) available to help us…  Imagine not having to wait for ALG support or fixes.  It brings a tear to my eye.
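To make the "thousands of configured 1:1 NATs" idea concrete, here is a rough Python sketch that renders bidirectional static mappings as input for iptables-restore, which loads a whole ruleset in one pass instead of one iptables call per rule. The address ranges (10.0.0.0 inside, 198.18.0.0 outside), the count of 50,000, and the function names are assumptions made up for the example. Whether a flat list this long performs acceptably is something that would have to be measured, not something this sketch shows.

```python
import ipaddress


def build_static_nat_ruleset(mappings):
    """Render (inside_ip, outside_ip) pairs as iptables-restore text for the nat table."""
    lines = ["*nat", ":PREROUTING ACCEPT [0:0]", ":POSTROUTING ACCEPT [0:0]"]
    seen_inside, seen_outside = set(), set()

    for inside, outside in mappings:
        # Validate addresses up front; a typo here should fail loudly.
        inside_ip = ipaddress.IPv4Address(inside)
        outside_ip = ipaddress.IPv4Address(outside)
        if inside_ip in seen_inside or outside_ip in seen_outside:
            raise ValueError(f"duplicate mapping: {inside} <-> {outside}")
        seen_inside.add(inside_ip)
        seen_outside.add(outside_ip)

        # Inbound: traffic to the outside address is rewritten to the inside host.
        lines.append(
            f"-A PREROUTING -d {outside_ip}/32 -j DNAT --to-destination {inside_ip}"
        )
        # Outbound: traffic from the inside host leaves with the outside address.
        lines.append(
            f"-A POSTROUTING -s {inside_ip}/32 -j SNAT --to-source {outside_ip}"
        )

    lines.append("COMMIT")
    return "\n".join(lines) + "\n"


def example_mappings(count):
    """Generate illustrative mappings; both base ranges are assumptions."""
    inside_base = ipaddress.IPv4Address("10.0.0.0")
    outside_base = ipaddress.IPv4Address("198.18.0.0")
    return [
        (str(inside_base + i + 1), str(outside_base + i + 1)) for i in range(count)
    ]


if __name__ == "__main__":
    ruleset = build_static_nat_ruleset(example_mappings(50_000))
    print(ruleset.count("\n"), "lines generated")
```

The generated text would be fed to iptables-restore inside whichever container owns that slice of the rule base. Keeping the mappings as plain data also makes it easy to split them across containers later.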

Suricata already has CUDA support.  It’s only a matter of time before Snort has it.  Either way, IPS will be open-source and running on commodity hardware soon enough.  Should I mention how obvious it would be to offload SSL or any other kind of encryption to GPUs?

Lastly, there are piles of APIs and libraries available with open-source tools that people can use to make real progress with the usability of network functions.  No one is going to be hand-configuring iptables rules.  Soon applications will describe what they need in API calls and those functions will be created in the network inside of containers that can logically exist anywhere.  Need a firewall?  Launch an lxc with iptables.  Need an IPS?  Launch Suricata in an lxc.  Need a virtual router or virtual switch?  Quagga or Linux bridging in an lxc.  All those containers will be attached to Nicira’s Open vSwitch in the main host (if switch hardware isn’t already integrated right in the server).  All of these elements will be arranged and configured according to API calls, not by an army of router-monkeys.  Containers (or jails) will be monitored for CPU utilization and be automatically moved between servers to maximize resource utilization… thanks to OpenFlow and the already very capable state-sync libraries available to us for Apache (oh yeah, that will be containerized too) and iptables.  Packet capture in this network will be cake:  We can run tshark in any jail or container.
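As a rough illustration of "applications describe what they need in API calls", here is a small Python sketch that turns a list of requested network functions into a plan of containers to launch. Everything in it is hypothetical: the ContainerSpec type, the plan_network_functions call, the tenant name, and the bridge name ovs-br0 are invented for the example. Only the pairing of function to software (firewall to iptables, IPS to Suricata, router to Quagga, switch to Linux bridging) comes from the paragraph above.

```python
from dataclasses import dataclass

# Which open-source tool backs each requested function (pairings from the text).
SOFTWARE_FOR = {
    "firewall": "iptables",
    "ips": "suricata",
    "router": "quagga",
    "switch": "linux-bridge",
}


@dataclass(frozen=True)
class ContainerSpec:
    """Hypothetical description of one container to be launched."""
    name: str
    function: str
    software: str
    attach_to: str  # virtual switch port group on the host (assumed name)


def plan_network_functions(tenant, functions, vswitch="ovs-br0"):
    """Translate a request like ["firewall", "ips"] into container specs."""
    specs = []
    for index, function in enumerate(functions):
        if function not in SOFTWARE_FOR:
            raise ValueError(f"unsupported network function: {function}")
        specs.append(
            ContainerSpec(
                name=f"{tenant}-{function}-{index}",
                function=function,
                software=SOFTWARE_FOR[function],
                attach_to=vswitch,
            )
        )
    return specs


if __name__ == "__main__":
    for spec in plan_network_functions("acme", ["firewall", "ips", "router"]):
        print(spec)
```

The point of the sketch is the shape of the interface: the caller names functions, and the mapping to containers, software, and switch attachment lives in one place that an orchestrator could act on.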

But hey, there might be a market for you making really fast dumb OpenFlow switches to interconnect the servers housing all these great functions…  If you get in on the ground floor now you might be able to squeeze a little extra margin out of it before you have to start competing with Casio.

Sincerely,
CloudToad

Cloud Toad
CloudToad is adrift on the great sea of network serenity. - CCIE #15672 (RS, SP) JNCIE-M #721 Twitter: @cloudtoad LinkedIn: http://www.linkedin.com/in/derickwinkworth - Derick's opinions are his own and do not reflect those of the company he works for.
  • Tommy P

    I love this article. Going in the books as an all time favorite.

  • http://twitter.com/ioshints ioshints

    Fantastic!

    • Sud

      What is it with you server guys? Here we go again, revisiting the old topic of using servers to replace routers and switches. You guys have no idea.

      • JohnF5

        Sud, you must be living in the 2000’s still mate! Can’t you see how our industry is flooded by cert kings who still use GNS3 to lab and stage their designs and study for their CCxx exams? If we all just saw the light and moved to OpenFlow networking, then there is no need for certs as all vendor networking configurations will be done on a nice GUI interface just like your home router has, but obviously more powerful as it will be a high-end server.

        • David ‘ninja’ tran

          Disgraceful concept. I really need a pump.

          • Charter Ops Guy

            Guys, relax. Cisco and Juniper are too BIG and established to succumb to flimsy OpenFlow. It won’t happen, guys. Trust me.

        • Gurjarn

          It’s OK John…we the Network certs will learn coding just enough to implement SDN…wanna challenge?

          Conventional Developers need to get a solid hold on networking if they are to be in a position to implement a robust SDN…mere classroom hands-on is not going to help….they need to work like dedicated Net admins for at least a good 3 yrs…

          Server/VM guys will need to get on both Dev and NWking.

          And we certs will stick to Cisco/Juni once they come up with something similar….

          The WAR hasn’t started yet..don’t cry VICTORY until it’s at least half way thru……you never know what will happen next..buddy…

          Cisco ventured into UCS to counter the SDN stuff….get it now?

  • phptoosoon

    Speaking of Snort…isn’t this what SourceFire has been doing for years with their pay-for-appliances?

  • Rob Bergin

    We tried this with Vyatta and some 1U x86 servers we had – it puked on our NAT table – the ASR at 16k entries isn’t awful but an x64 with some RAM should be able to run massive NAT tables (ours were VoIP related).

  • @LSP42

    Pretty powerful words! Does this mean before long I should quit my CCIE R&S journey and just become a coder instead? After all, it’s mostly about IOS anyway (speaking broadly) right?
    SDN has the ability to re-shape the industry as we know/suffer it in almost every way. Certainly exciting for sure. Derrick, you’ve certainly made a splash with this!

    • @LSP42

      Sorry for misspelling your name. Phone trying to be clever!

  • Jason

    spot on. nice article. just watch the evolution of cisco UCS. This will evolve to a platform exactly as you describe in this article. *unified* computing system.

    • http://twitter.com/dvorkinista mike dvorkin

      It was going to, but the challenge is that it requires software, especially management software, which is cisco’s weak point.

      • Gurjarn

        As of today..Cisco may be weak..but the confusion reigning on the SDN front will give Cisco/Juni ENOUGH time to respond…

        Let me restate..you won’t get dedicated developers to learn C or C++ just to move packets from point A to B…YOU WILL HAVE TO BANK on NETADMINS to do that…

        And SDN is hell-bent on killing the Giants and Network engineers rather than taking an honest approach to solving existing issues….they will miss their step…in the rhetoric…also closing the network field will do the economy no good.

        Also remember, Microsoft has the huge potential to upstage this whole SDN…by bundling everything into windows…it won’t take long before its deep pocket will upstage VMware….and then the only player who can save VMware is CISCO….get THAT…

  • js

    You make a good point in your article that there’s a lot that can be done with commodity x86 hardware and open source software.  However, I think you may be simplifying the problem a bit.  Your article implies that the only problems needing solutions in the networking arena are firewall, IPS and NAT and maybe some routing thrown in the mix.  This may be true in enterprise and data center networks, but there are a lot of other people who buy from Cisco and Juniper (and others) that don’t need any of these capabilities.  I also think you are naive if you believe that Cisco and Juniper (and others) haven’t been thinking about SDN and the impact it will have on their business.  They would be foolish not to have something cooking in their R&D departments.

    • fanboy

      I second this comment. There are other problems today that must be fixed today with “traditional” network. And big vendors like Cisco have already cooked something with similar concept like SDN, check out the direction from Cisco Prime, ASR9000V, Nexus etc. And let’s say even they are too dumb to come up with good product, they can just acquire successful start up companies when they see enough business justification. This is the answer why many big companies can stay in the market for many years, even when the technology has evolved. The big companies can simply adapt to the changes

  • http://twitter.com/brandonrbennett Brandon Bennett

    There is still a scaling problem that cannot be answered by just software.  I do agree that we are looking at smaller “pods” but regardless you are not going to replace your core switch, your MPLS backbone, or other massive scale device with a piece of software.

    Google rolled their own WITH ASICs.   There is a significant cost to that if you are not at the size of Google.

    I want to see this article AFTER you have completed the project and not before.  I do believe it to be possible but it’s not going to be a walk in the park.  This is partially what you are buying when you choose a big name vendor.

  • http://twitter.com/maschipp Michael Schipp

    This is truly a fascinating read (as have been the many tweets leading up to this).  I can see the power here.  It is interesting to me as I think we will see the exact opposite arrive next month or so.  Pushing more feature/services into the switch itself.

    Both could offer some huge performance gains here.

  • sh0x

    Right on!

  • Steve B

    I’m guessing a fair percentage of this post was devil’s advocate and flagging up what is possible now. In which case it was great, some cool ideas indeed!

    Now my only real world reply would be “Roll your own networking infrastructure based on servers running bespoke open source software?” followed by much laughter ;-)

    It fails the supportability test for 99% of enterprises/govt departments in that the skills are in such small supply (Bit like the old OSPF vs IS-IS issue) and with no such thing as vendor support to fall back on I can’t see this being a widespread solution. In a highly skilled (Like you!) niche I’d love to hear about it. But just don’t then leave the place you set it up at as you’d be the support contact for infinity!

    • Mike Fratto

       ‘xactly

    • http://etherealmind.com Etherealmind

      There are a number of cloud providers using platforms like this instead of hardware appliances. Sure, it’s early days for people like us who are risk averse, but Cloud Providers don’t provide guarantees so they can easily afford to take risks with new ideas like this.

      If it works, and so far it _is_ working, then they’ve saved tens of millions in capex plus tens of millions on opex.

      This type of design WILL change the industry by lowering vendor pricing. So I’m in favour of this approach. 

  • JS

    Hmm.. Sounds good in theory … But count how much power you need for the 64 core HP server, and how much it is going to cost you, and compare that with an ASIC solution. Also count that the server will not run in more than 30C ambient, whereas the network gear can go to 60C, and count the power for the air conditioning … When you are done, let’s see what is going to be cheaper over the lifetime of the service. Yes, the margins might be too large in the hardware solutions, and they need to go down, but the ratio is not 10 to 1 as in the power equation.

  • http://twitter.com/Network2501 Network2501

    I stopped reading after NAT. 

  • Allen Baylis

    Like this article …. lol

  • http://twitter.com/nish Nish Vamadevan

    Fail over is something to watch out for when it comes to  vendor independent platforms…

  • Rbauer957

    When comparing Cisco with a company like Google and any other ISP and saying they will soon disappear, you’re forgetting about the HUNDREDS and THOUSANDS of enterprises that use these products. I don’t know about Juniper and couldn’t really care less if they went out of business so I can’t comment about them. I DO know there are a TON of companies that still need to run enterprise equipment and don’t have the need for 16k static NAT entries, if any for that matter. They are here to stay for quite some time.

    • Gurjarn

      SO the dev will then go thru the coding and fix it?

  • deezknuts

    so what happens when this custom deployment breaks?  what are you gonna call for support when the datacenter is on fire?

    • JohnHarr

      Deezknuts,  That’s a good question.  If you’re going to develop a platform then you need your developers to carry a pager.  This effort and its cost should be priced into your plan up front.   But it may still make sense to develop a homegrown device.

      Assuming you’ve done your financial due diligence, and the price is still right, then it’s just a procedural change.  You page your devOps guys instead of paging Cisco/Juniper TAC. 

  • Mikker Gimenez-Peterson

    This is an interesting article, and though I am studying for my Cisco certification (thus obviously invested in the traditional model), I predict long-term the opposite of what you are saying, even though the current trend seems to be towards big iron, and I don’t know that the traditional definition of a routed and switched network will be where we go.  I think that instead of having robustness being defined as one redundant big-iron box, people will purchase “throw-away” modules or “boxes” that will be dedicated to a resource.  Each will have 1 10G link, 1 power supply and either a bank of CPUs, memory or disk space.  Redundancy will be in software.  If any one of these devices dies, the cloud monitor will let you know it’s gone, and you’ll ship it back to the manufacturer in exchange for a new one.  Now, I’m not sure if this will be in a blade architecture, which might have the network innards built into a backplane, or with physical devices that plug into switches.  Either way, cloud already virtualizes the concept of a computer, so why is it still built upon what is basically a traditional computer?

    2c

  • Gurjarn

    Betting everything on software is scary….stability has to be proven beyond human doubt before enterprises can even look at them..

    I have customers who need to know my network setup in depth before they trust us and make business with us….

    SDN can’t even draw a decent topology diagram…see Google’s pathetic SDN WAN diagram….

    Also using coding just to move packets between point A and B is sickening….

    Server and VM admins are also on the same table as Netadmins when it comes to coding. So they won’t venture into coding.

    Regular app developers find this job too boring and mundane to make a career with.

    So MY GUESS IS IT WILL BE UP TO NETADMINS to grab these jobs…network software pro.

    Also what happens to the knowledgebase if one developer moves on?

    As of now, the ppl working on the SDNs are IOS developers moved from Cisco, Juni etc…to google, facebook etc.

    And Enterprises don’t really need a HIGHLY elastic DC anyways..as once the DC is designed and implemented ground up…there won’t be a need to change it at the core too often..

    Also even now there are low priced non-Cisco/Juni routers and switches that work well..but enterprises are still sticking with Cisco/Juni.

    Ongoing support and maintenance too need to be dealt with when working with SDN.

    Enterprise Management needs to understand at a high level how the data is moving from point A to Point B.

  • Ahmed

    So how do you feel about this trend now that you work for JUNIPER?

    Did you just buy last-minute tickets for the TITANIC? :-)

  • cloudycloud

    How many times has cisco failed you on their CCIE tests?
