OpenFlow Switching Performance: Not All TCAM Is Created Equal

The fundamental promise of OpenFlow – that any switches and any controllers can be used to build a network – was perhaps always a fantasy. Every protocol definition has corner cases unanticipated by its authors; OpenFlow is no exception. And to make it implementable across as wide a range of hardware and software systems as possible, the OpenFlow 1.0 standard has a significant number of optional features; even those that are required can be implemented in different ways on the underlying hardware, with very different results.

This article begins to examine some of the different ways that OpenFlow is implemented. I’m going to use five different switches as examples:

The heart of OpenFlow, the 12-tuple of fields against which each packet is matched to determine how it will be handled, is the first place where implementations begin to differ. The standard does not make any of those fields optional, and indeed all production OpenFlow switches support them. The problem arises when one considers what ‘support’ really means.

For a purely virtual implementation like Open vSwitch, there is no possibility of hardware acceleration of the matching process; every flow is handled by the system CPU, as is everything else, and the expectation is that performance will be determined by CPU power. Hardware switches are another story. In their natural mode of operation, they can move packets between all of their ports at full line rate, so it’s a reasonable assumption that they can do the same when used with OpenFlow. Sadly, that’s not the case.

Every hardware switch has a finite amount of TCAM, which is critical for line-speed forwarding, and it can hold only a finite number of flows. A typical switch supports on the order of a thousand 12-tuple flows; in our list the number ranges from 750 for the NEC PF5820 to 4000 on the MLX. But there’s a twist. The PF5820 and some other switches can hold far more flows if they’re only required to match the Layer 2 fields; in that case the TCAM will support more than 80,000 entries. A controller might easily exceed the 750-flow limit unless it’s aware of the situation and can use Layer 2 matching for some traffic. Once the limit is reached, the switch might refuse to accept more flows, or it might try to fail more gracefully and process them in software; of course, the switch has no way of knowing which behavior the controller wanted, so it’s a coin toss whether it gets it right.
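
To make the bookkeeping concrete, here is a minimal sketch of how a controller might track the two capacities itself, since the switch won’t do it on the controller’s behalf. The class name, the field grouping, and the idea that the Layer 2 and 12-tuple tables are counted independently are all assumptions made for illustration; only the 750 and 80,000 figures come from the discussion above.

```python
from dataclasses import dataclass

# ASSUMPTION: which OpenFlow 1.0 match fields count as "Layer 2" for this
# purpose. Real hardware may draw the line differently.
L2_FIELDS = frozenset(
    {"in_port", "dl_src", "dl_dst", "dl_type", "dl_vlan", "dl_vlan_pcp"}
)


@dataclass
class TcamBudget:
    """Hypothetical controller-side model of a switch's flow capacity."""

    full_capacity: int  # entries available for full 12-tuple matches
    l2_capacity: int    # entries available for Layer 2-only matches
    full_used: int = 0
    l2_used: int = 0

    def place(self, match_fields):
        """Decide where a flow would land: 'l2', 'full' or 'rejected'."""
        fields = set(match_fields)
        # Prefer the large Layer 2 table whenever the match allows it.
        if fields <= L2_FIELDS and self.l2_used < self.l2_capacity:
            self.l2_used += 1
            return "l2"
        # Otherwise the flow consumes one of the scarce 12-tuple entries.
        if self.full_used < self.full_capacity:
            self.full_used += 1
            return "full"
        # ASSUMPTION: the controller treats a full table as a hard stop
        # rather than letting the switch fall back to software.
        return "rejected"


# Capacities quoted in the text for the NEC PF5820.
pf5820 = TcamBudget(full_capacity=750, l2_capacity=80_000)
print(pf5820.place({"dl_dst", "dl_vlan"}))             # Layer 2-only match
print(pf5820.place({"dl_type", "nw_dst", "tp_dst"}))   # needs a 12-tuple entry
```

A controller that keeps this kind of count can at least decide for itself what happens at the limit, instead of leaving the choice to the switch.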

And it gets worse – for some hardware, the number of flows is only one kind of limitation. It’s important to keep in mind that most switches weren’t designed with anything like OpenFlow in mind, especially when their interface ASICs were laid out. The chips do a fine job of switching, and frequently handle basic Layer 3 functions as well, but OpenFlow asks for a great deal more. The Pica8 and NEC both support 12-tuple flows in hardware. The MLX can handle all 12 matches, but not all at once; each port has to be preconfigured in either Layer 2 or Layer 3 mode, which determines which fields are active. The HP has the most complex story of all. First, the rules are slightly different depending on the chip’s generation (HP calls them v1 and v2). Considering just the v2 rules, for the sake of some simplicity, we find that a flow will be in hardware if it matches on:

  • VLAN ID, VLAN priority and/or input port
  • VLAN ID, VLAN priority, input port and/or any of the IP and Layer 4 fields – but only if the Ethertype is IP (0x800)
  • VLAN ID, VLAN priority, input port and/or source/destination Ethernet addresses – but only if the Ethertype is not IP
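
The three categories above can be sketched as a simple check that a controller might run before installing a flow. This is an illustration of the rules as listed, not HP’s actual logic: the function name and the dictionary representation of a match are hypothetical, and the handling of a wildcarded Ethertype (treated here as not eligible for the second or third category) is an assumption.

```python
ETH_TYPE_IP = 0x0800

# Field names follow the OpenFlow 1.0 match structure.
BASE = frozenset({"dl_vlan", "dl_vlan_pcp", "in_port"})
IP_L4 = frozenset({"nw_src", "nw_dst", "nw_proto", "nw_tos", "tp_src", "tp_dst"})
ETH_ADDR = frozenset({"dl_src", "dl_dst"})


def hp_v2_in_hardware(match):
    """Return True if the match fits one of the three listed categories.

    `match` maps field name -> value and contains only the fields that
    are NOT wildcarded.
    """
    fields = set(match)

    # Category 1: VLAN ID, VLAN priority and/or input port only.
    if fields <= BASE:
        return True

    # Categories 2 and 3 are both conditioned on the Ethertype.
    # ASSUMPTION: a wildcarded Ethertype satisfies neither condition.
    dl_type = match.get("dl_type")
    if dl_type is None:
        return False

    rest = fields - {"dl_type"}
    if dl_type == ETH_TYPE_IP:
        # Category 2: base fields plus IP and Layer 4 fields.
        return rest <= BASE | IP_L4
    # Category 3: base fields plus Ethernet addresses, non-IP traffic.
    return rest <= BASE | ETH_ADDR


# An IP flow matching on destination address and port fits category 2.
print(hp_v2_in_hardware({"dl_type": 0x0800, "nw_dst": "10.0.0.1", "tp_dst": 80}))
# An IP flow that also matches a MAC address fits none of the three.
print(hp_v2_in_hardware({"dl_type": 0x0800, "dl_dst": "00:11:22:33:44:55"}))
```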

Other combinations will happily be accepted, but they’ll be handled in software. That’s not so bad for some traffic, but if a controller chooses to push a file transfer or a video stream by using a flow match that doesn’t fit one of the three hardware categories, the switch stops being a wire-rate Gigabit Ethernet performer and becomes a 1 Mbps chokepoint. That limit is configurable by adjusting the packets per second that the CPU is allowed to handle, but increasing it much beyond the default of 1000 runs the risk of overwhelming the processor.
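
As a rough back-of-the-envelope check on that software ceiling, the bit rate is just the packet-rate limit times the packet size. The helper below is a sketch with a hypothetical name; the packet sizes are illustrative assumptions, and the arithmetic ignores framing overhead.

```python
def software_path_mbps(packets_per_second, packet_bytes):
    """Approximate throughput in Mbps for a CPU-limited packet rate."""
    bits_per_second = packets_per_second * packet_bytes * 8
    return bits_per_second / 1_000_000


# Default limit of 1000 packets per second, at a few assumed packet sizes.
for size in (64, 125, 1500):
    print(size, "bytes:", software_path_mbps(1000, size), "Mbps")
```

At 1000 packets per second, roughly 125-byte packets work out to 1 Mbps, and even full 1500-byte frames reach only 12 Mbps, still two orders of magnitude short of Gigabit line rate.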

None of these limitations are fatal; meaningful work can be done with any of these switches, if only the controller is aware of them and its choices of flow rules are made appropriately. Sadly, the OpenFlow standard doesn’t provide any mechanism for the switch to communicate this nuance of its capabilities. There is a way for the controller to ask each switch what it can do, but since the standard requires all of the match fields to be supported, they aren’t included in the response. And there’s no way for the switch to talk about what it can do in hardware or software, or what combinations of fields are available, or how many flows it can support.
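
To see what the controller actually gets back, here is a minimal sketch that unpacks the fixed part of an OpenFlow 1.0 features reply. The field layout reflects my reading of the ofp_switch_features structure in openflow.h and should be checked against the header; the function name and the sample values are made up for illustration.

```python
import struct

# Layouts as understood from the OpenFlow 1.0 openflow.h (network byte order).
OFP_HEADER = struct.Struct("!BBHI")        # version, type, length, xid
FEATURES_BODY = struct.Struct("!QIB3xII")  # datapath_id, n_buffers, n_tables,
                                           # 3 pad bytes, capabilities, actions
OFPT_FEATURES_REPLY = 6


def parse_features_reply(data):
    """Decode the fixed-size portion of a features reply (ports ignored)."""
    version, msg_type, length, xid = OFP_HEADER.unpack_from(data, 0)
    if msg_type != OFPT_FEATURES_REPLY:
        raise ValueError("not a features reply")
    dpid, n_buffers, n_tables, capabilities, actions = (
        FEATURES_BODY.unpack_from(data, OFP_HEADER.size)
    )
    return {
        "datapath_id": dpid,
        "n_buffers": n_buffers,
        "n_tables": n_tables,
        "capabilities": capabilities,  # bitmap of OFPC_* flags
        "actions": actions,            # bitmap of supported OFPAT_* actions
    }


# A hand-built sample message with invented values and no port list.
sample = OFP_HEADER.pack(0x01, OFPT_FEATURES_REPLY, 32, 1) + FEATURES_BODY.pack(
    0x1, 256, 1, 0xC7, 0xFFF
)
print(parse_features_reply(sample))
```

There is a bitmap for actions and one for capabilities, but nothing in this structure describes match fields, field combinations, or hardware versus software handling.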

It’s obvious that the controller’s job becomes much more difficult when these quirks and customizations are considered; the value of a standard mechanism for programming flows on a switch is substantially eroded. And unfortunately the situation only gets worse as we look deeper into the protocol, but that’s a matter for the next installment.

About Bill Owens

Bill has had his hands in networks since 2400 baud was fast, but lately he thinks that things like DNS, IPv6 and OpenFlow are more fun. During the day he helps take care of a statewide optical/IP network. You can find him on Twitter as @owens_bill and lurking around lots of different network-related mailing lists.

  • Art Fewell

    If vendors weren’t so maniacally focused on DC … how different these problems would be if we started out at the less critical and less demanding campus/branch and then moved towards the core as the technology matured – which also allows administrator skillsets to evolve in a healthy and productive way. The same way as WLAN and VoIP … and despite the focus on DC, I still think campus SDN will mature more quickly than the physical DC fabric; it’s simply less complex, and with wireless controllers being as mature as they already are, this is a slam dunk. Glad vendors are getting there now, but this whole thing has been myopic. If a vendor had had a solid campus SDN solution a year ago, which was technically possible, it would have brought a lot of meat and direction to temper the hype, and that vendor would be riding high right now. This will happen, but the first-mover advantage is long gone now. Missed opportunity for Cisco competitors, and frankly for Cisco too … if they had offered a fully automated campus a year ago their base would be churning like crazy today, bumping their cash funnel way up and allowing them to more greatly subsidize their UCS and other adjacencies. Myopia and missed opportunities … this is the reason we need a strong, open community to check and balance the hell out of vendors and get the vision that we, the consumers that pay for this stuff, want.

  • Michael Gonnason

    I have been wondering when this would come to a head.

    Almost reminds me of the days of ATM and how it promised to fix everything, but then in implementation it was expensive and convoluted.

    Time will tell us how good the hardware gets.

  • Craig Mills

    Dear Bill,
    I am a Technical Product Manager at HP Networking, covering switch software and SDN campus. I was glad to see the inclusion of the 3800 in your OpenFlow switching performance post. I would like to share a few details on the OpenFlow implementation on HP Networking devices. The 3800 running KA_15_10_0003 or later has an optional setting that will reject any OpenFlow rule that will not be performed in hardware. The switch will inform the controller that the rule is not a valid rule. While this doesn’t resolve the issue your article addresses, it does help to mitigate the complexity of our hardware vs. software rule processing.

    I also couldn’t help but notice a commenter’s concern about SDN support in the campus. The 3800 is a campus edge/distribution switch with PoE+ support. While the Data Center is receiving much of the SDN attention, there are campus solutions making their way to market.

    Thanks for your time,
    Craig

  • Rob Sherwood

    Thanks for the article drawing out some of the differences in implementation – I think the more that people understand in this space, the better decisions they will be able to make.

    One quick technical correction: the OpenFlow protocol does _not_ require that all switches be able to match on all fields. This is a common misconception and you are certainly not the first person to say it, but if you look at all of the versions of OpenFlow from 1.0 onwards, there is a message called ofp_table_stats that describes each table and the specific fields that the table can match on. From the beginning, switches with different capabilities, including ones you’ve mentioned here, have used this table to express their different capabilities.

    Now, it puts more work on the side of the controller writers to adhere to the capabilities advertisement and to change their logic accordingly, but the message is there nonetheless.

    Great article,

    - Rob Sherwood
    CTO, Controller Technologies
    http://www.bigswitch.com

    • owens_bill

      Rob, thanks for making those two points; I’ve spent the last hour researching, and it’s helped me learn yet more about the standard – there’s always some new aspect to look at, it seems. I’ve based my opinion about the requirement for the full 12-tuple match on two points: the spec makes explicit statements about flexibility in other areas (actions, stats, etc.) but doesn’t do so for the match fields; and the ofp_switch_features structure includes the actions bitmap, but doesn’t have an equivalent for match. I did look through the spec for an explicit statement, though, and didn’t find it. I’d be very curious about the prevailing opinion amongst implementers whether the matches are all required.

      Even after doing some research I’m not sure that I understand the statement about ofp_table_stats, though. I have to confess that I’d never looked at that part of the protocol before tonight, but the description in the spec and the openflow.h header file seems to indicate that the wildcards field in that struct indicates which match fields support wildcards, not which are supported. I’m not sure it would be possible to express the distinction between a switch that supported, for example, a choice of exact or wildcard match on VLAN_ID, from one that was trying to indicate that it could not match on VLAN_ID and therefore the field was effectively wildcarded regardless of the match that might be requested in a specific flow.

      And unfortunately there’s no discussion anywhere in the standard, as far as I can tell, about the distinction between software and hardware processing, or slow and fast path, or however you might want to refer to it. I’m sure that there is sensitivity about specifying exact performance information, but since hardware acceleration makes such a huge practical difference it seems to me that some indication is worthwhile. Of course that indication ought to extend to other aspects of the switch behavior, but as I said in the article that’s a topic for, well, another article ;)

  • tamil

    Thank you for sharing valuable firsthand knowledge.
    And about your next installment, would you please talk about actions?
    I suppose it might be more to do with ASIC than with TCAM, but I’m curious because over the past week I have been stunned to realize the current status of HW support for 1.0 flow actions leaves way too much to be desired.
    Is there any chipset at all that does simple L3 field flow_mod actions, such as OFPAT_SET_NW_DST? It’d be great if I could hear positive reports. -Tami

  • layer4down

    Fascinating article, Bill. You make some interesting points.

    While the scope and pervasiveness of SDN technologies in the network infrastructure is still being tossed around (and I imagine will be for some time to come), it’s my opinion that we as an industry might consider _not_ comparing the implementation of OpenFlow in different products as we might other standard protocols like BGP or STP. Not that we need more bureaucracy here, but I wonder if the industry as a whole just might benefit from some type of technology certification body (even the ONF), similar to something like the WiFi Alliance (à la Wi-Fi CERTIFIED logos). After all, the implications of SDN for networking as a whole are far greater, wider, and deeper than those of some common networking protocol. Thoughts?

    Looking forward to future installments.