Tough Questions To Ask Network Vendors When Evaluating Products

Introduction

In my previous post, I proposed investing in careful planning to extract the maximum value from your vendor meeting. But what happens when the presentation begins? In this post, I’ll outline a few high-level questions and lots of in-depth questions to help you get a better understanding of the ‘real’ product that’s being sold.

High-level questions

Let’s kick off with a few high-level questions.

Who is this product for? - Listen carefully here. A new product, especially from a new vendor, should be targeted at a particular set of customers. This allows the company to gain a foothold in the market, and gain some initial success before expanding their product. A huge breadth of feature coverage in a startup is not always a good thing, and it can impact quality by spreading their resources too thinly. If their target market is ‘everybody’ I recommend the ‘run away’ strategy.

What specific problems will your solution solve for a network like mine? – A long customer list can help you gain some confidence in the vendor and show how widely their product is deployed. But beware that the customer logo slide is saying, “Buy it. All your friends did.” This is the ‘social proof’ cognitive bias at work. Don’t be wowed by that list of customers unless their networks closely resemble your own or their needs are similar to yours.

How much does it cost? – Unfortunately, the final price only comes out at the end of a detailed solution design and negotiation phase. You can ask about list prices to get a ball-park figure and then use the detailed questions below to probe further. Watch for euphemisms such as ‘carrier-grade’, which may indicate high reliability but always means ‘really expensive’.

What does that mean? – Don’t be intimidated by your vendors or afraid to look foolish in front of your peers. If you’ve never heard of a ‘patented 3-stage fabric’, then ask. Sometimes it is just marketing fluff or vendor-specific terminology. You shouldn’t be reluctant to ask for definitions and simplified explanations.

Detailed questions

The list of questions below assumes you are purchasing a layer-3 switch or router. It’s impossible to produce an exhaustive list, but these questions should get you thinking. A great way to build a better list of questions is to ask your peers for input. Use IRC, Twitter, the Packet Pushers forum, etc. Every engineer gets burned from misinterpreting product specifications at some point. However, I don’t know a single engineer who wouldn’t share their hard-won lessons to help others avoid the same pitfalls.

Every product has constraints, regardless of who the vendor is. I like to think you shouldn’t be cynical, but you should be highly skeptical. Your job is to identify the assumptions that you’re making about the product and ask targeted clarifying questions.

Hardware questions and tips

Review the datasheet – It tells two stories in one. Each number you see in there is a feature, but it is also a constraint. Look for the asterisks and all those details hidden in the footnotes.

Are there any license-enforced hardware limits? – Yes, the router is ‘capable’ of holding 128K prefixes, but you’ll need a license to use more than 64K. Wow, a virtual router on a 10G-enabled server, but… limited to 50Mbps throughput.

What is the oversubscription ratio? - Ask about the ratio for line-cards to fabric, but also ask about points of oversubscription between the port and the fabric interface. Sometimes the vendor can side-step admitting oversubscription with the help of….

Placement restrictions - The system handles full 40G throughput…*cough* as long as you buy four linecards and use a single port on each.
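To make the oversubscription and placement questions concrete, here is a minimal sketch of the arithmetic. The line card in the example (48 x 10G ports with 240G of fabric bandwidth) is a made-up illustration, not a real product, and the function names are my own.

```python
def oversubscription_ratio(ports, port_gbps, fabric_gbps):
    """Front-panel capacity divided by fabric capacity for one line card."""
    return (ports * port_gbps) / fabric_gbps


def line_rate_ports(port_gbps, fabric_gbps):
    """How many ports on the card can run at full rate at the same time."""
    return int(fabric_gbps // port_gbps)


# Hypothetical line card: 48 x 10G ports, 240G towards the fabric.
ratio = oversubscription_ratio(ports=48, port_gbps=10, fabric_gbps=240)
usable = line_rate_ports(port_gbps=10, fabric_gbps=240)
print(f"Oversubscription {ratio:.1f}:1, line rate on {usable} of 48 ports")
```

If the vendor quotes line rate, ask which of these two numbers they are really talking about, and repeat the calculation for every hop between the port and the fabric.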

Are there any caveats about combinations of line cards, sups, fabrics, and minimum software levels? - For example, the Nexus F2 needs a dedicated VDC, which may restrict you elsewhere.

Routing table capacity – It’s great to know the raw number of hardware-switched prefixes, but watch out for “Longest prefix match versus Host routes”. /32 prefixes don’t need TCAM, so they can be handled in CAM or in a hash table in RAM. Beware of caveats though, such as no ECMP for /32s.

Hardware table capacity - How many prefixes? Watch for caveats on ‘prefix distribution’ where route table capacity varies based on prefix lengths. What is the effect on capacity when you enable IPv6 or uRPF? Is the TCAM shared between VLANs/ACLs/PBR/indirect and direct routes? Does the system dynamically re-partition or do you hard-code and reboot (a la SDM profiles)?

Legacy hardware support – Can you mix old and new generation hardware in a single chassis? Will the system performance drop down to match the lowest capacity line card? For a chassis-based system, it’s probably best to have things consistently bad rather than having intermittent issues when you hit the limits of the lower capacity line card.

ECMP / LAG size – How efficient is the ECMP/LAG hashing? A low number of hash buckets will overload some of your links if you fall back to a non-power-of-two bundle size, for example when one link in a four-link bundle fails.
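The hash bucket problem is easy to see with a few lines of arithmetic. This sketch assumes the hardware spreads its buckets across member links round-robin and that flows hash evenly across buckets. Real implementations vary, so treat it as an illustration of the question to ask rather than a model of any particular platform.

```python
from collections import Counter


def bucket_shares(num_buckets, num_links):
    """Fraction of hash buckets (and so of traffic) landing on each link."""
    # Assumption: bucket b is mapped to link (b mod num_links).
    counts = Counter(b % num_links for b in range(num_buckets))
    return [counts[link] / num_buckets for link in range(num_links)]


# 8 buckets over 4 links divides evenly.
print(bucket_shares(8, 4))
# Lose one link and 8 buckets over 3 links cannot divide evenly.
print(bucket_shares(8, 3))
# More buckets shrink the imbalance for the same 3-link bundle.
print(max(bucket_shares(256, 3)))
```

With 8 buckets and 3 links, two links each carry 3 of the 8 buckets while the third carries 2, so the busiest links run about 12.5% above an even one-third share. With 256 buckets the busiest link is less than 1% above it.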

Transceivers - Which transceivers are supported, and which are not? Are third-party transceivers allowed? Your vendor will not enjoy this question as they make a large margin on re-badged optics, but that shouldn’t be your concern.

Software questions and tips

Look closely at the feature roadmap - Resist the temptation to rush this slide. It’s easy to assume which features an entry-level product should have, but you’ll often see these basic features listed on the ‘next release’ slide.

Real world exposure - If there is a key feature you want, how long has it been baked-in by real customers? How long ago was the code shipped, and how many customers with networks like yours are using the features?

Proprietary features - This is not always a bad thing, but you need to clarify the open or proprietary status of each feature. The vendors are working to solve problems and innovate to provide new features. If no open standard alternative exists then you should investigate, but committing to a proprietary feature is a strategic and potentially costly decision. Push for more details if the vendor has ‘conditional compliance’ to a standard.

Licensing – Get a list of all licenses available for your solution and a full breakdown of what each license covers. Closely examine all of these licenses, as you’ll often find key features in the advanced licenses which you were assuming were present in the base license. Watch out for new generations of hardware which prompt a change in the licensing model.

Are there any restrictions or caveats on that performance figure? - You’re looking for hidden resource usage penalties. You can hold 16K ACEs in your ACL TCAM but this is 8K unique entries, mirrored across two slices to avoid outages during ACL application.  Also, if your ACL is ‘x’ lines long, you may need to burn ‘x’ ACL TCAM slots for every interface you apply the ACL on.
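Here is a rough sketch of how those two penalties stack up. The 16K figure and the two-slice mirroring come from the example above. The 200-line ACL, the 48 interfaces and the function names are hypothetical values I picked for illustration.

```python
def usable_aces(advertised_entries, slices=2):
    """Unique ACEs left once entries are mirrored across TCAM slices."""
    return advertised_entries // slices


def tcam_slots_needed(acl_lines, interfaces, shared=False):
    """TCAM slots consumed when one ACL is applied to many interfaces.

    shared=True models hardware that programs the ACL once and reuses it;
    shared=False models one full copy per interface.
    """
    return acl_lines if shared else acl_lines * interfaces


capacity = usable_aces(16 * 1024)  # 16K advertised, mirrored over 2 slices
needed = tcam_slots_needed(acl_lines=200, interfaces=48)
print(f"capacity={capacity}, needed={needed}, fits={needed <= capacity}")
```

Under these assumptions a modest 200-line ACL applied to 48 interfaces needs 9,600 slots and no longer fits in the 8,192 unique entries, even though the datasheet says 16K. Ask the vendor which of the two models their hardware follows.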

Wrap-up

What important questions do you feel are missing from this list?  Add your voice to the comments.

John Harrington
John is an experienced data center engineer with a background in mobile telecoms. He works as a network test engineer for a large cloud service provider, and is gradually accepting that he's a nerd. He blogs about network technology and careers at theNetworkSherpa.com. You can reach him on Twitter at @networksherpa.

  • Peter McCreesh

    Another great post, John. One that has burned me in the past was “…it supports 20Gbps throughput…”, only for me to find out later that it actually meant 10Gbps in each direction rather than a full 20Gbps full duplex. Also, add in a few ACLs and your figures hit the floor. Take average packet size into account and the figures sink again. Keep up the great work.

    • http://thenetworksherpa.com/ john harrington

      Hey Thanks Peter. Yeah I forgot about the simplex/duplex marketing sneakiness. Great point on the packet sizes too, especially when people are quoting packet forwarding performance figures.

  • http://umairhoodbhoy.net/ Umair Hoodbhoy

    These are some really good pointers. Great writeup!

    • http://thenetworksherpa.com/ john harrington

      Thanks Umair!

  • Kyle Bader

    Just ask if they support IPv6. It’s truly shocking how many vendors can only switch v6 in 10G ToR gear. Where’s my OSPFv3?

    • http://thenetworksherpa.com/ john harrington

      Yep, fair question. Asking that question clearly and firmly will help influence the vendor’s roadmap, albeit very slowly.

  • http://twitter.com/cloudtoad Derick Winkworth

    Would like to add something about the term “oversubscription.” This is an ambiguous term that means different things to different people. Sometimes, “linerate” or “non-blocking” are the terms vendors will understand. They like to worm their way around this too when they know they can’t support this. It’s also worth asking if they can do this for both small and large packets (believe it or not, sometimes performance decreases as packets get larger).

    They like to throw out the argument that “is there such a thing as non-blocking? If you have four 10G ports passing traffic to a single 10G port, isn’t that blocking by nature?” When they throw that out there, you know they are bullshitting you. What vendors need to understand is that 4:1 or 8:1 “oversubscribed” line cards are a sure way for someone up the ordering chain (a manager or some other non-technical person) to screw the engineering staff.

    The second comment is about MTBF. These are largely pointless numbers in the year 2013. If MTBF is 50k hours or 300k hours, it doesn’t matter. 50k is nearly six years. Many other factors, within that six years, will exert larger impact on your network. Changes in requirements, traffic patterns, etc. Also, there is a thing called “resiliency” which you may want to work into your network design.

    The last comment is about latency. We are long past the threshold where +/- 100, 200, or 300ns of delay matters for 99% of networks out there. Don’t let vendors waste your time going into great detail about this unless you are running RDMA or HPC or other such things.

    • http://thenetworksherpa.com/ john harrington

      Hey Derick,

      I love getting feedback like this. It proves that you get back more than you give. I like your approach of looking at those ‘supposed’ features and calling them out as non-features or just plain irrelevant.

      I take your point about using non-blocking over oversubscription. It’s crazy to believe that some vendors would try the ‘what’s really non-blocking’ line. What they’re describing is output contention, but angles like that from a vendor leave a very bad taste in the mouth. I like it better when the vendor explains clearly what they do and don’t do, rather than trying to redefine what the customer wants.

      Thanks again for the feedback, much appreciated.

      /John H

  • http://twitter.com/PaulALeroux Paul A. Leroux

    Leave the technology out of it. I try to get my customers to ask their other partners about recurring costs such as annual maintenance and support. People always want premo gear but ROI is never met when TCO is through the roof.