In my previous post, I proposed investing in careful planning to extract the maximum value from your vendor meeting. But what happens when the presentation begins? In this post, I’ll outline a few high-level questions and lots of in-depth questions to help you get a better understanding of the ‘real’ product that’s being sold.
Let’s kick off with a few high-level questions.
Who is this product for? - Listen carefully here. A new product, especially from a new vendor, should be targeted at a particular set of customers. This allows the company to gain a foothold in the market and some early success before expanding the product. A huge breadth of feature coverage in a startup is not always a good thing, and it can hurt quality by spreading the company’s resources too thinly. If their target market is ‘everybody’, I recommend the ‘run away’ strategy.
What specific problems will your solution solve for a network like mine? – A long customer list can help you gain some confidence in the vendor and how widespread their product is. But beware that the customer logo slide is saying, “Buy it. All your friends did.” This is the ‘social proof’ cognitive bias at work. Don’t be wowed by that list of customers unless their networks closely resemble your own or they have similar needs to you.
How much does it cost? – Unfortunately, the final price only comes out at the end of a detailed solution design and negotiation phase. You can ask about list prices to get a ball-park figure and then use the detailed questions below to probe further. Watch for euphemisms such as ‘carrier-grade’, which may indicate high-reliability but always means ‘really expensive’.
What does that mean? – Don’t be intimidated by your vendors or afraid to look foolish in front of your peers. If you’ve never heard of a ‘patented 3-stage fabric’, then ask. Sometimes it is just marketing fluff or vendor-specific terminology. You shouldn’t be reluctant to ask for definitions and simplified explanations.
The list of questions below assumes you are purchasing a layer-3 switch or router. It’s impossible to produce an exhaustive list, but these questions should get you thinking. A great way to build a better list of questions is to ask your peers for input. Use IRC, Twitter, the Packet Pushers forum, etc. Every engineer gets burned by misinterpreting product specifications at some point, and I don’t know a single engineer who wouldn’t share their hard-won lessons to help others avoid the same pitfalls.
Every product has constraints, regardless of who the vendor is. I like to think you shouldn’t be cynical, but you should be highly skeptical. Your job is to identify the assumptions that you’re making about the product and ask targeted clarifying questions.
Hardware questions and tips
Review the datasheet – It tells two stories in one. Each number you see in there is a feature, but it is also a constraint. Look for the asterisks and all those details hidden in the footnotes.
Are there any license-enforced hardware limits? – Yes, the router is ‘capable’ of holding 128K prefixes, but you’ll need a license to use more than 64K. Wow, a virtual router on a 10G-enabled server, but…limited to 50Mbps throughput.
What is the oversubscription ratio? - Ask about the ratio for line-cards to fabric, but also ask about points of oversubscription between the port and the fabric interface. Sometimes the vendor can side-step admitting oversubscription with the help of….
Placement restrictions - The system handles full 40G throughput…*cough* as long as you buy four linecards and use a single port on each.
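To make the oversubscription arithmetic concrete, here is a minimal sketch. The line card in it is hypothetical (four 40G ports sharing a single 40G fabric connection); the function name and the numbers are my own assumptions, not figures from any datasheet.

```python
def oversubscription_ratio(port_count, port_speed_gbps, fabric_gbps):
    """Front-panel bandwidth divided by fabric bandwidth for one line card."""
    front_panel_gbps = port_count * port_speed_gbps
    return front_panel_gbps / fabric_gbps


# Hypothetical card: 4 x 40G ports, one 40G connection to the fabric.
all_ports_used = oversubscription_ratio(4, 40, 40)  # 4:1 oversubscribed
one_port_used = oversubscription_ratio(1, 40, 40)   # 1:1, the 'full 40G' case

print(f"All ports in use: {all_ports_used:.0f}:1")
print(f"One port per card: {one_port_used:.0f}:1")
```

If the ‘line rate’ claim only holds for the second case, you now know how many line cards the quoted throughput really requires.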
Are there any caveats about combinations of line cards, sups, fabrics, minimum software levels? - For example, the Nexus F2 needs a dedicated VDC, which may restrict you elsewhere.
Routing table capacity – It’s great to know the raw number of hardware-switched prefixes, but watch out for “Longest prefix match versus Host routes”. /32 prefixes don’t need TCAM, so they can be handled in CAM or in a hash table in RAM. Beware of caveats though, such as no ECMP for /32s.
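If the distinction between host routes and longest prefix match is fuzzy, the following software sketch may help. It is only an illustration of the two lookup styles, not how any particular ASIC works: the exact-match dictionary stands in for the CAM or hash table, the linear scan stands in for the TCAM, and all prefixes and interface names are made up.

```python
import ipaddress

# Exact-match table for /32 host routes (stand-in for CAM / hash table).
host_routes = {ipaddress.ip_address("10.0.0.5"): "eth1"}

# Variable-length prefixes that need longest prefix match (stand-in for TCAM).
lpm_routes = {
    ipaddress.ip_network("0.0.0.0/0"): "uplink",
    ipaddress.ip_network("10.0.0.0/8"): "eth0",
    ipaddress.ip_network("10.0.0.0/24"): "eth2",
}


def lookup(destination):
    """Return the egress interface for a destination IP address string."""
    addr = ipaddress.ip_address(destination)
    # A host route is a single exact-match lookup.
    if addr in host_routes:
        return host_routes[addr]
    # Otherwise pick the most specific covering prefix.
    # The default route guarantees at least one match.
    matches = [net for net in lpm_routes if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return lpm_routes[best]


print(lookup("10.0.0.5"))   # host route
print(lookup("10.0.0.9"))   # most specific prefix is 10.0.0.0/24
print(lookup("192.0.2.1"))  # only the default route matches
```

The two tables are sized and limited independently, which is why a single ‘routes supported’ number on a datasheet deserves a follow-up question.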
Hardware table capacity - How many prefixes? Watch for caveats on ‘prefix distribution’ where route table capacity varies based on prefix lengths. What is the effect on capacity when you enable IPv6 or uRPF? Is the TCAM shared between VLANs/ACLs/PBR/indirect and direct routes? Does the system dynamically re-partition or do you hard-code and reboot (a la SDM profiles)?
Legacy hardware support – Can you mix old and new generation hardware in a single chassis? Will the system performance drop down to match the lowest-capacity line card? For a chassis-based system, it’s probably best to have things consistently bad rather than having intermittent issues when you hit the limits of the lower-capacity line card.
ECMP / LAG size – How efficient is the ECMP/LAG hashing? A low number of hash buckets will overload some of your links if you fall back to a non-power-of-two bundle size.
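Here is a rough sketch of why the bucket count matters. It assumes traffic hashes evenly across the buckets and that buckets are dealt out to member links round-robin; real hardware may assign buckets differently, so treat the bucket counts (8 and 256) as illustrative assumptions only.

```python
def link_shares(num_buckets, num_links):
    """Fraction of traffic each link carries when hash buckets are
    assigned to links round-robin and traffic is even across buckets."""
    counts = [0] * num_links
    for bucket in range(num_buckets):
        counts[bucket % num_links] += 1
    return [count / num_buckets for count in counts]


def imbalance(shares):
    """Ratio of the busiest link's share to the quietest link's share."""
    return max(shares) / min(shares)


# A 4-link bundle loses a member and falls back to 3 links.
few_buckets = link_shares(8, 3)     # buckets split 3, 3, 2
many_buckets = link_shares(256, 3)  # buckets split 86, 85, 85

print(few_buckets, imbalance(few_buckets))
print(imbalance(many_buckets))
```

With only 8 buckets, two of the three links carry 50% more traffic than the third. With 256 buckets the difference is barely over 1%.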
Transceivers - What transceivers are supported, which are not supported? Are 3rd party transceivers possible? Your vendor will not enjoy this question as they make a large margin on re-badged optics, but that shouldn’t be your concern.
Software questions and tips
Look closely at the feature roadmap - Resist the temptation to rush this slide. It’s easy to assume which features an entry-level product should have, but you’ll often see these basic features listed on the ‘next release’ slide.
Real world exposure - If there is a key feature you want, how long has it had to bake in with real customers? How long ago was the code shipped, and how many customers with networks like yours are using the feature?
Proprietary features - This is not always a bad thing, but you need to clarify the open or proprietary status of each feature. The vendors are working to solve problems and innovate to provide new features. If no open standard alternative exists then you should investigate, but committing to a proprietary feature is a strategic and potentially costly decision. Push for more details if the vendor has ‘conditional compliance’ to a standard.
Licensing – Get a list of all licenses available for your solution and a full breakdown of what each license covers. Closely examine all of these licenses, as you’ll often find key features in the advanced licenses which you were assuming were present in the base license. Watch out for new generations of hardware which prompt a change in the licensing model.
Are there any restrictions or caveats on that performance figure? - You’re looking for hidden resource usage penalties. You can hold 16K ACEs in your ACL TCAM but this is 8K unique entries, mirrored across two slices to avoid outages during ACL application. Also, if your ACL is ‘x’ lines long, you may need to burn ‘x’ ACL TCAM slots for every interface you apply the ACL on.
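The ACL example above is easy to sanity-check with a few lines of arithmetic. This sketch uses the 16K figure and two-slice mirroring from the example; the 500-line ACL and the 24 interfaces are hypothetical values I picked for illustration, and the function names are my own.

```python
def usable_aces(raw_tcam_entries, mirrored_slices=2):
    """Unique ACEs available when every entry is mirrored across slices."""
    return raw_tcam_entries // mirrored_slices


def tcam_slots_needed(acl_lines, interfaces, shared_across_interfaces=False):
    """TCAM slots consumed by one ACL applied to several interfaces.

    If the platform cannot share entries between interfaces, each
    interface burns its own copy of every ACL line.
    """
    if shared_across_interfaces:
        return acl_lines
    return acl_lines * interfaces


usable = usable_aces(16 * 1024)      # 16K raw entries, mirrored
needed = tcam_slots_needed(500, 24)  # hypothetical ACL and port count

print(f"Usable: {usable}, needed: {needed}, fits: {needed <= usable}")
```

A 500-line ACL sounds tiny next to ‘16K ACEs’, but under these assumptions it does not fit once it is applied to 24 interfaces.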
What important questions do you feel are missing from this list? Add your voice to the comments.