This is Part 1 in a special series looking at the inside of your network device. Although software will be at the heart of network innovation, it still runs on hardware. It’s time to expose the internals of our network hardware and understand the architecture inside a typical device. Many people are surprised to find that the CPUs, memory, storage and buses are similar to those in ordinary computers, while the forwarding engines are spectacularly different.
Thanks to our guests for working very hard to bring this show to you.
Why study hardware?
- Leads to better engineering & purchasing decisions (foresight and BS detection)
- Quicker troubleshooting – How can a busy egress line card lead to OSPF neighbors dropping on a lightly used line card?
- Nerd-tastic – Some of the tech is just really really cool. Maybe interesting to listeners.
High level types of box (four quadrants)
- Switches vs routers – back in the mists of time this was a strict technical delineation (L2 vs L3 forwarding), now more of a diffuse set of product expectations (rigidity/scope of forwarding features/scale vs price/density) that tends to mean switches are likely to have fixed packet processing silicon and routers likely to have more highly programmable packet processing silicon. (But with exceptions)
- Centralized vs distributed – normally the single biggest predictor of overall system complexity. A single pizza box is much simpler and cheaper, but active line cards around a switch fabric scale far higher.
Pizza Box – components:
- CPU – Routing protocols and FIB programming, mgmt & services, exception packets and frames (not too many, we hope), and system management.
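To make the CPU's role concrete, here is a minimal sketch of the split described above: the CPU programs the FIB, the forwarding path only does lookups, and exceptions are punted back to the CPU. The `Fib` class, the `forward` function, the port names, and the choice of TTL expiry as the exception case are all illustrative assumptions, not any vendor's design. Real forwarding silicon uses TCAM or trie structures rather than a linear scan.

```python
import ipaddress


class Fib:
    """Toy forwarding table mapping prefixes to egress ports (hypothetical)."""

    def __init__(self):
        self._routes = {}

    def program(self, prefix, port):
        # Control plane: the CPU installs a route chosen by the routing protocols.
        self._routes[ipaddress.ip_network(prefix)] = port

    def lookup(self, dst):
        # Forwarding path: longest-prefix match over the installed routes.
        addr = ipaddress.ip_address(dst)
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self._routes[best]


def forward(fib, dst, ttl):
    # Assumed exception case: an expiring TTL is handed to the CPU,
    # which would generate the ICMP reply.
    if ttl <= 1:
        return "punt-to-cpu"
    port = fib.lookup(dst)
    # Simplification: unroutable packets are dropped here.
    return port if port is not None else "drop"


fib = Fib()
fib.program("10.0.0.0/8", "eth1")
fib.program("10.1.0.0/16", "eth2")
print(forward(fib, "10.1.2.3", 64))   # more specific /16 wins
print(forward(fib, "10.1.2.3", 1))    # exception path
```

The point of the sketch is the division of labour: `program` runs rarely and can be slow, while `lookup` runs once per packet and is what the dedicated silicon accelerates.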
Packet Processing Engines:
General points –
- Limited budget in terms of transistor density (heat, cost, signal integrity) – tradeoffs:
- Specialise e.g. ASIC and you lose flexibility but gain much higher density vs FPGA
- Process (or die shrink) – e.g. 65nm, 40nm, and 28nm futures – basically how silicon follows Moore’s Law.
- Moore’s Law indirectly leads to greater industry conformity over time (everyone is producing fewer chips per year, because it takes ever longer to “paint” the increasing “transistor canvas”)
Lots of points along the spectrum of trading flexibility against performance/density for specific use cases
- CPU – fine for certain functions, but for many mainline packet forwarding functions will easily be 2–3 orders of magnitude worse than dedicated silicon in performance per watt or area. Industry getting better at using this effectively where it makes sense (eg DPDK)
- Fixed-function silicon (“ASIC”) – on the smallest pizza boxes just one chip, eg Broadcom Trident, Cisco Monticello, Fulcrum, Marvell, Mellanox. Fully featured but inflexible.
- Programmable silicon (“NPU”) – eg Juniper Trio, EZchip, Cisco nPower – trades raw density/throughput for greater flexibility. Run microcode. Many variants – sometimes these are arrays of parallel packet processing engines.
- FPGA – Xilinx and Altera – Powerful and flexible, but at a major dollar and power/heat cost. Complex to program. Programmed using hardware description languages such as Verilog or VHDL.
- Others – eg multicore CPU with hardware assist for crypto/DPI/etc – eg Cavium (Open Compute 40G L7 processor Engine)
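To get a feel for why a general-purpose CPU struggles with mainline forwarding, it helps to work out the per-packet time budget. The sketch below does the arithmetic for minimum-size Ethernet frames at line rate. The 10 Gb/s link and the single 3 GHz core are assumed figures chosen for illustration, and the function names are hypothetical.

```python
# Per-frame overhead on the wire: 7 bytes preamble + 1 byte start-of-frame
# delimiter + 12 bytes inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def packets_per_second(link_bps, frame_bytes):
    """Line-rate packet rate for a given frame size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def cycles_per_packet(link_bps, frame_bytes, cpu_hz):
    """Clock cycles one core can spend per packet if it must keep up."""
    return cpu_hz / packets_per_second(link_bps, frame_bytes)


pps = packets_per_second(10e9, 64)        # assumed 10 Gb/s link, 64-byte frames
budget = cycles_per_packet(10e9, 64, 3e9)  # assumed single 3 GHz core

print(f"{pps / 1e6:.2f} Mpps, {1e9 / pps:.1f} ns and {budget:.0f} cycles per packet")
```

Roughly 200 cycles per packet has to cover parsing, lookup, rewrite, and queueing, and a single memory access that misses cache can consume a large share of it. This arithmetic does not by itself demonstrate the 2–3 orders of magnitude gap, but it shows how tight the budget is for one port, before multiplying by the port count of a typical box.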