One of networking’s great joys is that of version selection. What version of software do you run on the devices you are responsible for? The choice is generally a trade-off between two things: features and stability.
By “features”, we mean more nifty things that the software can do. For example, as you monitor the progress of IOS commands that are VRF-aware, you’ll notice that the newer the IOS version, the more likely you are to be able to source traffic like NTP, SNMP, TACACS, etc. from a specific VRF. Older versions are hit or miss.
By “stability”, we mean how unlikely the software is to crash or traceback (a fatal processing exception that potentially results in an endless variety of business-impacting horrors), experience a CPU hog (a process that pegs the CPU at 100% utilization, starving other processes), leak memory (where memory is used but never given back to the available memory pool, eventually causing a device crash or other major issue), or be vulnerable to a severe security flaw (a remotely exploitable problem that could cause confidential information leakage or a denial of service).
Newer versions tend to offer more features, but might be less stable. Older versions tend to have fewer features, but offer more stability. Why? Because the best quality assurance testing a vendor has is YOU. While every vendor runs a QA process before releasing software to their customers, no vendor seems able to replicate the broad swath of real-world scenarios the software will actually operate within. Your network environment is different from mine, which is different from everyone else’s, which is most assuredly different from a QA lab. The only way all the bugs are going to be revealed is when the software is released. That’s why, philosophically, I try to avoid any software version ending in “.0”. Dot zero is a secret code from the vendor to you that means, “Yeah, we QA’ed it, but your risk is above average.” I prefer to wait until at least “.1”, as my interpretation of dot one is that a bunch of poor guys using the dot zero release will be able to close their support cases once they upgrade.
Sometimes, it’s just not reasonable to stay with an older version of software. You need that newer, cutting-edge version, because there’s a feature in there that your business requires. After all, that tends to be what drives the move to newer software versions: they contain solutions to business problems, or they keep up with technology trends that businesses wish to leverage. Or, put cynically, they solve problems created by previous feature releases, perpetuating the endless upgrade cycle. (Why yes – yes I do feel like a hamster on a wheel.) 😉
One way to gauge your risk on a new version is a process I think of as the bug scrub. In the Cisco world, there are two pieces to this:
- One piece is to read the release notes for your software version carefully. Release notes very often contain sections titled “fixed issues” (the stuff that was broken in previous releases that the vendor has resolved) and “known issues” (the stuff that the vendor knows is still broken). Not reading the release notes is really an abdication of responsibility. You simply must read them – what’s in there could save your bacon, and saving bacon is important. If there were a “Save the Bacon Coalition”, I’d join it, and even chair the local chapter. And so would you, as not all software upgrades are created equal. While many Cisco upgrades are as routine as an “upload and reload”, many others are significant paradigm shifts in the way the software works, requiring procedural and/or configuration changes on your device. For instance, if you recently updated from an earlier Cisco ASA 8.x release to the 8.3 or 8.4 code families without reading the release notes (or the special migration guide), you’re probably unsure if that was a bus or a tractor trailer that hit you. Release notes also clue you in to which issues you should or should not care about. Not all “known issues” will be relevant to your networking environment, as you probably do not use all of the features any given software offers. For example, why would you care about an obscure MPLS bug, if you aren’t an MPLS shop?
- A second piece is searching the Cisco Bug Toolkit for known issues that haven’t yet made it into the release notes, presumably because they were discovered by customers after the software was officially released. This is an admittedly tedious process, and I further admit that I’m a bit mystified by some of the results the Bug Toolkit search engine shows me. It took me a solid hour of reading and reviewing bugs before I finally found a match on an ASA problem I had yesterday, in part because no matter how much I tried to pre-screen and qualify my search query, I kept getting results back that weren’t obviously connected to it. That said, searching Bug Toolkit is far better than it used to be “back in the day”.
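
If you like to keep your bug scrub notes somewhere other than your head, the relevance filter described above (ignore the obscure MPLS bug if you aren't an MPLS shop) can be sketched in a few lines of Python. Treat this as a rough illustration only: the `KnownIssue` record, the `relevant_issues` helper, the bug IDs, and the feature tags are all made up for the example. Nothing here comes from Cisco's release notes or from Bug Toolkit, and it assumes you have already transcribed the known issues by hand and tagged each one with the features it touches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownIssue:
    """One hand-transcribed entry from a release note's known issues list."""
    bug_id: str          # placeholder ID, not a real vendor bug ID
    summary: str
    features: frozenset  # feature tags you assign while reading the notes


def relevant_issues(issues, features_in_use):
    """Return the issues that touch at least one feature you actually run."""
    in_use = {f.lower() for f in features_in_use}
    return [i for i in issues if in_use & {f.lower() for f in i.features}]


# Hypothetical entries, invented purely for illustration.
KNOWN_ISSUES = [
    KnownIssue("BUG-0001", "Label mismatch after route flap", frozenset({"MPLS"})),
    KnownIssue("BUG-0002", "NTP ignores source VRF", frozenset({"NTP", "VRF"})),
    KnownIssue("BUG-0003", "Memory leak in SNMP polling", frozenset({"SNMP"})),
]

if __name__ == "__main__":
    # This shop runs no MPLS, so BUG-0001 drops out of the scrub.
    for issue in relevant_issues(KNOWN_ISSUES, {"ntp", "snmp", "tacacs"}):
        print(issue.bug_id, "-", issue.summary)
```

The tagging is the real work, and it is still a human reading release notes; the filter only keeps you from re-reading the bugs that can't bite you.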
