The following piece originally appeared in Human Infrastructure Magazine, a twice-monthly Packet Pushers newsletter. Get a free subscription by becoming a Packet Pushers member.
Ten years of uptime is often a badge of honor for equipment. Ten years of never losing power. Ten years of software that didn’t crash. Wow! Engineers ponder the odds of such an event and nod with mildly raised eyebrows at the feat.
I’m here to argue that ten years of uptime isn’t such a good thing.* Let’s think this through.
Code stable enough to run for a decade without crashing is impressive. But it’s also at least a decade old. That means that software is almost certainly a security risk to the organization.
The fact that the device is up doesn’t mean that the system hasn’t been compromised. It just means that no one’s noticed.
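If you want to turn that suspicion into a to-do list, one rough approach is to treat long uptime as a warning sign rather than a trophy. The sketch below is only an illustration: the one-year threshold, the `flag_stale` helper, and the device names are all made up, and how you collect uptime (SNMP, an API, a spreadsheet) is left to you.

```python
from datetime import timedelta

# Hypothetical policy: anything up longer than this has likely missed patches.
MAX_UPTIME = timedelta(days=365)


def flag_stale(uptimes, max_uptime=MAX_UPTIME):
    """Return device names whose uptime exceeds max_uptime, longest first.

    `uptimes` maps a device name to its uptime as a timedelta.
    """
    stale = [(name, up) for name, up in uptimes.items() if up > max_uptime]
    stale.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in stale]


# Made-up inventory for illustration only.
inventory = {
    "core-rtr-01": timedelta(days=3650),
    "edge-fw-02": timedelta(days=40),
    "dist-sw-07": timedelta(days=900),
}

print(flag_stale(inventory))
```

A list like this doesn’t prove anything is compromised. It just tells you where to look first.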
Lack Of Progress
I like being a part of organizations that keep up with technology, invest in their future, and adapt to modern business practices.
A ten-year-old system that, presumably, is still in use (no guarantee there, I realize) might indicate a penny-pinching or regulatory mindset that would rather use technology far past its shelf life than keep up. (Hugs all around to the folks supporting IT in the medical field.)
Falling behind means an organization is also accumulating technical debt that becomes increasingly hard to pay off.
When Julie, the engineer who installed that system a decade ago, finally leaves the company, who’s going to work on it? Will the organization put an ad out, seeking an engineer who’s really good at old tech? Good luck with that.
A lack of technological progress is sometimes paired with ancient operational process. It could be that the decade-old system is there because the IT team finds it too hard to make the changes required to get rid of it.
A lack of change perpetuates the old system. Hmm. This could indicate a problem with the organization itself.
Perhaps an IT manager lives in a culture of fear, and won’t risk change for fear of failing because failure isn’t tolerated. Perhaps a systems engineer knows the system should be replaced, but has been told “no” so many times that bringing up the issue seems pointless.
Why not just leave the ancient router in the corner doing its thing? Sure, it can’t be covered by a support contract anymore, but hey, it’s run for a decade. Maybe it will run for a decade more. Tick, tick, tick…
A piece of gear that’s been up for ten years might mean that it’s such a critical part of the infrastructure that no one dares to take it down. Taking it down for maintenance would be too risky.
We’ve all heard the argument, “If we shut it down, we aren’t sure it will come back up,” as if the system is kept alive by the sheer rotational velocity of its spinning disks.
This attitude indicates a bad design that doesn’t tolerate downtime. Back in the day of uber-expensive, gold-plated infrastructure devices, compromising on availability was a requirement. Many inadequate designs are created because of budget constraints.
However, in modern IT design, broken infrastructure is assumed. While enterprises still have a penchant for buying gold-plated infrastructure, the idea on the rise is cheap, disposable equipment that’s obtained at high value for dollar and scaled out (or thrown out) as quickly as capacity demands.
Nothing should be gold-plated in IT infrastructure anymore. Buy it cheap, design it to tolerate failure, and replace it often.
*I know, I know. Ten years of uptime is still sort of cool. I know. I have that screenshot, too. Somewhere.