Recently I saw a question posed at Network Field Day: why can LTE providers deliver bandwidth to millions of users and have it “just work,” while delivering bandwidth on the enterprise LAN is still problematic?
It’s a great question and one that is not asked often enough when comparing the way ISPs build networks to that of a typical enterprise.
While there is no single answer, there are lessons to be learned by highlighting the differences in the way network engineers design, operate, and solve problems in the two environments. I’ve worked as both a telco engineer and a large-enterprise engineer, and I am now a consulting network architect for ISPs and enterprises globally – which drives me to always compare and contrast these two worlds. This article begins with design complexity, and is the first in a multi-part series to highlight some of the differences and lessons learned between ISP and enterprise network engineering.
Design: Simplicity Vs. Complexity
The Enterprise Way:
Design simplicity as a stated goal hamstrings many enterprises.
Business requirements are rarely simple, and the larger the organization is, the more complex and intertwined they become. Throw a merger or two and more than one continent into the mix, and your Cisco-validated design just went out the window.
Enterprises often lean heavily on their vendors when designing because of a culture of blame shifting. It’s easy for IT leadership to say “Well if the vendor can’t get it right, why should we be expected to?” This leads to an obsessive focus on design simplicity for a few reasons.
First, it’s easier for the vendor, because pre-sales engineers prefer to default to the latest design template. While this is great for the vendor, the template often addresses only the tip of the iceberg of the organization’s IT needs.
Second, IT leadership feels obligated to understand how a design works instead of why it’s being deployed and what problems it solves. If the design feels too complex to the people who have to sign off on it and fund it, they might push for a simpler design because simplicity is good, right?
Third, organizations demand design simplicity to ensure that the network engineering team can digest and manage the end result.
While this is a valid argument, it takes the network to a dangerous place. Now, instead of using features, protocols, and design elements that are intended to solve certain problems, you’ve artificially restricted the network, so it fails to execute.
Those pesky business problems just won’t go away, so the simple network gets a litany of bolt-ons and band aids that eventually result in *gasp* complexity, and not the good kind of complexity either.
The Service Provider Way:
Enterprise networks exist only to help the business sell its product or service. By contrast, a service provider’s network is the product, so a provider puts a great deal of effort into making it perform as fast and as reliably as possible. This isn’t simple.
Service providers embrace a design approach I call “good” complexity. As compared to “bad” complexity, where the network suffers from one-off solutions to problems, good complexity embraces the idea that any amount of complexity is justified and necessary if it is shown to be stable, scalable and repeatable while solving the required problems.
This is why ISP network designs are covered in MPLS, VRFs, MP-BGP, QinQ, automation, and orchestration: because they actually solve very real problems and work together in harmony (most of the time). Even more to the point, you have to be able to immediately solve problems in an elegant way for new and unexpected use cases.
Service provider design leans primarily on the experience and guidance of the network engineers and architects that plan and operate the network.
This isn’t to say that vendors don’t have a role in the design process, but vendors have to be prepared to defend whatever best-practice designs they bring to the table and guarantee that those designs will be interoperable.
Multi-vendor versus single vendor plays a role here as well because enterprises tend to be more single vendor for route/switch, whereas ISPs are almost always multi-vendor. This means that only the network engineering team really understands how everything fits together. A single vendor can’t (and usually won’t) do it for you.
Designs and solutions in the enterprise are often tailored to the experience level of the team that has to support them, which results in a preference for simpler designs. The reverse is true in the service provider world: when learning how to build and operate ISP networks, you adapt or die, regardless of how complex the network is.
And to a certain degree, it has to be that way: some ISPs support critical life-saving services like 911 (112 in Europe) as well as hospitals and police departments, so the consequences of network downtime go well beyond lost revenue.
The level of complexity is not tailored to the sum experience of the engineers but rather to the requirements for the services provided.
Because of these factors, service providers tend to push engineering teams to learn the network inside and out and be able to troubleshoot it effectively at every layer. However, most providers focus on layers 1-4, whereas the enterprise spends the vast majority of its time on layers 4-7.
This focus on applications and layers 4-7 within the enterprise drives the split between engineering skill sets. The net result is that, in general, ISP network engineers end up with a strong grasp of the physical layer as well as route/switch, not because they are any smarter than enterprise engineers, but because it’s part of daily operations. Enterprise engineers typically become consumed with areas of IT other than network fundamentals on a daily basis.
Based on this early foundation and daily focus on advanced route/switch even with more junior engineers, it’s typically easier as a team to deploy complex services like MPLS, MP-BGP, VRF, Metro Ethernet, etc. In the end, all of this creates a culture where ISP engineering shops are more likely and even encouraged to consider and test technologies for a potential design based on the merits of the end result – even if this means deploying something nobody has tried before.
1. Don’t Be Afraid To Try New Things
Until a design is finalized, there are no wrong answers. Before I became a consultant, I worked as the team lead and architect for a large global enterprise and one of the design lessons I always tried to impart to the engineering team was to solve the problem at hand with the technologies that are available, regardless of whether or not it’s considered an “enterprise” solution.
Too often, those of us in the enterprise get consumed with whether a technology is suitable for the enterprise based on complexity, vendor guidelines, and documentation, not to mention the fear of something new. We take somebody else’s word for it instead of doing our own evaluation and testing. This reliance on the vendor to tell us what is acceptable for use in the data center is a large part of why enterprise networks struggle to grow and implement real solutions instead of band aids.
This is one of the key areas where enterprises can take a page from the ISP playbook: get in the habit of testing and validating potential solutions with a less biased approach, one that asks whether the protocol solves the business problem and can be deployed in a stable and repeatable fashion.
One of the best examples of this is the use of BGP inside the data center. Ten years ago, you’d be laughed out of the room for suggesting it, but now it’s fairly common. BGP hasn’t really changed; rather, a few daring thinkers started pushing the boundaries of what was acceptable and then published use cases for BGP in the data center, and it has become increasingly common to find BGP inside the enterprise data center. The same applies to technologies like VRFs and MPLS, which have found their way into forward-thinking enterprises.
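To make the BGP example a little more concrete, here is a minimal Python sketch of the kind of planning a lab exercise might start with. It is not a design from this article: it assumes one commonly published pattern for a two-tier leaf/spine fabric, in which all spines share a single private ASN, every leaf gets its own, and each leaf runs an eBGP session to each spine. The function name `plan_fabric` and the device names are hypothetical and exist only for illustration.

```python
from itertools import product

# 16-bit private ASN range (64512-65534), used here so the lab plan
# never collides with public AS numbers.
PRIVATE_ASN_FIRST = 64512
PRIVATE_ASN_LAST = 65534


def plan_fabric(spines, leaves, first_asn=PRIVATE_ASN_FIRST):
    """Assign ASNs and list eBGP sessions for a two-tier leaf/spine fabric.

    Assumption (illustrative only): all spines share one ASN and each
    leaf gets its own, so every leaf-to-spine session is eBGP.
    """
    needed = 1 + len(leaves)  # one shared spine ASN plus one per leaf
    last_needed = first_asn + needed - 1
    if first_asn < PRIVATE_ASN_FIRST or last_needed > PRIVATE_ASN_LAST:
        raise ValueError("not enough private ASNs for this fabric")

    # Spines share the first ASN; leaves take the ASNs that follow it.
    asn = {spine: first_asn for spine in spines}
    for offset, leaf in enumerate(leaves, start=1):
        asn[leaf] = first_asn + offset

    # Full mesh between tiers: every leaf peers with every spine.
    sessions = [
        {
            "leaf": leaf,
            "spine": spine,
            "leaf_asn": asn[leaf],
            "spine_asn": asn[spine],
        }
        for leaf, spine in product(leaves, spines)
    ]
    return asn, sessions


if __name__ == "__main__":
    asn_map, peerings = plan_fabric(
        ["spine1", "spine2"], ["leaf1", "leaf2", "leaf3"]
    )
    for p in peerings:
        print(
            f"{p['leaf']} (AS{p['leaf_asn']}) <-> "
            f"{p['spine']} (AS{p['spine_asn']})"
        )
```

A plan like this is only a starting point for a lab. The value comes from building it, breaking it, and checking whether it solves the business problem in a stable and repeatable way.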
2. Fight For Your Right To Cable, Route & Switch
When you are knee deep in enterprise woes like app troubleshooting, project overload, and politics with a consistent 40+ hour week, it’s easy to let the core skill sets (physical layer, routing and switching) degrade.
I’ve always felt this was the single most important area of networking for engineers to get right and continue professional development on, because none of the other stuff like servers, security, voice/video works if routing/switching and the physical layer aren’t working well. Going even further, it’s hard to digest unfamiliar network protocols or technologies without a strong L1-L3 foundation, and this gap affects design decisions.
When I first crossed over from ISP into the enterprise, I thought I was going to go insane because all my time was spent on things unrelated to building the network. As I adjusted and got comfortable with the way enterprise IT worked, I began to notice skill fade in L1-L3 and made a plan to stop it.
First, I spent some time every week in GNS3 testing new technologies I was interested in – often when stuck on an hours-long conference call.
Second, I advocated for a physical lab so that equipment could be pushed to the limits and broken in a safe and creative environment. It sounds like a no-brainer, but many enterprises have no lab other than VMs to test equipment and provide engineers with experience in seeing things fail.
Last, you have to push yourself to continue professional development in L1-L3. This can be a very specific goal like the CCIE R&S, or it can be more piecemeal: seeking out new content online and applying it. Part-time consulting is a great way to achieve this and earn some extra money as well.
These observations really just scratch the surface of the vast difference in the way enterprises and ISPs think about and design networks. And it’s not completely one-sided either – there is a wealth of information in application, virtualization, storage and security networking that ISP engineers can learn from their enterprise counterparts and use to design better ISP networks.
Hopefully this is the beginning of more discussion between the two realms, as we have a lot to learn from each other while we continue the daily grind of connecting the world’s people and businesses.