When researching data center network architectures, you will run into the term “scale out” and, rather less frequently, “scale up.” What do these terms mean?
I’m going to discuss these terms in a networking sense. If you search, you’ll find that applications and storage also have concepts of scaling out vs. up. Those other areas use the terms in similar ways, although the specifics are different.
Scaling out = adding more components in parallel to spread out a load. Scaling up = making a component bigger or faster so that it can handle more load.
Scaling means growing an infrastructure (compute, storage, networking) so that the applications riding on that infrastructure can serve more people at a time. When architects talk about the ability of an infrastructure design to scale, they mean that the design as conceived can grow larger over time without a wholesale replacement. The terms “scale up” and “scale out” refer to the way in which the infrastructure is grown.
Scaling up is taking what you’ve got, and replacing it with something more powerful. From a networking perspective, this could be taking a 1GbE switch, and replacing it with a 10GbE switch. Same number of switchports, but the bandwidth has been scaled up via bigger pipes. The 1GbE bottleneck has been relieved by the 10GbE replacement.
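To put rough numbers on that swap, here is a minimal sketch. The 48-port count is my own assumption for illustration (the example above only says the port count stays the same), and the `Switch` class is hypothetical, not any vendor's model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Switch:
    """Hypothetical switch model: a port count and a per-port speed."""
    ports: int
    port_speed_gbps: int

    def total_bandwidth_gbps(self) -> int:
        # Aggregate access bandwidth if every port runs at line rate.
        return self.ports * self.port_speed_gbps


# Assumed 48-port switches; only the port speed changes when scaling up.
old_switch = Switch(ports=48, port_speed_gbps=1)
new_switch = Switch(ports=48, port_speed_gbps=10)

print(old_switch.total_bandwidth_gbps())  # 48
print(new_switch.total_bandwidth_gbps())  # 480
```

Same box count, same port count, ten times the pipe. That is scaling up in a nutshell.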
Scaling up is a viable scaling solution until it is impossible to scale up individual components any larger. For example, 10GbE is a practical limit for uplinking hosts to the network until such time as 25GbE and higher ports are readily available on hosts. In that context, what happens when 10GbE is no longer enough bandwidth for the uplinked host? Rather than scaling up, you scale out.
For a visual reference of scaling up, picture in your mind something growing physically larger. It’s “The Blob” (sort of). The metaphor breaks down a bit, but you get the idea.
Scaling out takes the infrastructure you’ve got, and replicates it to work in parallel. This has the effect of increasing infrastructure capacity roughly linearly. Data centers often scale out using pods. Build a compute pod, spin up applications to use it, then scale out by building another pod to add capacity. Actual application performance may not scale linearly, though, because the application itself must be written to spread its work effectively across a scale-out environment.
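Here is a rough sketch of why capacity and application performance can diverge. To model the application side I'm borrowing Amdahl's law, which is my choice of illustration and not the only way to think about it. The 40 servers per pod and the 90% parallelizable share are assumed numbers.

```python
def pod_capacity(pods: int, servers_per_pod: int) -> int:
    """Raw infrastructure capacity grows linearly with the number of pods."""
    return pods * servers_per_pod


def app_speedup(pods: int, parallel_fraction: float) -> float:
    """Amdahl's law: only the parallelizable share of the work benefits from more pods."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / pods)


# Assumed values: 40 servers per pod, 90% of the application's work parallelizes.
for pods in (1, 2, 4, 8):
    servers = pod_capacity(pods, servers_per_pod=40)
    speedup = app_speedup(pods, parallel_fraction=0.9)
    print(f"{pods} pod(s): {servers} servers, app speedup ~{speedup:.2f}x")
```

Under these assumptions, four pods give four times the servers but only about a 3.1x speedup. The gap is the part of the application that was not written to scale out.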
Application delivery controllers (A10 Networks, F5 Networks) are examples of networking tools that help with scaling out. ADCs host a virtual IP that is the front end to pool members (real servers) on the back end. As the demand for an application grows, the application can be scaled out by adding additional pool members behind the virtual IP.
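A toy model of that virtual IP and pool relationship might look like the following. It assumes simple round-robin selection, which is only one of several methods real ADCs offer. The `VirtualServer` class and the addresses are hypothetical and do not represent any vendor's API.

```python
class VirtualServer:
    """Toy model of an ADC virtual IP fronting a pool of real servers."""

    def __init__(self, vip: str, members: list[str]) -> None:
        self.vip = vip
        self.members = list(members)
        self._next = 0  # running count of requests handed out

    def add_member(self, member: str) -> None:
        # Scaling out: clients keep using the same VIP while the pool grows.
        self.members.append(member)

    def pick(self) -> str:
        # Round-robin (an assumed method): rotate through the current pool.
        member = self.members[self._next % len(self.members)]
        self._next += 1
        return member


vs = VirtualServer("192.0.2.10", ["10.0.0.11", "10.0.0.12"])
before = [vs.pick() for _ in range(4)]

vs.add_member("10.0.0.13")  # demand grew, so add a pool member
after = [vs.pick() for _ in range(3)]

print(before)
print(after)
```

The point is that clients only ever see the virtual IP. The pool behind it can grow without anyone on the front end noticing.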
Leaf-spine network architectures are also “scale out” designs. As each new pod or rack is installed, its top-of-rack leaf switches are plumbed into the spine layer, adding capacity.
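As a back-of-the-envelope sketch, assume each leaf has one uplink to every spine, which is the usual leaf-spine wiring. The four spines and 40GbE uplinks below are numbers I made up for illustration.

```python
def fabric_uplink_gbps(leaves: int, spines: int, uplink_gbps: int) -> int:
    """Total leaf-to-spine bandwidth when every leaf has one uplink to every spine."""
    return leaves * spines * uplink_gbps


# Assumed fabric: 4 spines, 40GbE uplinks.
SPINES = 4
UPLINK_GBPS = 40

two_racks = fabric_uplink_gbps(leaves=2, spines=SPINES, uplink_gbps=UPLINK_GBPS)
three_racks = fabric_uplink_gbps(leaves=3, spines=SPINES, uplink_gbps=UPLINK_GBPS)

print(two_racks)                # 320
print(three_racks)              # 480
print(three_racks - two_racks)  # 160 added by the new leaf
```

Each new leaf brings its own uplinks with it, so the fabric grows in step with the racks instead of requiring a bigger switch in the middle.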
For a visual reference of scaling out, picture in your mind something copying itself. Like…oh, I don’t know…Tribbles! Yeah. That’s it.