Sizing workloads for your data center can be tricky because you often don’t know how big the environment needs to be until after you’ve deployed it into production. (Unless you have a positronic brain, which we certainly don’t have.) This situation can lead to many face palms.
To help you stop abusing your face, the Datanauts are dedicating the latest episode of the Silo Series to discussing strategies and tactics for resource allocation in virtualized environments.
We’ll dig into issues around memory and CPU allocation and offer tips for how to meet performance requirements without wasting resources. We’ll explore why metrics and telemetry are important for resource allocation, and why you want multiple data sources.
And we’ll look at how to handle big, beefy applications, and speculate about where we’re going with containers and unikernels.
This episode of Datanauts is brought to you by ITProTV. Enhance your technology aptitude. ITProTV is the resource to keep your IT skills up to date, with engaging and informative video tutorials. For a free 7-day trial and 30% off the life of your account, go to itpro.tv/datanauts and use the code DATANAUTS30.
The Datanauts are sponsored by Altaro Software, developers of VM backup software trusted by over 30,000 SMBs. If you need an easy-to-use and affordable Hyper-V and VMware backup solution, try Altaro VM Backup free for 30 days. Visit go.altaro.com/datanauts/; throughout the month of June, Datanauts listeners will get a free Altaro t-shirt. Plus, after the 30-day trial you can back up 2 VMs for free, forever!
Part 1 – Allocation of Resources
- Ethan’s questions
- How much memory to allocate?
- How many CPU cores?
- How much memory is dedicated?
- How much can be oversubscribed? (See the sketch after this list.)
- How can I guarantee a specific performance level?
- But I don’t want to give TOO much to the VM and waste resources…
- Vendors list specific requirements, but are they too greedy?
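
To make the oversubscription question above concrete, here's a minimal sketch in Python. The host specs and ratio ceilings are invented for illustration; tune them to your own comfort level and workload mix.

```python
# Rough oversubscription check for a cluster. All numbers are hypothetical.

def oversub_report(pcores, pram_gb, vms):
    """vms: list of (vcpus, vram_gb) tuples, one per VM on the cluster."""
    vcpus = sum(v for v, _ in vms)
    vram = sum(r for _, r in vms)
    cpu_ratio = vcpus / pcores
    ram_ratio = vram / pram_gb
    print(f"vCPU:pCPU = {cpu_ratio:.1f}:1, vRAM:pRAM = {ram_ratio:.2f}:1")
    # Common (but workload-dependent) rules of thumb: keep CPU
    # oversubscription modest, and don't oversubscribe RAM at all.
    if cpu_ratio > 4:
        print("warning: CPU heavily oversubscribed")
    if ram_ratio > 1:
        print("warning: RAM oversubscribed; expect ballooning/swapping")

oversub_report(pcores=64, pram_gb=512,
               vms=[(4, 16)] * 30 + [(8, 64)] * 4)
```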
- Tooling is important
- Capturing metrics is important to gain a picture of the data center
- We talked about data center telemetry in the Snap episode. Virtualized workloads are no different – you need data on how they are performing
- It’s wise to avoid relying on a single source of truth, such as vCenter, for all metrics.
- Be a hardass
- Start with unrealistic resource commitments.
- Train people to convince you that they NEED more.
- Be a gatekeeper of the data center and its resources.
- It’s nigh impossible to get resources back
- Use allocations, not hardware details, as the contract for tenants / business units / app teams: "you have X cores and Y GB of RAM," not "you have 5 hosts with these specs to consume." (A quota sketch follows below.)
- Abstract, pool, automate.
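
Picking up the "allocations, not hardware" point, here's a minimal sketch of quota-style admission, where tenants see only their allocation and never the hosts behind it. The tenant name and numbers are invented.

```python
# Hypothetical tenant quotas: tenants get cores and RAM, not hosts.
# New requests are admitted against the remaining allocation.

quotas = {"team-web": {"cores": 32, "ram_gb": 128}}
used = {"team-web": {"cores": 24, "ram_gb": 96}}

def admit(tenant, cores, ram_gb):
    q, u = quotas[tenant], used[tenant]
    if u["cores"] + cores > q["cores"] or u["ram_gb"] + ram_gb > q["ram_gb"]:
        return False  # over quota: convince the gatekeeper you NEED more
    u["cores"] += cores
    u["ram_gb"] += ram_gb
    return True

print(admit("team-web", 4, 16))  # True: fits within the allocation
print(admit("team-web", 8, 32))  # False: quota exhausted
```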
Part 2 – Bigger, Beefier Apps (Advanced)
- Moving data around is often the critical element when designing for a demanding application
- This could be the disk subsystem or the network fabric, depending on how storage has been provisioned
- You must meet peak demands for this app and all others within the cluster.
- Having dynamic provisioning of resources is extremely helpful
- Performance issues often boil down to how an application talks to its disk(s)
- Queues are everywhere.
- Queue depth is the number of I/O requests (SCSI commands) that can be queued at one time on a storage controller. Each I/O request from the host’s initiator HBA to the storage controller’s target adapter consumes a queue entry.
- Think of it like a credit card – you can spend up to the limit, but eventually you have to pay off the balance.
- Queue depth has a direct correlation to IOPS
- By Little's Law, outstanding I/Os ÷ average latency = IOPS (worked example below)
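
A quick worked example of that relationship, with invented numbers:

```python
# Little's Law for storage: IOPS = outstanding I/Os / average latency.
outstanding_io = 32    # I/O requests in flight (queue depth in use)
latency_s = 0.001      # 1 ms average latency per command
print(outstanding_io / latency_s)  # 32000.0 IOPS

# The same 32 outstanding I/Os at 4 ms latency yield only 8000 IOPS,
# which is why latency spikes crater throughput.
```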
- Storage I/O Control (SIOC) can help maintain fairness for a single datastore shared across multiple hosts; it does nothing across datastores. It works by throttling the device queue, using observed latency as the trigger.
- Command latency is easy to view for virtualized workloads
- GAVG/cmd, KAVG/cmd, and DAVG/cmd (Guest, Kernel, and Device)
- There are also views per world (VM), per adapter, and per storage device. These layers all add up (triage sketch below).
- Guest OS > VMM > vSCSI layer > ESXi Storage Stack > Driver > HBA > Fabric > Storage Array
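
In esxtop, GAVG is approximately KAVG + DAVG: guest-observed latency is kernel time plus device time. A minimal triage sketch along those lines; the threshold values are illustrative, not gospel.

```python
# Rough triage of esxtop latency counters (values in ms, invented).
# GAVG ~= KAVG + DAVG, so comparing the two tells you where latency lives.

def triage(gavg_ms, kavg_ms, davg_ms):
    if abs(gavg_ms - (kavg_ms + davg_ms)) > 1:
        print("counters don't add up; re-sample")
    if kavg_ms > 2:
        print("high KAVG: queuing inside ESXi (queue depth? SIOC throttling?)")
    if davg_ms > 20:
        print("high DAVG: slow device, fabric, or array path")

triage(gavg_ms=25.0, kavg_ms=1.0, davg_ms=24.0)  # points at the array, not the host
```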
- The type of I/O pattern matters
- reads, writes, both
- sequential, random, both.
- Block size (4k, 8k, 64k)
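
Block size matters because throughput is IOPS times block size. A toy calculation (the 20,000 IOPS figure is invented, and real devices rarely hold IOPS constant across block sizes):

```python
# Throughput = IOPS * block size. Same 20k IOPS, three block sizes:
for bs_kb in (4, 8, 64):
    print(f"{bs_kb}k blocks: {20_000 * bs_kb / 1024:.0f} MB/s")
# 4k -> 78 MB/s, 8k -> 156 MB/s, 64k -> 1250 MB/s. Always test with
# the read/write mix and block size your app actually generates.
```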
- The other big gotcha is NUMA (non-uniform memory access)
- Non-uniform meaning memory isn't used uniformly across the host; a workload should use memory local to the CPU running its execution threads.
- Only a concern on multi-socket SMP (symmetric multiprocessing) hosts.
- When memory lands on a remote node, the CPU has to fetch it across an interconnect, which adds latency.
- Locality is the measure of how much of a workload's RAM is local versus remote.
- This plays into sizing; a sketch follows below.
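
One common sizing move is to keep a VM's vCPUs and memory within a single NUMA node where possible. A minimal sketch of that check; the per-node figures are invented.

```python
# Does this VM fit inside one NUMA node? Hypothetical host:
# 2 sockets, 16 cores and 256 GB of RAM per node.
NODE_CORES, NODE_RAM_GB = 16, 256

def fits_one_node(vcpus, vram_gb):
    # True: the scheduler can keep CPU and RAM local.
    # False: a "wide" VM that spans nodes and pays for remote memory access.
    return vcpus <= NODE_CORES and vram_gb <= NODE_RAM_GB

print(fits_one_node(8, 128))   # True
print(fits_one_node(24, 384))  # False: size down or accept the locality hit
```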
Part 3 – Future Thoughts
- Today, we hit “state of the art” – virtual machines, probably in a VMware environment.
- Big, fat VMs. Full operating systems with application services layered on top.
- Tomorrow, containers will be the norm. And maybe unikernels.
- These have much smaller resource requirements per container vs. per VM.
- But there is still a sharing of resources and therefore resource contention to deal with.
- The orchestrated way to deal with this seems to be monitoring telemetry for exceeded thresholds on specific microservices, then spinning up more containers of that process…which doesn't seem very nuanced at this point. A bit of a blunt hammer (sketched below).
- But if x86 is cheap, do we care? A brave new tomorrow.
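
For the curious, that "blunt hammer" scaler amounts to something like this sketch. The telemetry and scaling calls are placeholders, not any real orchestrator's API.

```python
# Naive threshold-based scaling: over the CPU threshold, add a replica;
# well under it, remove one. No nuance about *why* the service is hot.
import time

CPU_HIGH, CPU_LOW = 0.80, 0.20

def autoscale_loop(get_cpu, get_replicas, set_replicas, service):
    while True:
        cpu = get_cpu(service)            # placeholder telemetry source
        n = get_replicas(service)
        if cpu > CPU_HIGH:
            set_replicas(service, n + 1)  # blunt hammer: more containers
        elif cpu < CPU_LOW and n > 1:
            set_replicas(service, n - 1)
        time.sleep(30)
```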