Since it has been out for more than a year, and has been developed and improved tremendously during that time, I decided to finally take the plunge and buy a year’s subscription to the Cisco VIRL software.
Part 1: Comparing and Tweaking VIRL
Until now, I have been using any combination of real hardware, CSR1000Vs, and IOL instances for studying and proof of concept testing.
My first impression of VIRL is that it is a BEAST of a VM with regards to CPU and RAM consumption. I installed it on my 16GB MacBook Pro first, and allocated 8GB to it. However, its use was very limited as I was unable to load more than a few nodes. I then moved it to my ESXi server, which is definitely more appropriate for this software in its current state.
I knew that the CSR1000Vs were fairly RAM hungry, but at the same time they are meant to be production routers, so that’s definitely a fair tradeoff for good performance. The IOSv nodes, while they do take up substantially less RAM, are still surprisingly resource intensive, especially with regards to CPU usage. I thought the IOSv nodes were going to be very similar to IOL nodes with regards to resource usage, but unfortunately, that is not yet the case.
I can run several tens of instances of IOL nodes on my MacBook Pro, and have all of them up and running in less than a minute, all in a VM with only 4GB of RAM. That is certainly not the case with IOSv. Even after getting the VIRL VM on ESXi tweaked, it still takes about two minutes for the IOSv instances to come up. Reloading (or doing a configure replace) on IOL takes seconds, whereas IOSv still takes about a minute or more. I know that in the grand scheme of things, a couple of minutes isn’t a big deal, especially if you compare it to reloading an actual physical router or switch, but it was still very surprising to me to see just how much of a performance and resource usage gap there is between IOL and IOSv.
Using all default settings, my experience of running VIRL on ESXi (after going through the lengthy install process) was better than on the MBP, but still not as good as I thought it should have been. The ESXi server I installed VIRL on has two Xeon E5520 CPUs, which are Nehalem chips that are each quad core with eight threads. The system also has 48GB of RAM. I have a few other VMs running that collectively use very little CPU during normal usage, and about 24 GB of RAM, leaving 24 GB for VIRL. I allocated 20GB to VIRL, and placed the VM on an SSD.
The largest share of CPU usage comes from booting the IOSv instances (and maybe the other node types as well). The issue is that upon every boot, a crypto process is run and the IOS image is verified. This pegs the CPU at 100% until the process completes. This is what contributes the most to the amount of time the IOSv node takes to finish booting, I believe. This may be improved quite a bit in newer generation CPUs.
When I first started, I assigned four cores to the VIRL VM. The IOSv instances would take 5-10 minutes to boot. Performing a configure replace took a minimum of five minutes. That was definitely unacceptable, especially when compared to the mere seconds of time it takes for IOL to do the same thing. I performed a few web searches and found some different things to try.
The first thing I did was increase the core count to eight. Since my server only has eight actual cores, I was a little hesitant to do this because of the other VMs I am running, but here is a case where I think HyperThreading may make a difference, since ESXi sees 16 logical cores. After setting the VM to eight cores, I noticed quite a big difference, and my other VMs did not appear to suffer from it. I then read another tweak about assigning proper affinity to the VM. Originally, the VM was presented with eight single-core CPUs. I then tried allocating it as a single eight-core CPU. The performance increased a little bit. I then allocated it properly as two quad-core CPUs (matching reality), and this was where I saw the biggest performance increase with regards to both boot time and overall responsiveness.
My ESXi server has eight cores running at 2.27 GHz each, and VMware sees an aggregate of 18.13 GHz. So, another tweak I performed was to set the VM CPU limit to 16 GHz, so that it could no longer take over the entire server. I also configured the memory so that it could not overcommit. It will not use more than the 20GB I have allocated to it. In the near future, I intend to upgrade my server from 48GB to 96GB, so that I can allocate 64GB to VIRL (it is going to be necessary when I start studying service provider topologies using XRv).
I should clarify and say that it still doesn’t run as well as I think it should, but it is definitely better after tweaking these settings. The Intel Xeon E5520 CPUs that are running in my server were released in the first quarter of 2009. That is seven years ago, as of this writing. A LOT of improvements have been baked into Xeon CPUs since that time, so I have no doubt that much of the slowness I experienced would be alleviated with newer-generation CPUs.
I read a comment that said passing the CCIE lab was easier than getting VIRL set up on ESXi. I assure you, that is not the case. The VIRL team has great documentation on the initial ESXi setup, and with regards to that, it worked as it should have without anything extra from their instructions. However, as this post demonstrates, extra tweaks are needed to tune VIRL to your system. It is not a point-and-click install, but you don’t need to study for hundreds of hours to pass the installation, either.
VIRL is quite complex and has a lot of different components. It is expected that complex software needs to be tuned to your environment, as there is no way for them to plan in advance a turnkey solution for all environments. Reading over past comments from others, VIRL has improved quite dramatically in the past year, and I expect it will continue to do so, which will most likely include both increased performance and ease of deployment.
Part 2: INE’s CCIE RSv5 Topology on VIRL
After getting VIRL set up and tweaked to my particular environment, my next step is to set up INE’s CCIE RSv5 topology, as this is what I will be using VIRL for the most, initially.
I was satisfied with using IOL, but I decided to give VIRL a try because it not only has the latest versions of IOS included, it has many other features that IOL in itself isn’t going to give you. For example, VIRL includes visualization and automatic configuration options, as well as other features like NX-OSv. I was particularly interested in NX-OSv since I have also been branching out into datacenter technologies lately, and my company will be migrating a portion of our network to the Nexus platform next year. At this point in time, NX-OSv is still quite limited, and doesn’t include many of the fancier features of the Nexus platform such as vPC, but it is still a good starting point to familiarize yourself with the NX-OS environment and how its basic operation compares to traditional Cisco IOS. Likewise, I intend to study service provider technologies, and it is nice to have XRv.
I configured the INE ATC topology of 10 IOSv routers connected to a single unmanaged switch node. I then added four IOSv-L2 nodes, with SW1 connecting to the unmanaged switch node, and then the remaining three L2 nodes interconnected to each other according to the INE diagram. The interface numbering scheme had to change, though. F0/23 – 24 became g1/0 – 1, f0/19 – 20 became g2/0 – 1, and f0/21 – 22 became g3/0 – 1.
I built this topology and used it as the baseline as I was testing and tweaking the VIRL VM, as described in Part 1. I was familiar with how the topology behaved in IOL, as well as with using CSR1000Vs and actual Catalyst 3560s, and that was my initial comparison. After getting things to an acceptable performance level (e.g. ready to be used for studying), I realized I needed a way to get the INE initial configurations into the routers, and I would prefer to not have to copy and paste the configs for each device for each lab every time I wanted to reload or change labs.
One of the issues I experienced with VIRL is that nothing is saved when nodes are powered off. If you stop the simulation completely, the next time you start it, everything is rebuilt from scratch. If you stop the node itself, and then restart it, all configurations and files are lost. There is a snapshot system built in to the web interface, but it is not very intuitive at this point in time. Likewise, you have the option of extracting the current running configurations when the nodes are stopped, but this does not include anything saved on the virtual flash disks. Some people prefer having a separate VIRL topology file for each separate configuration, but I find it to be more practical (and faster) to use the configure replace option within the existing topology to load the configurations.
Luckily, the filesystem on the VIRL host VM is not going to change between simulations, and all of the nodes have a built-in method of communicating with the host. This makes it an ideal place to store the configuration files. I went through and modified the initial configurations to match the connections in my VIRL topology. You can download the VIRL topology and matching INE configurations I assembled here. For the routers, this included replacing every instance of GigabitEthernet1 with GigabitEthernet0/1. The switch configs were a little more involved and required manual editing, but there’s not nearly as many switch configurations as there are router configurations. After getting the configuration files in order, I used SCP to copy the tar files to the VIRL VM using its external-facing (LAN) IP address. I placed the files into /home/virl/.
Originally, I added an L2-External-Flat node to match every router and switch in the topology so that each node could communicate with the VIRL host VM. However, someone pointed out to me that there was a much easier way to do this: click the background of the topology (in design mode), select the “Properties” pane, then change the “Management Network” setting to “Shared flat network” under the “Topology” leaf. This will set the GigabitEthernet0/0 interfaces to receive an IP address via DHCP in the 172.16.1.0/24 range, by default. This setting only applied to the router nodes when I tried it, so I still had to manually edit the configurations of the IOSv-L2 switch nodes.
For the four IOSv-L2 switch node instances, I used this configuration:
interface GigabitEthernet0/0 no switchport ip address dhcp negotiation auto
It is very important to note that when you convert the port to a routed port on IOSv-L2, you need to remove the media-type rj45 command. This does not need to be done on the IOSv routers, though. The IOSv nodes were configured as:
interface GigabitEthernet0/0 ip address dhcp duplex auto speed auto media-type rj45
My original intention was to modify the initial startup configurations of the nodes to automatically copy the files to their virtual flash drives upon each boot via TFTP, but the issue I ran into was that the interfaces would remain shutdown until the configuration was completed. So even though I was able to put the necessary commands into the config (prefixed with “do”), the commands wouldn’t work from that point because the interfaces were automatically shutdown. However, placing other EXEC-mode commands at the end of the configurations (before the end line), such as do term len 0, may save you some extra steps when you start labbing.
Originally, I was planning to just use SCP to copy the files from the VIRL VM host to the nodes, but there is no way to specify the password within the command – the password prompt is always separate, so it is unusable as part of the configuration.
This led me to configure the VIRL VM as a TFTP server to access the configuration files. I modified some of the information as detailed on this site and performed these steps on the VIRL VM:
sudo su apt-get update && apt-get install -y tftpd-hpa vi /etc/default/tftpd-hpa
Modify the file as follows:
TFTP_USERNAME="virl" TFTP_DIRECTORY="/home/virl" TFTP_ADDRESS="0.0.0.0:69" TFTP_OPTIONS="--secure --create"
And finally, restart the TFTP service to make the changes take effect:
service tftpd-hpa restart
However, I set up the TFTP server before discovering that placing the copy commands in the startup config was useless. So, setting up the VIRL VM as a TFTP server is an optional step; I just decided to stick with it because I’m used to using TFTP.
At this point, the configuration files are in placed on the VIRL host VM, and after booting the nodes in the topology, there is connectivity between the nodes and VIRL to copy the files.
If you perform a dir flash0: from any of the nodes, you will see that there is quite a bit of free space on that filesystem. However, I found out in attempting to copy files to it that it does not actually let you use that space. It would not let me copy all of the configuration files to it. Thankfully, the flash3: filesystem does.
Assuming you placed the tar files directly in /home/virl on the VM, the following commands will copy the configuration files to your node:
archive tar /xtract scp://firstname.lastname@example.org/R1.tar flash3:
The default password is all uppercase, VIRL
archive tar /xtract tftp://172.16.1.254/R1.tar flash3:
Replace “R1” with the actual device you’re copying the files to. With the default settings, the VIRL host VM will always be 172.16.1.254.
After all the files are copied and extracted to your devices, you can use the Terminal Multiplexer inside the VM Maestro interface to issue a command such as this to all of the devices simultaneously:
configure replace flash3:basic.eigrp.routing.cfg force
So far, I have not yet had great luck with the IOSv-L2 instances. They were released as part of VIRL not too long ago, and have been making improvements with time. However, for studying for the CCIE R&S at this point in time, I will probably stick with the four 3560s in my home lab and bridge them to the IOSv router nodes.
VIRL is a pretty complex software package with lots of individual components. I’ve only had the software for a few days, so I haven’t had time yet to do a really deep dive, and there are probably even better ways to do some of the things I’ve described here. I wish it could be as fast as IOL, but by comparison, that really is the only major disadvantage of using VIRL instead of IOL. There are so many other features that do make the software worth it, though, in my opinion.
In years past, CCIE R&S candidates were known to spend thousands of dollars on equipment for a home lab. We are lucky enough today that computing power has caught up to the point where that is no longer the case. If you’re in the same boat that I am currently in, where your employer is not paying for training, then VIRL is a pretty good investment in your career. But of course, like anything else, you’ll only get out of it what you put into it. It’s not some kind of magic pill that will instantly make you a Cisco god, but it definitely has awesome potential for the realm of studying and quick proof-of-concept testing.
I had the opportunity to try out VIRL on a newer system to see how it performed compared to my home ESXi server. The newer system has two Xeon E5-2620v2 CPUs (of which I assigned all twelve physical cores to VIRL). I also assigned 32GB of RAM. Storage is provided by an 8Gb FibreChannel SAN. The overall responsiveness and time to boot the nodes is improved, but not as much as I expected. However, one step that is markedly improved with the newer generation CPUs is the post-boot image verification. With this configuration, booting a topology of ten IOSv instances still takes a few minutes, but overall it does feel faster than with the older Nehalem CPUs in my home ESXi server.