So there I was with a shiny new OpenStack Folsom install (via DevStack) on a single server, and everything seemed to be working, except I couldn’t reach the darn VMs I’d spawned! I had a vague idea of the network topology the installer had created (driven by the choice to use the new Quantum networking service over the built-in networking previously provided by the base Nova service), but I really had no idea how to go about troubleshooting this new virtual network. Off to Google I went to search for answers, and that began the journey down this particular rabbit hole…
Let me back up a bit… I’ve been hearing about this “blah blah Cloud” thing (as our host Greg likes to put it) for quite some time, and had a basic understanding of what it was and what it could be good for. I work for the American R&D lab of a large multinational computer company headquartered in Japan, and I knew some of the research folks here had been investigating various cloud-related technologies, and had even used Amazon’s AWS in various projects. Then (for a short period of time a few years ago) we had our own “internal cloud” – one of the research dept. sysadmins had constructed a multi-node AWS clone, based on the open-source Eucalyptus project. Unfortunately, use was light, and it was eventually abandoned since the hardware used was aging, and the aforementioned admin had moved on.
Fast forward to about a week ago. Another researcher came up to me and asked if her dept. could use the OpenStack implementation that we ran. “Huh?” I replied, “We don’t have any OpenStack implementation here.” (Perhaps she was thinking of the prior Eucalyptus setup.) It turned out that one of their research clients had asked what we had in the way of OpenStack, and they were scrambling to get a research implementation available. Having been interested in virtualization for some time, and having explored some of the underpinning technologies of OpenStack (namely KVM and Open vSwitch), I jumped at the chance to get involved. Luckily, they had a large server available that had been bought to run VMware but had never been installed. After a few hours of hardware reconfiguration and Linux installation (Ubuntu 12.04 in this case), we were ready to begin. Since it was only a single-server proof-of-concept, DevStack was the perfect fit, and it really couldn’t be easier:
stacker@dev01$ sudo apt-get install git -y
stacker@dev01$ git clone git://github.com/openstack-dev/devstack.git
stacker@dev01$ cd devstack
stacker@dev01$ vi localrc
stacker@dev01$ ./stack.sh
(command outputs omitted for brevity)
For the localrc directives, I followed the instructions in a blog post by the always-amazing Brent Salisbury (with the IP address ranges modified for my organization, of course), and some 257 seconds and a voluminous amount of screen output later, I had a running OpenStack installation. I promptly logged in, took a look around, and spun up a couple of VM instances. When I tried to ping them from the host running DevStack, however, my joy quickly faded.

The VMs were reported by both the OpenStack web UI (known as “Horizon”) and the CLI management client (known as “nova”) as having a private IP address assigned. But I could not reach them, even though it looked like the correct routing was in place (in my install, DevStack has a public IP range known as the “floating range”, out of which the “br-ex” Open vSwitch bridge has an interface address, and an RFC 1918 private range that the VMs get addressed from, with br-ex acting as the gateway to it according to the linux routing table) and the appropriate security rules were in place (the VMs are firewalled off from the public by default, and rules must be written to allow the desired access). This put me in a bit of a quandary, as the only way to log in to the VMs is by certificate-based SSH. No network access, no VM netstack investigation. I suspected that the VMs may not have gotten their assigned addresses correctly from DHCP, even though the OpenStack system was reporting their assignment. But was it a routing issue of some sort, a DHCP server problem, or a VM addressing problem? I figured I’d have to do some network sniffing, and that meant that I had to figure out the topology first.
First let’s look at the host machine’s routing table:
stacker@dev01:~/devstack$ netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         22.214.171.124  0.0.0.0         UG        0 0          0 eth0
126.96.36.199   0.0.0.0         255.255.255.0   U         0 0          0 eth0
188.8.131.52    0.0.0.0         255.255.255.240 U         0 0          0 br-ex
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
192.168.0.0     184.108.40.206  255.255.255.0   UG        0 0          0 br-ex
So I knew that the route to the private network that the VMs were on was through the br-ex interface (an Open vSwitch [OVS] bridge can be given an IP address, and it then appears in the linux routing table as an interface), but I also knew that OVS does not act as an L3 router itself. I further knew that the Quantum subsystem in OpenStack has the concept of routers, so there must be a router on the other side of the OVS bridge; I made the (correct) assumption that the “220.127.116.11” gateway address in the routing table above must be the Quantum router’s “external” interface. But what was the topology past that router?
I decided to take a look at the OVS “show” command, which lists all OVS switches and their ports. OVS also implements a “brcompat” module, which allows the use of the traditional linux brctl commands, showing both OVS and any traditional linux bridges. The results surprised me:
stacker@dev01:~$ sudo ovs-vsctl show
adee4e67-940e-44e5-a7fe-e74a41739d7d
    Bridge br-ex
        Port br-ex
            Interface br-ex
                type: internal
        Port "qg-5dd85641-81"
            Interface "qg-5dd85641-81"
                type: internal
    Bridge br-int
        Port "tap0b9f0940-2c"
            tag: 1
            Interface "tap0b9f0940-2c"
                type: internal
        Port "qvoa0bca375-a9"
            tag: 1
            Interface "qvoa0bca375-a9"
        Port "qr-fce20708-25"
            tag: 1
            Interface "qr-fce20708-25"
                type: internal
        Port "qvo10d38cf7-7e"
            tag: 1
            Interface "qvo10d38cf7-7e"
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "1.4.0+build0"
stacker@dev01:~$ brctl show
bridge name     bridge id               STP enabled     interfaces
br-ex           0000.8ae69a3f4a46       no              qg-5dd85641-81
br-int          0000.4a549f449943       no              qr-fce20708-25
                                                        qvo10d38cf7-7e
                                                        qvoa0bca375-a9
                                                        tap0b9f0940-2c
qbr10d38cf7-7e  8000.4effc18aaf4d       no              qvb10d38cf7-7e
                                                        vnet0
qbra0bca375-a9  8000.9ed1f7f181dc       no              qvba0bca375-a9
                                                        vnet1
What became immediately evident was that there were a number of interfaces, as well as two traditional linux bridges, all beginning with “q”, which I took to be an indicator that they were created and used by Quantum. After further investigation of my system, combined with Googling how Quantum implements networks, a number of concepts were revealed to me. The logical topology became clear first, followed by how it was actually implemented in linux.
There were two new concepts to me that I had to digest:
- Linux network namespaces
- “veth” pairs
It turns out that the router, as well as the DHCP server, is assigned per tenant in the Quantum system. Since different tenants may utilize the same IP space behind their routers, the router and the DHCP service for each network behind the router must run in separate namespaces (which are like VRFs on traditional routers) to prevent IP overlap clashes. The linux command to show these namespaces is ip netns list. You may also run the traditional linux routing / bridging commands scoped to a given namespace by using the ip netns exec <namespace-id> <cmd> syntax. Here’s the output from my system, which has only one router with one network behind it:
stacker@dev01:~$ ip netns list
qrouter-43ffd046-ca80-450c-b6aa-b3340a69ed3e
qdhcp-487dbcfc-62fb-4b87-ad8c-8fa4c015ac0a
stacker@dev01:~$ sudo ip netns exec qrouter-43ffd046-ca80-450c-b6aa-b3340a69ed3e netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         18.104.22.168   0.0.0.0         UG        0 0          0 qg-5dd85641-81
22.214.171.124  0.0.0.0         255.255.255.240 U         0 0          0 qg-5dd85641-81
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0          0 qr-fce20708-25
stacker@dev01:~$ sudo ip netns exec qrouter-43ffd046-ca80-450c-b6aa-b3340a69ed3e ifconfig | egrep '(encap|addr)' | grep -v inet6
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
qg-5dd85641-81 Link encap:Ethernet  HWaddr fa:16:3e:c3:9c:d9
          inet addr:126.96.36.199  Bcast:188.8.131.52  Mask:255.255.255.240
qr-fce20708-25 Link encap:Ethernet  HWaddr fa:16:3e:f2:e8:76
          inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
stacker@dev01:~$ sudo ip netns exec qdhcp-487dbcfc-62fb-4b87-ad8c-8fa4c015ac0a netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0          0 tap0b9f0940-2c
stacker@dev01:~$ sudo ip netns exec qdhcp-487dbcfc-62fb-4b87-ad8c-8fa4c015ac0a ifconfig | egrep '(encap|addr)' | grep -v inet6
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
tap0b9f0940-2c Link encap:Ethernet  HWaddr fa:16:3e:05:4b:d6
          inet addr:192.168.0.2  Bcast:192.168.0.255  Mask:255.255.255.0
The above shows the separate interfaces and routing tables that are private to the namespaces. Interestingly enough, interfaces from any namespace can be added to an OVS bridge, which is how the “qrouter” namespace’s interfaces attach to the bridges (qg-... on br-ex, qr-... on br-int) and tie them together. In the same way, the Quantum DHCP service is attached to the br-int bridge, and thus provides DHCP service to that IP space for the instantiated VMs. Also note in the OVS show output above that all the ports for this tenant on br-int are tagged into VLAN 1 (tag: 1); each tenant network would get a separate VLAN on the br-int switch, thus keeping their traffic separated on the switch.
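To see at a glance which ports share a VLAN tag, the ovs-vsctl show output can be run through a small filter. This is only a sketch: the function name ports_by_tag is made up for illustration, and it assumes the output layout of my OVS version (a Port line followed by an optional tag: line), which may differ on other releases.

```shell
#!/bin/sh
# Hypothetical helper (not part of OVS or Quantum): reads `ovs-vsctl show`
# output on stdin and prints "<vlan-tag> <port-name>" for every tagged port.
ports_by_tag() {
    awk '
        $1 == "Port" { port = $2; gsub(/"/, "", port) }  # remember the current port, minus quotes
        $1 == "tag:" { print $2, port }                  # a tag line belongs to the last port seen
    '
}

# Usage on a live system (needs root):
#   sudo ovs-vsctl show | ports_by_tag | sort -n
```

Untagged ports (such as the bridge's own internal port) are simply skipped, so on a multi-tenant box each tenant network's ports should group together under their own tag.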
The other thing that seemed odd to me was the separate linux bridge for each VM (the VM’s vNIC is represented in linux as the vnet# interface). Why wasn’t the vnet# interface attached directly to the OVS switch? Turns out that the linux bridge is a part of how the VM security rules (“nova secgroup”) are applied. (I don’t exactly know how this happens yet; it also seems to be a remnant of the prior networking code in OpenStack, which will eventually be added into Quantum, thus obviating the need for the bridge.) Anyway, to tie the linux bridge into the OVS bridge, a mechanism named a “veth pair” is employed; this is a sort of tunnel where packets sent in one interface just come out the associated interface. One interface (named qvo... as in “Quantum veth OVS side”) is added to the OVS bridge, and the other end (named qvb... as in “Quantum veth bridge side”) is added to the linux bridge, thus linking the two bridges together.
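To get a feel for the veth mechanism, here is roughly how one could wire a linux bridge to an OVS bridge by hand. This is a sketch under my own assumptions, not what Quantum literally runs: the helper name link_bridges and the "-demo" names are invented, the linux bridge is assumed to exist already, and the script only prints the commands unless RUN is set.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set RUN=sudo to really run them (assumes iproute2, bridge-utils and OVS are installed).
RUN="${RUN:-echo}"

# link_bridges <linux-bridge> <ovs-bridge> <suffix>
# Hypothetical helper: joins a linux bridge to an OVS bridge with a veth pair,
# borrowing the qvb/qvo naming convention. All names passed in are illustrative.
link_bridges() {
    lbr="$1"; ovsbr="$2"; sfx="$3"
    $RUN ip link add "qvo$sfx" type veth peer name "qvb$sfx"  # create both ends of the pair
    $RUN brctl addif "$lbr" "qvb$sfx"                         # bridge side goes on the linux bridge
    $RUN ovs-vsctl add-port "$ovsbr" "qvo$sfx"                # OVS side goes on the OVS bridge
    $RUN ip link set "qvb$sfx" up                             # bring up both ends
    $RUN ip link set "qvo$sfx" up
}

# Example (dry run): link_bridges qbr-demo br-int -demo
```

Keeping the default as a dry run means the commands can be read and sanity-checked before anything touches a working DevStack box.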
Having the pieces of the puzzle, and knowing what the various interfaces were and what they were used for, I quickly found the problem, which was that the DHCP service was not sending offers back to the booting VMs. Fortunately, since DevStack is so easy to set up (and knowing it was made to be turnkey), I wiped out the existing DevStack installation (removing the conf files that it had put in various places as well), carefully rewrote the localrc file, and re-ran the git clone... / ./stack.sh commands. This time, the DHCP service worked. (I believe my first localrc file included some config statements that were meant for a multi-node installation, which I guess screwed up the DHCP service.) It had cost me a lot of time, but I’m kinda glad I had the original problem, and learned how Quantum is (currently) implemented in Linux. Please be reminded that my installation is about the simplest possible installation (using the new Quantum network system), and that a “real” OpenStack installation (especially one with multiple tenants) would be much more “interesting”.
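For anyone chasing a similar DHCP problem, the sniffing step can be done from inside the DHCP namespace. The snippet below is a sketch that assumes tcpdump is installed (I haven't said above which sniffer I used, so treat the tool choice as an assumption); the helper name sniff_dhcp is made up, and by default it only prints the command rather than running it.

```shell
#!/bin/sh
# Dry-run by default: the command is printed, not executed.
# Set RUN=sudo to really capture (assumes tcpdump and iproute2 are installed).
RUN="${RUN:-echo}"

# sniff_dhcp <namespace> <interface>
# Hypothetical helper: watches DHCP traffic (UDP ports 67/68) on an interface
# that lives inside a network namespace.
sniff_dhcp() {
    $RUN ip netns exec "$1" tcpdump -n -e -i "$2" port 67 or port 68
}

# Example, using the namespace and tap interface from my system:
#   RUN=sudo; sniff_dhcp qdhcp-487dbcfc-62fb-4b87-ad8c-8fa4c015ac0a tap0b9f0940-2c
```

While a VM boots, requests arriving on the tap interface with no replies going back would point at the DHCP service rather than at routing or the VM itself.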
In closing, I’d really like to thank Brent Salisbury for not only blogging about this stuff, but also jumping on IRC at the end of a long day and trying to help me out with troubleshooting. Also, this Slideshare deck from a fellow named Etsuji Nakai was instrumental in helping me understand the Linux internals, and the naming conventions.