OpenStack Quantum Network Implementation in Linux

So there I was, with a shiny new OpenStack Folsom install (via DevStack) on a single server, and everything seemed to be working — except I couldn’t reach the darn VMs I’d spawned! I had a vague idea of the network topology the installer had created (driven by the choice to use the new Quantum networking service over the prior built-in networking provided by the base Nova service), but really had no idea how to go about troubleshooting this new virtual network that had been instantiated. Off to Google I went to search for answers, and that began the journey down this particular rabbit hole…

Let me back up a bit… I’ve been hearing about this “blah blah Cloud” thing (as our host Greg likes to put it) for quite some time, and had a basic understanding of what it was and what it could be good for. I work for the American R&D lab of a large multinational computer company headquartered in Japan, and I knew some of the research folks here had been investigating various cloud-related technologies, and had even used Amazon’s AWS in various projects. Then, for a short period of time a few years ago, we had our own “internal cloud” – one of the research dept. sysadmins had constructed a multi-node AWS clone based on the open-source Eucalyptus project. Unfortunately, usage was light, and it was eventually abandoned since the hardware it ran on was aging and the aforementioned admin had moved on.

Fast forward to about a week ago. Another researcher came up to me and asked if her dept. could use the OpenStack implementation that we ran. “Huh?” I replied, “We don’t have any OpenStack implementation here.” (Perhaps she was thinking of the prior Eucalyptus setup.) It turns out that one of their research clients had asked them what we had in that area, and they were scrambling to get a research implementation available. Having been interested in virtualization for some time, and having explored some of the underpinning technologies of OpenStack (namely KVM and Open vSwitch), I jumped at the chance to get involved. Luckily enough, they had a large server available that had been bought to run VMware but had never been put into service. So after a few hours of hardware reconfiguration and Linux installation (Ubuntu 12.04 in this case), we were ready to begin. Since it was only a single-server proof-of-concept, DevStack was the perfect fit, and it really couldn’t be easier —

stacker@dev01$ sudo apt-get install git -y
stacker@dev01$ git clone git://github.com/openstack-dev/devstack.git
stacker@dev01$ cd devstack
stacker@dev01$ vi localrc
stacker@dev01$ ./stack.sh

(command outputs omitted for brevity)
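
The whole install is driven by the localrc file. As a rough sketch, a minimal single-node, Quantum-enabled localrc for Folsom-era DevStack would look something like the following; the addresses and passwords shown are placeholders rather than my actual values, so adjust them for your own environment:

# swap out the legacy nova-network service for the Quantum services
disable_service n-net
enable_service q-svc
enable_service q-agt
enable_service q-dhcp
enable_service q-l3
enable_service q-meta
enable_service quantum

# addressing (placeholders): the host NIC address, the "public" floating
# range routed via br-ex, and the private range the VMs are addressed from
HOST_IP=111.22.100.100
FLOATING_RANGE=111.22.100.128/28
FIXED_RANGE=192.168.0.0/24
NETWORK_GATEWAY=192.168.0.1

# passwords/tokens so stack.sh runs non-interactively
ADMIN_PASSWORD=changeme
MYSQL_PASSWORD=changeme
RABBIT_PASSWORD=changeme
SERVICE_PASSWORD=changeme
SERVICE_TOKEN=changeme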

For the specific localrc directives, I followed the instructions included in a blog post by the always-amazing Brent Salisbury (with the IP address ranges modified for my organization, of course), and some 257 seconds and a voluminous amount of screen output later, I had a running OpenStack installation. So I promptly logged in, took a look around, and spun up a couple of VM instances.

Upon trying to ping them from the host running DevStack, however, my joy quickly faded. The VMs were being reported by both the OpenStack web UI (known as “Horizon”) and the CLI management client (known as “nova”) as having a private IP address assigned. But even though it looked like the correct routing was in place (in my install, DevStack has a public IP range known as the “floating range”, from which the “br-ex” Open vSwitch bridge takes an interface address, and an RFC1918 private range that the VMs get addressed from, with br-ex acting as the gateway to it according to the Linux routing table) and the appropriate security rules were in place (the VMs are firewalled off from the public by default, and rules must be written to allow desired access), I could not reach the VMs. This put me in a bit of a quandary, as the only way to log in to the VMs is by certificate-based SSH: no network access, no VM netstack investigation. I suspected that the VMs may not have gotten their assigned addresses correctly from DHCP, even though the OpenStack system was reporting their assignment. But was it a routing issue of some sort, a DHCP server problem, or a VM addressing problem? I figured I’d have to do some network sniffing, and that meant I had to figure out the topology first.

First let’s look at the host machine’s routing table:

stacker@dev01:~/devstack$ netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         111.22.100.254  0.0.0.0         UG        0 0          0 eth0 
111.22.100.0    0.0.0.0         255.255.255.0   U         0 0          0 eth0 
111.22.100.128  0.0.0.0         255.255.255.240 U         0 0          0 br-ex
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
192.168.0.0     111.22.100.130  255.255.255.0   UG        0 0          0 br-ex

So I knew that the route to the private network that the VMs were on was through the br-ex interface (Open vSwitch [OVS] can be set up with an IP address and then appears in the Linux routing table as an interface), but I also knew that OVS does not act as an L3 router itself. I further knew that the Quantum subsystem in OpenStack has the concept of routers, so there must be a router on the other side of the OVS bridge; I made the (correct) assumption that the “111.22.100.130” gateway address in the routing table above must be the Quantum router’s “external” interface. But what was the topology past that router?
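
(In hindsight, the quantum CLI client that DevStack installs could have answered part of that question directly. Assuming the DevStack credentials have been sourced, commands along these lines should list the Quantum router and the ports — with their addresses — attached to it; substitute the name or ID that router-list reports.)

stacker@dev01:~/devstack$ source openrc admin admin
stacker@dev01:~/devstack$ quantum router-list
stacker@dev01:~/devstack$ quantum router-port-list <router-name-or-id>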

I decided to take a look at the OVS “show” command, which lists all OVS switches and their ports. OVS also implements a “brcompat” module, which allows the use of the traditional Linux brctl commands, showing both OVS and any traditional Linux bridges. The results surprised me:

stacker@dev01:~$ sudo ovs-vsctl show
adee4e67-940e-44e5-a7fe-e74a41739d7d
    Bridge br-ex
        Port br-ex
            Interface br-ex
                type: internal
        Port "qg-5dd85641-81"
            Interface "qg-5dd85641-81"
                type: internal
    Bridge br-int
        Port "tap0b9f0940-2c"
            tag: 1
            Interface "tap0b9f0940-2c"
                type: internal
        Port "qvoa0bca375-a9"
            tag: 1
            Interface "qvoa0bca375-a9"
        Port "qr-fce20708-25"
            tag: 1
            Interface "qr-fce20708-25"
                type: internal
        Port "qvo10d38cf7-7e"
            tag: 1
            Interface "qvo10d38cf7-7e"
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "1.4.0+build0"

stacker@dev01:~$ brctl show
bridge name     bridge id               STP enabled     interfaces
br-ex           0000.8ae69a3f4a46       no              qg-5dd85641-81
br-int          0000.4a549f449943       no              qr-fce20708-25
                                                        qvo10d38cf7-7e
                                                        qvoa0bca375-a9
                                                        tap0b9f0940-2c
qbr10d38cf7-7e  8000.4effc18aaf4d       no              qvb10d38cf7-7e
                                                        vnet0
qbra0bca375-a9  8000.9ed1f7f181dc       no              qvba0bca375-a9
                                                        vnet1

What became immediately evident was that there were a number of interfaces, as well as two traditional Linux bridges, all beginning with “q”, which I took to be an indicator that they were created and used by Quantum. On further investigation of my system, combined with Googling how Quantum implements networks, a number of concepts were revealed to me, and the logical topology became clear, along with how it was actually implemented in Linux.


There were two concepts new to me that I had to digest:

  1. Linux network namespaces
  2. “veth” pairs

It turns out that the router, as well as the DHCP server, is assigned per tenant in the Quantum system. Since different tenants may utilize the same IP space behind their routers, the router and the DHCP service for each network behind the router must be run in separate namespaces (which are like VRFs on traditional routers) to prevent IP overlap clashes. The Linux command to show these namespaces is ip netns list. You may also run the traditional Linux routing / bridging commands scoped to such a namespace by using the ip netns exec <namespace-id> <cmd> syntax. Here’s the output from my system, which has only one router with one network behind it:

stacker@dev01:~$ ip netns list
qrouter-43ffd046-ca80-450c-b6aa-b3340a69ed3e
qdhcp-487dbcfc-62fb-4b87-ad8c-8fa4c015ac0a

stacker@dev01:~$ sudo ip netns exec qrouter-43ffd046-ca80-450c-b6aa-b3340a69ed3e \
netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         111.22.100.129  0.0.0.0         UG        0 0          0 qg-5dd85641-81
111.22.100.128  0.0.0.0         255.255.255.240 U         0 0          0 qg-5dd85641-81
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0          0 qr-fce20708-25

stacker@dev01:~$ sudo ip netns exec qrouter-43ffd046-ca80-450c-b6aa-b3340a69ed3e \
ifconfig | egrep '(encap|addr)' | grep -v inet6
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
qg-5dd85641-81 Link encap:Ethernet  HWaddr fa:16:3e:c3:9c:d9
          inet addr:111.22.100.130  Bcast:111.22.100.143  Mask:255.255.255.240
qr-fce20708-25 Link encap:Ethernet  HWaddr fa:16:3e:f2:e8:76
          inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0

stacker@dev01:~$ sudo ip netns exec qdhcp-487dbcfc-62fb-4b87-ad8c-8fa4c015ac0a \
netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0          0 tap0b9f0940-2c

stacker@dev01:~$ sudo ip netns exec qdhcp-487dbcfc-62fb-4b87-ad8c-8fa4c015ac0a \
ifconfig | egrep '(encap|addr)' | grep -v inet6
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
tap0b9f0940-2c Link encap:Ethernet  HWaddr fa:16:3e:05:4b:d6
          inet addr:192.168.0.2  Bcast:192.168.0.255  Mask:255.255.255.0

The above shows the separate interfaces and routing tables that are private to the namespaces. Interestingly enough, interfaces from any namespace can be added to an OVS bridge, which is how the “qrouter” interfaces attach to br-ex and br-int and tie them together. In the same way, the Quantum DHCP service is attached to the br-int bridge, and thus provides DHCP service to that IP space for the instantiated VMs. Also note in the OVS show output above that all the ports for this tenant on br-int are tagged into VLAN 1 (tag: 1); each tenant network would get a separate VLAN on the br-int switch, thus keeping its traffic separated on the switch.
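
If you want to verify this on your own system, the OVS database records the VLAN tag on each port, and the ip tooling shows that the router’s interfaces live only inside their namespace. Something like the following should do it (the port and namespace names are the ones from my outputs above):

stacker@dev01:~$ sudo ovs-vsctl get Port qvoa0bca375-a9 tag
stacker@dev01:~$ ip link show qr-fce20708-25
stacker@dev01:~$ sudo ip netns exec qrouter-43ffd046-ca80-450c-b6aa-b3340a69ed3e \
ip link show qr-fce20708-25

The first command should report the tag (1 in my case); the second should fail in the default namespace, while the third succeeds inside the qrouter namespace.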

The other thing that seemed odd to me was the separate Linux bridge for each VM (the VM’s vNIC is represented in Linux as the vnet# interface). Why wasn’t the vnet# interface attached directly to the OVS switch? It turns out that the Linux bridge is part of how the VM security rules (“nova secgroup”) are applied (I don’t know exactly how this happens yet; it also seems to be a remnant of the prior networking code in OpenStack which will eventually be folded into Quantum, thus obviating the need for the bridge). Anyway, to tie the Linux bridge into the OVS bridge, a mechanism named a “veth pair” is employed; this is a sort of tunnel where packets sent into one interface simply come out the associated peer interface. One interface (named qvo…, as in “Quantum veth, OVS side”) is added to the OVS bridge, and the other end (named qvb…, as in “Quantum veth, bridge side”) is added to the Linux bridge, thus linking the two bridges together.
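
To make the veth-pair mechanism concrete, here is a small hand-rolled sketch, entirely separate from anything Quantum created; the bridge and interface names are made up for illustration, and the pair is attached to br-int only to mirror the pattern:

stacker@dev01:~$ sudo brctl addbr qbrtest                               # a throwaway Linux bridge
stacker@dev01:~$ sudo ip link add qvbtest type veth peer name qvotest   # create the veth pair
stacker@dev01:~$ sudo brctl addif qbrtest qvbtest                       # one end into the Linux bridge
stacker@dev01:~$ sudo ovs-vsctl add-port br-int qvotest                 # the other end into the OVS bridge
stacker@dev01:~$ sudo ip link set qvbtest up
stacker@dev01:~$ sudo ip link set qvotest up

Anything sent into qvbtest pops out of qvotest (and vice versa), which is exactly how Quantum stitches each per-VM Linux bridge into br-int. (Clean up afterwards with ovs-vsctl del-port, ip link del, and brctl delbr.)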

Having the pieces of the puzzle, and knowing what the various interfaces were and what they were used for, I quickly found the problem: the DHCP service was not sending offers back to the booting VMs. Fortunately, since DevStack is so easy to set up (and knowing it was made to be turnkey), I wiped out the existing DevStack installation (removing the conf files that it had put in various places as well), carefully rewrote the localrc file, re-ran the git clone... / ./stack.sh commands, and this time the DHCP service worked. (I believe I had originally included some config statements in my first localrc file that were meant for a multi-node setup, which I guess screwed up the DHCP service.) It had cost me a lot of time, but I’m kinda glad I had the original problem, as I learned how Quantum is (currently) implemented in Linux. Please be reminded that my installation is about the simplest possible one (using the new Quantum network system), and that a “real” OpenStack installation (especially one with multiple tenants) would be much more “interesting”.
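
For anyone else chasing a similar symptom, the namespaces make the sniffing part easy. A command along these lines, run against the DHCP agent’s tap interface inside the qdhcp namespace, will show whether DHCP requests are arriving from the VMs and whether offers are going back out:

stacker@dev01:~$ sudo ip netns exec qdhcp-487dbcfc-62fb-4b87-ad8c-8fa4c015ac0a \
tcpdump -n -i tap0b9f0940-2c port 67 or port 68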

In closing, I’d really like to thank Brent Salisbury for not only blogging about this stuff, but also for jumping on IRC at the end of a long day and trying to help me out with troubleshooting. Also, this Slideshare deck from a fellow named Etsuji Nakai was instrumental in helping me understand the Linux internals and the naming conventions.

Will Dennis

Will Dennis has been a systems and network administrator since 1989, and is currently the Network Administrator for NEC Laboratories America, located in Princeton NJ. He enjoys the constant learning it takes to keep up with the field of network and systems administration, and is currently pursuing the Cisco CCNP-R/S certification. He can be found on the Twitters as @willarddennis, and on Google Plus.
  • Steven Vacaroaia

    Could you please post your localrc (without any confidential data, of course)?
    I am having the same issues as you.

  • GK

    Good insight into the various interfaces that get piled up as and when you launch new VMs. Appreciate you sharing your learning with others.

  • Anahita

    Perfect…

  • TishaKramer

    I appreciate how you took the time to explain the network layers and components. Thanks!

  • marvs

    Thanks for the insight.

    I had the same problem and google brought me here :)

    I understood what went on under the hood but couldn’t fix the problem; after some research I figured it out.

    Here’s my localrc (for a single-node setup with a single NIC, with Quantum enabled):

    disable_service n-net
    enable_service q-svc
    enable_service q-agt
    enable_service q-dhcp
    enable_service q-l3
    enable_service q-meta
    enable_service quantum

    MULTI_HOST=False
    HOST_IP=Your_NIC_IP_for_example_10.2.2.2
    FIXED_RANGE=private_network_range_for_example_50.50.50.0/24
    EXT_GW_IP=YOUR_gateway_for_example_10.2.2.1
    FLOATING_RANGE=floating_range_for_example_60.60.60.0/24
    NETWORK_GATEWAY=Gateway_for_VMs_50.50.50.1

  • xie wei

    Hi will, thanks for the great post and useful reference slides.

    I encountered a similar problem when deploying with DevStack: the VM couldn’t bring up its eth0 interface. In the end I found the reason was DevStack not bringing up the tap interface.

    I can boot the VM with networking now, and I can access the VM from the qrouter namespace, but the problem is I can’t access the VM from the external network. The routing entry to the VM private network is there, like 10.0.0.0/24 via 172.24.4.226 (the qg-XXX interface address) dev br-ex. I can see the packets on the qg-XXX interface in the qrouter namespace, but no packets on the qr-XXX interface. Any suggestions?

    thanks.

  • adu

    I have exactly the same setup, but I am not able to ping the VM from the outside network. Like in your network, eth0, br-ex, and the ‘qgXXX’ port have 3 IP addresses from the external subnet. But when I ping from an outside host in the same subnet, it cannot reach the ‘qgXXX’ port. I can see the ARP request reaching eth0, but since the ‘qg’ port is behind eth0, the local LAN switch never receives the response. Strangely, eth0 and br-ex (100.100 and 100.129) in your diagram can be pinged. Not sure if they are the same, but they seem to be! Any idea how the ‘qg’ port can access the outside network?
    Thanks for the detailed explanation of the internals!
