This post is the second in a series intended to familiarize users with MidoNet’s overlay virtual networking approach and models. In part 1 we discussed MN’s Provider Router. In this article we discuss Tenant Routers and Bridges.
A Tenant (or Project in OpenStack’s terminology) is an organizational unit that shares ownership of a set of virtual devices. For example, in MidoNet a single Tenant may own a set of virtual routers, virtual bridges and rules/chains; similarly, in Neutron a Project may own a set of routers, networks, subnets and security groups.
In OpenStack/Neutron, typically each Project owns a single Router and one or more Networks. One possible Tenant workflow is:
- Create a Neutron Router.
- Set the Router’s gateway – this is an External Network, and an IP address in one of that network’s prefixes.
- (Implicit and automatic, unfortunately.) An IP from the FloatingIP range (e.g. 220.127.116.11) is allocated and port-masquerading is set up for traffic traversing the Router’s uplink.
- Create a Neutron Network and name it. Create a Subnet and associate it with that Network. This specifies the IP address range for that network (e.g. 10.10.0.0/24), the gateway address (e.g. 10.10.0.254) and some DHCP options. Multiple Subnets are allowed; both IPv4 and IPv6 ranges are allowed.
- Add an interface to the Router (neutron router-interface-add CLI command) on the Subnet(s) – this connects the Router to the Subnet, and assigns it the specified gateway address. A single port will be created on the Subnet’s Network. If the router has an interface on multiple Subnets of the same Network, the same port will be re-used.
- Launch VM instances. For each instance, specify the number of vNICs, and for each vNIC which Network it should be attached to. Neutron will automatically create one port per vNIC on the appropriate Network. For each Network port created, Neutron generates one MAC address and chooses one IP address from each Subnet range. Continuing the example above, assume IP 10.10.0.1 is chosen. The MAC and IP addresses are stored in the Neutron DB, typically MySQL. Only then will the Nova scheduler choose a compute host to spin up the instance, and the Nova agent local to that host will create the VM with the appropriate number of vNICs.
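The address bookkeeping in the last two steps can be sketched in a few lines of Python. This is only an illustrative model, not Neutron’s code: the `Subnet` and `Network` classes, the `create_port` helper and the "lowest free address" policy are assumptions made for the example. The `fa:16:3e` MAC prefix is Neutron’s default base MAC.

```python
import ipaddress
import random
from dataclasses import dataclass, field


@dataclass
class Subnet:
    """Hypothetical stand-in for a Neutron Subnet."""
    cidr: str
    gateway_ip: str
    allocated: set = field(default_factory=set)

    def allocate_ip(self) -> str:
        """Return the lowest free host address, skipping the gateway (assumed policy)."""
        for host in ipaddress.ip_network(self.cidr).hosts():
            ip = str(host)
            if ip != self.gateway_ip and ip not in self.allocated:
                self.allocated.add(ip)
                return ip
        raise RuntimeError("subnet exhausted")


@dataclass
class Network:
    """Hypothetical stand-in for a Neutron Network."""
    name: str
    subnets: list = field(default_factory=list)
    ports: list = field(default_factory=list)

    def create_port(self, device: str) -> dict:
        """Create one port: one generated MAC, one IP from each Subnet."""
        suffix = ":".join(f"{random.randrange(256):02x}" for _ in range(3))
        port = {
            "device": device,
            "mac": "fa:16:3e:" + suffix,
            "ips": [subnet.allocate_ip() for subnet in self.subnets],
        }
        self.ports.append(port)
        return port


# The example from the workflow: range 10.10.0.0/24, gateway 10.10.0.254.
net = Network("tenant-net", [Subnet("10.10.0.0/24", "10.10.0.254")])
vm_port = net.create_port("vm-1/vnic0")
print(vm_port["mac"], vm_port["ips"])
```

With these assumptions the first VM port receives 10.10.0.1, matching the example above, and the gateway address stays reserved for the Router’s interface.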
Here’s what happens in MidoNet’s low-level models for each of those workflow steps:
- A MidoNet virtual router is created and stored in ZooKeeper. MN virtual routers are completely distributed and simulated at the agent/software switch at flow computation/installation time.
- A virtual port, P1, is created on the tenant virtual router to serve as an uplink; a port P2 is created on the Provider Router. P1 and P2 are linked, and the virtual router’s routing table gets a default route to the Provider Router via that link.
- Neutron doesn’t have explicit IPAM, just default behavior. The Tenant is meant to be a private domain. Therefore, at this step the Provider Router has no route via that link. Outside traffic will not be forwarded to the Tenant’s router until a FloatingIP is allocated to the Tenant.
- In MidoNet’s terminology P1 and P2 are interior virtual ports. Interior virtual ports exist entirely within the overlay network and don’t map to any physical device/port. In contrast exterior virtual ports are considered to be at the edge of the overlay and connect the overlay to a VM instance or to external L2 or L3 networks. Exterior ports must be associated with network interfaces (physical or logical) on physical hosts where MN Agents are installed.
- When the port-masquerading IP is allocated from the FloatingIP range (220.127.116.11 in this example), the Provider Router gets a /32 route to that FloatingIP via the link to the Tenant’s router. The Tenant Router’s Post-routing Chain gets a rule that matches packets egressing the uplink with a private source address and applies SNAT: the source IP is translated to the FloatingIP, and the source L4 port is translated to a dynamically chosen value in the privileged or ephemeral port range, according to whether the original source port was privileged or not. The Tenant Router’s Pre-routing Chain gets a rule that matches packets ingressing the uplink with a destination IP equal to the FloatingIP and reverses the SNAT by looking up the translation in the forward flow’s state.
- MidoNet’s port-masquerading is entirely distributed and is decided flow-by-flow by the MN Agent local to the flow. It does not require forwarding the packets through an L3 namespace or router appliance.
- MidoNet’s Chains and Rules will be described in detail in a subsequent post.
- FloatingIPs used in the normal way (statically mapped to a single VM/private IP) result in static NAT rules in the Tenant router’s Pre-routing and Post-routing Chains. This will be described in a subsequent post.
- When the Neutron Network is created, a corresponding MidoNet virtual bridge is created. When a Subnet is created, a corresponding MidoNet DHCPSubnet object is created. Any information related to the Subnet is stored in the DHCPSubnet object in ZooKeeper.
- When the Router is connected to the Network/Subnets, a port P3 is created on the corresponding MN virtual bridge to serve as the bridge’s uplink. A port P4 is created on the Router with IP address/prefix equal to the gateway IP specified in the Subnet (10.10.0.254 in the example above); a 10.10.0.254/32 Local route and a route to the prefix 10.10.0.0/24, both via P4, are added to the Tenant Router’s routing table. The /32 route allows the router to recognize traffic to P4 that arrives via a different port. The virtual bridge gets two static entries: in the mac-table, the router P4 port’s MAC must map to bridge port P3; in the ARP table (the bridge answers ARPs when it can), the router P4 port’s IP must map to its MAC.
- When a Neutron port is created for a VM, MidoNet creates a corresponding exterior port, let’s call it P5, on the appropriate virtual bridge. MidoNet stores the MAC and IP addresses selected by Neutron in the DHCPSubnet object associated with that bridge. When the Nova agent, let’s say on Compute Host5, launches a VM instance (typically via libvirt and KVM), it creates software interfaces (taps) for each of the VM’s vNICs and then invokes a Python hook that enables Neutron plugin- or driver-specific callbacks. Let’s assume tap123 was created for P5. MidoNet’s hook code makes a call to the MidoNet API to tell it that “tap123 on Host5 is bound to P5”. The MN API stores this information in Apache ZooKeeper in a directory specific to Host5. The MN Agent on Host5 is watching that directory and realizes that it needs to plug tap123 into its datapath. The MN Agent therefore makes a netlink call to the OVS datapath to insert tap123 as a netdev device, and tap123 appears as port #10 in the datapath.
- When the VM finishes booting it will issue a DHCP message of type Discover. The packet will miss in the datapath and will be kicked up to the MN Agent in userspace (just as the OVS kmod kicks missed packets up to ovs-vswitchd). The MN Agent realizes that the packet came from tap123 and therefore from P5 in the overlay topology. The Agent checks whether it can generate the DHCP reply (an Offer message) by looking for the Discover’s source MAC in the Bridge’s DHCPSubnet. In this case it will find the MAC-IP mapping and therefore generates a DHCP Offer with the appropriate 10.10.0.1 IP address and any additional options (default routes, non-default routes, DNS servers) specified in the Subnet. The same applies to the DHCP Request and Acknowledge that will soon follow.
- Note that the DHCP responses are generated by the MN Agent local to the VM’s host. This is a common theme in MidoNet: we try to do as much work as possible at the edge. The MN Agent is aware of the overlay topology model, which is why we refer to this approach as a Topology-Aware Switch.
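To make the Tenant Router’s state after these steps more concrete, here is a minimal Python sketch of three decisions described above: the route lookup, the choice of a translated source port for port-masquerading, and the DHCP Offer lookup. None of this is MidoNet’s code. The function names, the data layout, the example MAC, and the split of ports into 1 to 1023 (privileged) and 1024 to 65535 (everything else) are assumptions made for illustration. The port names P1 and P4 follow the text.

```python
import ipaddress
import random

# Tenant Router routes after the steps above.
TENANT_ROUTES = [
    ("0.0.0.0/0", "P1"),          # default route over the uplink to the Provider Router
    ("10.10.0.0/24", "P4"),       # the Subnet's prefix
    ("10.10.0.254/32", "local"),  # the Router's own gateway address
]

# Hypothetical view of a DHCPSubnet: MAC-to-IP mappings plus options.
DHCP_SUBNET = {
    "gateway": "10.10.0.254",
    "hosts": {"fa:16:3e:00:00:01": "10.10.0.1"},  # example MAC, assumed
}


def lookup(routes, dst):
    """Longest-prefix match: the most specific matching route wins."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def snat_port(src_port, in_use):
    """Pick a free translated port in the same class as the original port."""
    low, high = (1, 1023) if src_port < 1024 else (1024, 65535)
    free = [p for p in range(low, high + 1) if p not in in_use]
    if not free:
        raise RuntimeError("no free port in range")
    return random.choice(free)


def dhcp_offer(src_mac):
    """Build an Offer only if the Discover's source MAC is known."""
    ip = DHCP_SUBNET["hosts"].get(src_mac)
    if ip is None:
        return None
    return {"your_ip": ip, "router": DHCP_SUBNET["gateway"]}


print(lookup(TENANT_ROUTES, "10.10.0.1"))    # VM address: via P4
print(lookup(TENANT_ROUTES, "10.10.0.254"))  # gateway address: local
print(dhcp_offer("fa:16:3e:00:00:01"))
```

Because each of these is a pure lookup over state the Agent already holds, the Agent local to the flow can evaluate all of them without forwarding the packet to a central appliance.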
Readers familiar with Neutron will have noticed that I omitted Security Groups. Part 3 of this series will discuss Security Groups as well as Floating IPs (not the port-masquerading kind).