Physical server provisioning with OpenStack*

This article explores the internal details of provisioning a physical machine using OpenStack*. Steps for setting up OpenStack are included, and no special hardware is required to begin use. If you already use OpenStack, follow these instructions using the hardware you currently have. If you are new to OpenStack, you will need a commodity access switch and two physical servers with ports connected to a switch that is on the same VLAN or broadcast domain.

Intel's DevStack patch

Intel has submitted patches that enable physical server provisioning through DevStack. Before these patches, baremetal provisioning on DevStack could only be simulated, with VMs standing in for physical machines. With Intel's patch, users can now easily configure DevStack to provision physical servers through Ironic.

OpenStack makes it possible to have VMs simulate physical machines with the Ironic pxe_ssh driver. The pxe_ssh driver manages nodes by deploying the operating system using PXE boot and controlling the power status using virsh commands sent via SSH connections. DevStack also creates an OpenVSwitch bridge for these VMs to communicate with other services provided by OpenStack such as the PXE server and the dnsmasq service.
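The virsh-over-SSH control path can be sketched as follows. The hypervisor host, user, and domain name below are hypothetical placeholders; the driver composes similar commands internally rather than exactly these:

```shell
# Hypothetical hypervisor host and VM domain name -- substitute your own.
HYPERVISOR=stack@hypervisor.example.com
DOMAIN=baremetal-node-0

# Commands of this shape are what the pxe_ssh driver issues over SSH:
POWER_ON="virsh start $DOMAIN"        # power interface: turn the "node" on
POWER_STATE="virsh domstate $DOMAIN"  # power interface: query power state
# e.g. ssh $HYPERVISOR "$POWER_STATE"
echo "$POWER_ON"
```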

The patch introduces two new features: it enables the agent_ipmitool driver in Ironic for provisioning physical servers, and it creates a flat provider network environment to give the physical machines access to the DHCP service, which is hosted in a virtual network environment. The following sections describe the two new features, focusing on the differences between the prior VM simulation model and the new model, and the technical details that allow OpenStack to provision physical machines.

Ironic pxe_ssh driver vs. agent_ipmitool driver

The Juno version of Ironic has eight drivers. These driver classes inherit from a base class that defines six interface types: power, deploy, console, management, rescue, and vendor. Each interface is assigned a module class instance that provides the extension points defining the vendor-specific implementation of each Ironic driver. For example, the pxe_ssh driver uses the PXE implementation module for the deploy and vendor interfaces, while the SSH implementation module is used for the power and management interfaces. The rationale is that a VM's boot device can be configured to use PXE, but the power state of a VM is only manageable through the CLI of the hypervisor it is running on. When provisioning physical machines, however, no hypervisor is involved in the process; therefore, a different Ironic driver is needed for the task.

The agent_ipmitool driver is a widely used Ironic driver for physical machine provisioning. It uses the agent modules for deploy and vendor interfaces and leverages the IPMI tool module for the power and management interfaces. DevStack will register a custom ramdisk to the image repository when the agent_ipmitool driver is set as the deploy driver for Ironic. This custom ramdisk has an Ironic Python Agent, known as IPA, included that will execute disk partitioning, OS image installation, etc. The IPMI tool module uses the ipmitool utility to set boot devices and change power states of physical machines. Figures 1 and 2 show how each driver interacts with other services and the provision target. The DevStack patch from Intel enables the use of the agent_ipmitool driver through DevStack, which makes Ironic capable of sending IPMI requests to physical machines.
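As a sketch, the power and management interfaces boil down to ipmitool invocations like the following. The BMC address and credentials are placeholders, and the driver's exact flags may differ from this hand-written form:

```shell
# Placeholder BMC address and credentials -- substitute your own.
IPMI_ADDR=10.1.91.4
IPMI_USER=root
IPMI_PASS=password
IPMI="ipmitool -I lanplus -H $IPMI_ADDR -U $IPMI_USER -P $IPMI_PASS"

# Management interface: make the node PXE-boot on its next start.
#   $IPMI chassis bootdev pxe
# Power interface: query and change the power state.
#   $IPMI power status
#   $IPMI power on
echo "$IPMI"
```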

pxe ssh driver workflow
Figure 1. pxe_ssh driver workflow. Boot device setting and power management are done using the virsh command via SSH. OS installation follows the PXE routine.

agent ipmitool
Figure 2. agent_ipmitool driver workflow. Node management is done using the ipmitool utility. The custom ramdisk includes an "Ironic Python Agent (IPA)," a Python process that manages OS image installation.

VLAN provider network vs. Flat provider network

Another essential aspect of provisioning physical machines with OpenStack is the network configuration. Ironic depends on Neutron for DHCP support. Neutron manages the MAC address and IP address pairing in a dnsmasq process. Many people get frustrated at this point because the dnsmasq services are always launched in a Linux* namespace connected to a tap on an OpenVSwitch bridge, which makes it hard to see how the physical machine could possibly send a DHCP request through the virtual network.

The solution to this problem is to set up a flat provider network instead of the VLAN, GRE, or local type network. To understand why using the flat provider network is the only solution, we should observe how the dnsmasq and tftp services are accessed when provisioning VMs compared to when provisioning physical machines.

Figure 3 shows the network settings when Ironic is configured for VM provisioning in DevStack. In this case, DevStack configures a VLAN provider network where the VM is connected to bridge "brbm" and the dnsmasq process is connected to "br-int." The VM can send DHCP requests to the dnsmasq process since the taps connected to the VM and dnsmasq have identical VLAN tags. After the VM receives the PXE boot configuration inside the DHCP response, it can request image downloads from the tftp server because a layer 3 SNAT is performed when the frame hits the route table set on "br-ex."

The physical machine in Figure 3, located external to the DevStack node, cannot communicate with the dnsmasq service because it has no knowledge of the segmentation ID, which is unique to each virtual network created in VLAN type network mode. The segment ID is a VLAN tag used to isolate each virtual network. To communicate within a virtual network, you must include the segment ID of that network in the packet you send to “br-eth2.” In Figure 3, let’s say the virtual network created has a segment ID of 1001 and the corresponding OpenVSwitch internal ID is 2.

vm access route
Figure 3. VM access route to services. VLAN type networks partition the network using a “segment id” for each tenant. An external device must have the “segment id” inserted in the packet to access the virtual network. OpenVSwitch maintains a mapping between “segment id” and “internal VLAN tag”.

After you create the virtual network, two OpenFlow* rules are set on bridges "br-eth2" and "br-int." The rule on "br-eth2" says: if an incoming frame has a VLAN ID of 1001, change it to 2. The rule on "br-int" says: if an outgoing frame has a VLAN ID of 2, change it to 1001. Since a physical server has no information about the segment ID, it simply cannot access "br-int."

Figure 4 shows the network settings of a flat provider network in Neutron. The reason the flat provider network enables physical machine provisioning is that different OpenFlow rules are set on "br-eth2" and "br-int" compared to the VLAN provider network. The OpenFlow rule on "br-eth2" sets the VLAN tag of all incoming frames to 2. The OpenFlow rule on "br-int" strips the VLAN tags of all outgoing packets. This means that DHCP requests broadcast outside the DevStack node can reach the dnsmasq service inside the virtual network. Therefore, the physical machine can receive a DHCP response with PXE boot configurations, install its OS, and complete the provisioning process.
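The difference between the two modes can be illustrated with the shape of the flow rules. The ovs-ofctl syntax below is illustrative only; the rules that the Neutron OVS agent actually installs carry additional priorities and cookies:

```shell
# VLAN provider network: translate between the segment ID (1001) and the
# OVS internal tag (2) -- external frames must already carry tag 1001.
#   ovs-ofctl add-flow br-eth2 "dl_vlan=1001,actions=mod_vlan_vid:2,NORMAL"
#   ovs-ofctl add-flow br-int  "dl_vlan=2,actions=mod_vlan_vid:1001,NORMAL"

# Flat provider network: tag everything coming in, strip everything going
# out -- untagged DHCP broadcasts from a physical machine get through.
#   ovs-ofctl add-flow br-eth2 "actions=mod_vlan_vid:2,NORMAL"
#   ovs-ofctl add-flow br-int  "actions=strip_vlan,NORMAL"

# Inspect the rules DevStack actually installed with:
#   ovs-ofctl dump-flows br-eth2
```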

physical machine access
Figure 4. Physical machine access route to services. The flat provider network inserts an "internal VLAN tag" into each incoming packet. All outgoing packets have their VLAN tag stripped.

Hands-on session

Configure physical server IPMI settings

Set your IPMI environment in the BIOS menu of the physical machine. The location is under “Server Management” in the “BMC LAN Configuration” section if your BIOS manufacturer is American Megatrends as in Figure 5. For other manufacturers, look for a page with a similar title. You must write down three items at the end of the configuration process: “IP address,” “User ID,” and “User password.”

IPMI configuration menu
Figure 5. IPMI configuration menu. Write down the IP address and IPMI user credentials.
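Before handing these values to DevStack, it is worth confirming that the BMC answers over the network. This quick check is not part of the official workflow, and the address and credentials below are placeholders:

```shell
# Placeholder values copied from the BIOS screen -- substitute your own.
IPADDR=10.1.91.4
USERID=root
USERPW=password
# Uncomment on a machine that can reach the BMC network:
#   ipmitool -I lanplus -H $IPADDR -U $USERID -P $USERPW chassis status
echo "will test BMC at $IPADDR as $USERID"
```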

Check the physical machine NIC MAC address

One of the MAC addresses of your physical machine's NICs has to be passed to DevStack. The MAC address can be found under "Advanced BIOS Features" in the "PCI Configuration" section, as in Figure 6. Several NICs may be listed in the menu, and all of them have access to the BMC. Write down the "NIC1 Mac Address."

Create a “hardware_info” file

This hardware_info file will be passed to DevStack so that Ironic has the required information about your physical machines. Log in to your DevStack node and run the script below in a shell terminal. The variables IPADDR, MACADDR, USERID, and USERPW correspond to the "IP address," "NIC1 Mac Address," "User ID," and "User password," respectively. If you have more than one physical server, run the script multiple times, changing the values each time.

IPADDR=10.1.91.4
MACADDR=00:1B:24:78:4C:B2
USERID=root
USERPW=password
cat >> /var/tmp/hardware_info <<END
$IPADDR $MACADDR $USERID $USERPW
END
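A quick way to sanity-check the result (not part of the official workflow) is to verify that every line carries exactly four fields. The check below runs on a throwaway sample file; point it at /var/tmp/hardware_info to check the real one:

```shell
# Each hardware_info line must be: <IPMI IP> <MAC> <user> <password>
HWFILE=$(mktemp)
printf '10.1.91.4 00:1B:24:78:4C:B2 root password\n' > "$HWFILE"

# awk exits nonzero as soon as a line has the wrong field count.
if awk 'NF != 4 { exit 1 }' "$HWFILE"; then
  echo "hardware_info format OK"
else
  echo "hardware_info has malformed lines" >&2
fi
rm -f "$HWFILE"
```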

PCI configuration menu
Figure 6. PCI configuration menu. Write down a MAC address.

Configure Network Bridge

The flat provider network requires a bridge that gives external devices access to the virtual network. In Figure 4, "br-eth2" is the bridge that allows devices on "br-int" to reach the physical machine. In this article, the bridge is created on "eth2"; change "eth2" to "eth1" if you only have "eth1" available. Set an IP address on the bridge according to your network environment.

ovs-vsctl add-br br-eth2
ovs-vsctl add-port br-eth2 eth2
ifconfig br-eth2 192.168.2.10 netmask 255.255.255.0

Clone and configure DevStack

Clone the DevStack source code and create a new user account. DevStack's create-stack-user.sh script will help you create a "stack" account. Running the script below creates the "stack" account and moves the DevStack clone to /opt/stack/devstack, where all other projects will eventually be cloned.

git clone git://git.openstack.org/openstack-dev/devstack.git
sudo devstack/tools/create-stack-user.sh
sudo mv devstack/ /opt/stack/
chown -R stack:stack /opt/stack/devstack/
su - stack
script /dev/null

Create a localrc file for baremetal provisioning in your DevStack home directory.

PASSWD=password
HOSTIP=192.168.1.10 #your DevStack node IP address
IPRANGE=192.168.1.0/24 #available CIDR in your network environment
GATEWAYIP=192.168.1.1 #IP address of the gateway
BRIDGE=br-eth2
cat >> /opt/stack/devstack/localrc <<END
ADMIN_PASSWORD=$PASSWD
MYSQL_PASSWORD=$PASSWD
RABBIT_PASSWORD=$PASSWD
SERVICE_PASSWORD=$PASSWD
SERVICE_TOKEN=$PASSWD

MULTI_HOST=1
HOST_IP=$HOSTIP
VOLUME_BACKING_FILE_SIZE=100000M
VOLUME_BACKING_FILE=/data/openstack/stack-volumes-backing-file

enable_service ironic
enable_service ir-api
enable_service ir-cond

disable_service n-net
enable_service q-svc
enable_service q-agt
enable_service q-dhcp
enable_service neutron

VIRT_DRIVER=ironic

LOGFILE=stacklog-ironic
LOGDAYS=3

IRONIC_HARDWARE=True
Q_ML2_TENANT_NETWORK_TYPE=flat
ENABLE_TENANT_TUNNELS=False
PHYSICAL_NETWORK=physnet1
OVS_PHYSICAL_BRIDGE=$BRIDGE
Q_ML2_PLUGIN_TYPE_DRIVERS=flat
Q_ML2_PLUGIN_MECHANISM_DRIVERS=openvswitch
ENABLE_TENANT_VLANS=False

Q_USE_PROVIDER_NETWORKING=True
PUBLIC_INTERFACE=eth1
FLAT_NETWORK_BRIDGE=$BRIDGE
PROVIDER_NETWORK_TYPE=flat
IRONIC_BAREMETAL_BASIC_OPS=True
FIXED_RANGE=$IPRANGE
IRONIC_HWINFO_FILE=/var/tmp/hardware_info
IRONIC_DEPLOY_DRIVER=agent_ipmitool
NETWORK_GATEWAY=$GATEWAYIP
END

Deploy DevStack

Run /opt/stack/devstack/stack.sh to let DevStack install your OpenStack environment. To verify your deployment, run the commands below and check the screen output.

nova flavor-list
neutron net-list
glance image-list

OpenStack command line
Figure 7. OpenStack* command line tool output, listing the flavors, networks, and images to verify your DevStack installation.

The screen output should be similar to Figure 7. Also check if your physical machine properties are correctly set in Ironic.

UUID=$(ironic node-list | awk '/ power / {print $2}')
ironic node-show $UUID

Verify that the "ipmi_username," "ipmi_address," and "ipmi_password" values match the IPMI properties you set in the physical machine's BIOS and the hardware_info file, as in Figure 8.

ironic node description
Figure 8. Ironic node description screen output. Verify that the items in “driver_info” match with your hardware_info file.

Make sure the "Power State" of your physical machine is "power off" before you proceed. If the state shows "None," run the commands below to force a power off.

UUID=$(ironic node-list | awk '/ power / {print $2}')
ironic node-set-power-state $UUID off

Provision physical server

Now let's provision the physical machine by running the commands below:

NETUUID=$(neutron net-list | awk '/ physnet1 / {print $2}')
IMGUUID=$(glance image-list | awk '/ cirros.*bare / {print $2}')
nova boot --flavor baremetal --nic net-id=$NETUUID --image $IMGUUID test

The screen output should be similar to Figure 9.

Nova boot command
Figure 9. Nova boot command output. Periodically execute “ironic node-list” to check the status of the provisioning process.

At this point your physical machine should be booting the Ironic custom ramdisk fetched from the PXE server.
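To follow the deployment without retyping commands, you can poll the node's provision_state in a loop. The awk field position is assumed from the Juno-era CLI's table output, so adjust it if your client prints a different layout:

```shell
# Poll every 30 seconds until Ironic reports the node as active.
# (The guard lets the snippet no-op on machines without the ironic CLI.)
if command -v ironic >/dev/null 2>&1; then
  UUID=$(ironic node-list | awk '/ power / {print $2}')
  while :; do
    STATE=$(ironic node-show $UUID | awk '/ provision_state / {print $4}')
    echo "provision_state: $STATE"
    [ "$STATE" = "active" ] && break
    sleep 30
  done
fi
```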

physical boot screen
Figure 10. Boot screen of the physical machine. Check if the IP and gateway match with your localrc file.

After the booting process, the IPA will install the OS image and reboot the machine. The physical machine has been successfully provisioned when you see the login terminal.

Conclusion

Provisioning physical machines with Ironic used to be possible only if you understood Ironic's source code. With Intel's patch contributions, even novice users can easily set up a baremetal provisioning environment from a fresh clone of DevStack. The true value of these patches, however, is that they help you better understand the technology that enables physical machine provisioning in OpenStack. Furthermore, you could enable Ironic's baremetal provisioning feature in your production environment and use OpenStack to manage all the physical machines in your datacenter.
