Virtualizing legacy hardware in OpenStack

Five-year-old hardware hosting fourteen vservers on Debian GNU/Linux lenny, running a 2.6.26-2-vserver-686-bigmem Linux kernel, is being decommissioned. The April non-profit relies on these services (MediaWiki, pad, Mumble, etc.) for the benefit of its 5,000 members and many working groups. Instead of migrating each vserver individually to an OpenStack instance, it was decided that the vserver host would be copied over to an OpenStack instance.
The old hardware has 8GB of RAM, a 150GB disk and a dual Xeon totaling 8 cores. The munin statistics show that no additional memory is needed, that the disk is half full and that an average of one core is used at all times. An OpenStack instance with 8GB of RAM, a 150GB disk and two cores is prepared. The instance will be booted from a 150GB volume placed on the same hardware to get maximum disk I/O speed.
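A flavor matching these requirements can be created beforehand. A minimal sketch with the legacy nova client; the flavor name and ID are illustrative and the root disk size is set to zero because the instance boots from a volume:

desktop: nova flavor-create legacy.2-cpu.0GB-disk.8GB-ram 42 8192 0 2
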
After the volume is created, it is mounted from the OpenStack node and the disk of the old machine is rsync'ed to it. It is then booted after modifying a few files such as fstab. The OpenStack node is in the same rack and connected to the same switch as the old hardware. The IP is removed from the interface of the old hardware and bound to the OpenStack instance. Because the cluster runs nova-network with multi-host activated, the IP is bound to an interface of the OpenStack node, which can take over immediately. The public interface of the node is set as an ARP proxy to advertise the bridge to which the instance is connected. The security groups of the instance are disabled (by opening all protocols and ports) because a firewall is running inside the instance.
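Opening all protocols and ports can be done from desktop with the legacy nova client. A minimal sketch, assuming the instance uses the default security group of the april tenant:

desktop: nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0
desktop: nova secgroup-add-rule default tcp 1 65535 0.0.0.0/0
desktop: nova secgroup-add-rule default udp 1 65535 0.0.0.0/0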

Colocated hardware

The OpenStack cluster used to migrate the legacy hardware is configured to allow the colocation of instances and volumes on the same hardware. One OpenStack availability zone groups hardware located in the same rack and connected to the same switch as the legacy hardware. This allows for a migration that does not involve changing the IP of the machine. If the OpenStack nodes were located in a different autonomous system, a DNS change would be necessary and would require additional preparations.
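With nova, an availability zone such as bm0008 can be defined through a host aggregate. A minimal sketch, assuming yopo is the hostname of the compute node; the aggregate name is illustrative and older clients may require the aggregate ID instead of its name for aggregate-add-host:

desktop: nova aggregate-create rack-bm0008 bm0008
desktop: nova aggregate-add-host rack-bm0008 yopo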

Maintenance LAN connection

The primary IP address used by the legacy hardware is also used by a number of services provided by the vservers it hosts. Moving this IP address to the OpenStack instance would mean losing access to the legacy hardware, without any hope of falling back should something unexpected happen. Because both machines involved in the migration are connected to the same switch and use the same VLAN, an additional IP address is manually added on each of them to preserve communications:

ns1: ip addr add 10.222.222.1/24 dev eth0
yopo: ip addr add 10.222.222.2/24 dev eth0

Preparations

In the following, desktop is any machine holding enough credentials to connect to the legacy machine using ssh and to run nova or EC2 commands targeting the OpenStack cluster, yopo is the OpenStack node and ns1 is the legacy hardware.
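The nova and euca-* commands rely on the usual environment variables. A minimal sketch of what is expected on desktop; the endpoints, user and tenant names are assumptions to be replaced with site specific values:

desktop: export OS_AUTH_URL=http://controller.vm.april-int:5000/v2.0
desktop: export OS_TENANT_NAME=april
desktop: export OS_USERNAME=loic
desktop: export OS_PASSWORD=XXXXXXXX
desktop: export EC2_URL=http://controller.vm.april-int:8773/services/Cloud
desktop: export EC2_ACCESS_KEY=XXXXXXXX
desktop: export EC2_SECRET_KEY=XXXXXXXX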

desktop: euca-create-volume --zone bm0008 --size 150
+----+-----------+--------------+------+-------------+-------------+
| ID |   Status  | Display Name | Size | Volume Type | Attached to |
+----+-----------+--------------+------+-------------+-------------+
| 3f | available | None         | 150  | None        |             |
+----+-----------+--------------+------+-------------+-------------+
desktop: nova volume-list | grep " $(printf "%d" 0x3f) "
| 63 | available      | None             | 150  | None        |                                      |

bm0008 is the availability zone matching the OpenStack node known as yopo. Note that euca-create-volume, which is an EC2 command, reports the volume id as a hexadecimal number while nova volume-list shows it as a decimal number. The hexadecimal form is used to name the LV volumes of the LVM backend. A partition table is then created on the 150GB volume and configured to have a single primary partition taking all the space.
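The partitioning can be done with any tool. A minimal sketch using parted directly on the logical volume backing the OpenStack volume:

yopo: parted -s /dev/vg/volume-0000003f mklabel msdos
yopo: parted -s -a optimal /dev/vg/volume-0000003f mkpart primary 0% 100%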

yopo: kpartx -av /dev/vg/volume-0000003f
yopo: mkfs.ext3 /dev/mapper/vg-volume--0000003f1
yopo: mount /dev/mapper/vg-volume--0000003f1 /mnt
yopo: rsync -i --exclude=/etc/fstab --exclude=70-persistent-net.rules \
  --exclude=/boot/grub \
  --exclude=/srv/backup \
  --exclude=/var/cache \
  --exclude=/var/lib/backuppc \
  --exclude=/var/tmp \
  --exclude=/proc \
  --exclude=/sys -avHS --delete --numeric-ids 10.222.222.1:/ /mnt/

The partition is formatted with ext3 instead of ext4 to avoid any issues: the lenny system installed on ns1 only uses ext3. A copy of the ns1 disk is made, excluding files that will either be replaced or contain data that is not worth replicating.

yopo: echo 'proc /proc proc defaults 0 0' > /mnt/etc/fstab
yopo: echo '/dev/vda1 / ext3 defaults,errors=remount-ro 0 1' >> /mnt/etc/fstab

The fstab is rewritten entirely to take into account the presence of a single partition (as opposed to seven on ns1) and a device name starting with /dev/vd instead of /dev/hd or /dev/sd.

yopo: cp /mnt/boot/vmlinuz-2.6.26-2-vserver-686-bigmem /tmp
yopo: cp /mnt/boot/initrd.img-2.6.26-2-vserver-686-bigmem /tmp
yopo: sed -i -e 's:kopt=.*:kopt=root=/dev/vda1:' \
 -e 's/default=.*/default=0/' \
 -e 's/groot=.*/groot=(hd0,0)/' /mnt/boot/grub/menu.lst
yopo: echo '(hd0) /dev/vda' > /mnt/boot/grub/device.map
yopo: umount /mnt
yopo: kpartx -dv /dev/vg/volume-0000003f
yopo: kvm -m 1024 -drive file=/dev/mapper/vg-volume--0000003f,if=virtio,index=0 \
  -boot c -initrd /tmp/initrd.img-2.6.26-2-vserver-686-bigmem\
   -kernel /tmp/vmlinuz-2.6.26-2-vserver-686-bigmem -append 'root=/dev/vda1' \
  -net nic -net user -nographic -curses -monitor unix:/tmp/file.mon,server,nowait
curses: grub-install /dev/vda
curses: update-grub
curses: halt

Grub is installed on the disk by using kvm to actually boot the instance, with a curses based console instead of a VGA console. The grub menu.lst and device.map are edited beforehand to reflect the new disk and partition table. The kernel and initrd are copied out of the file system imported from ns1 and given as arguments to kvm so that it boots under conditions close to those existing on the legacy hardware. Once the machine has booted successfully, grub-install and update-grub are called so that kvm can later boot without an external kernel. It can be verified with:

yopo: kvm -m 1024 -drive file=/dev/mapper/vg-volume--0000003f,if=virtio,index=0 \
  -boot c -net nic -net user -nographic \
  -curses -monitor unix:/tmp/file.mon,server,nowait

Routing the public IP

The legacy installation on ns1 does not obtain its IP address from DHCP and may contain a number of occurrences of this IP in various configuration files. The OpenStack node is configured to add a route dedicated to this IP by adding the following to /etc/rc.local:

brctl addbr br2004
ip link set br2004 up
ip r add 88.191.240.4/32 dev br2004

The br2004 bridge is dedicated to the tenant used to run the OpenStack instance, as shown by the VLAN ID 2004:

desktop: keystone tenant-list | grep ' april '
| 7c918c873280465da3785f5699d48316 | april           | True    |
desktop: nova-manage network list | grep 7c918c873280465da3785f5699d48316
5 10.145.4.0/24 None 10.145.4.3 None None 2004 7c918c873280465da3785f5699d48316 20941588-2c35-40b3-9ecb-af87cadae446

The bridge can be created before OpenStack runs so that the public IP can be routed to it: if the bridge already exists, OpenStack will use it instead of creating it.

Migrating

The rsync command shown above is run again to update the copy, without stopping any service.
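This requires mapping and mounting the partition again. A sketch of the incremental update, reusing the exclude list of the initial copy:

yopo: kpartx -av /dev/vg/volume-0000003f
yopo: mount /dev/mapper/vg-volume--0000003f1 /mnt
yopo: rsync -i --exclude=/etc/fstab --exclude=70-persistent-net.rules \
  --exclude=/boot/grub \
  --exclude=/srv/backup \
  --exclude=/var/cache \
  --exclude=/var/lib/backuppc \
  --exclude=/var/tmp \
  --exclude=/proc \
  --exclude=/sys -avHS --delete --numeric-ids 10.222.222.1:/ /mnt/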

yopo: ssh 10.222.222.1 ip addr del 88.191.240.4/27 dev eth0
yopo: ssh 10.222.222.1 /etc/init.d/util-vserver stop

The rsync command is run again after stopping all vservers on ns1 and removing the IP from the interface.

yopo: umount /mnt
yopo: kpartx -dv /dev/vg/volume-0000003f
desktop: ssh controller.vm.april-int nova boot \
 --image 'CirrOS 0.3' \
 --block_device_mapping vda=63::0:0 \
 --flavor e.1-cpu.0GB-disk.8GB-ram \
 --key_name loic --availability_zone=bm0008 ns1 --poll

The partition is unmounted and the instance is booted from the volume. It should recover as if a power failure had happened.
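The boot can be followed from desktop through the console log. A quick check, assuming the instance was named ns1 as in the nova boot command above:

desktop: nova console-log ns1 | tail -20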

yopo: ip r add 88.191.240.4/32 dev br2004

After the public IP is routed to the bridge br2004 to which the newly created instance is connected, the services should be up and communicating properly.
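A quick sanity check from any machine outside the rack, once the ARP proxy described in the next section is in place; the port is an assumption to be replaced by one of the services actually hosted on ns1:

desktop: ping -c 3 88.191.240.4
desktop: nc -zv 88.191.240.4 443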

Setting up the ARP proxy

The interface of the OpenStack node that is used for floating IPs must be configured as an ARP proxy.

echo 1 > /proc/sys/net/ipv4/conf/eth0/proxy_arp
echo 1 > /proc/sys/net/ipv4/conf/br2004/proxy_arp

These lines are appended to /etc/rc.local so that they are run at boot time. The switch to which both machines are connected has an ARP cache: it needs to be cleared so that the switch notices that packets for this IP must now be sent to another MAC address.

ip addr add 88.191.240.4/32 dev eth0
arping -U 88.191.240.4 -I eth0
ip addr del 88.191.240.4/32 dev eth0

One Reply to “Virtualizing legacy hardware in OpenStack”

  1. irc.oftc.net#zigo suggests a replacement for grub:

    install-mbr /dev/mapper/vg-volume--0000003f
    mount -o loop /dev/mapper/vg-volume--0000003f /mnt
    KERNEL=`chroot /mnt find boot -name 'vmlinuz-*'`
    RAMDISK=`chroot /mnt find boot -name 'initrd.img-*'`
    echo "default linux
    timeout 1
    label linux
    kernel ${KERNEL}
    append initrd=${RAMDISK} root=/dev/vda1 ro quiet" > /mnt/boot/extlinux/extlinux.conf
    cp /mnt/boot/extlinux/extlinux.conf /mnt/extlinux.conf
    extlinux --install /mnt
    umount /dev/mapper/vg-volume--0000003f
    
