HOWTO OpenStack Grizzly and Ceph with Puppet on Ubuntu 12.04

For months I’ve asked people working with puppet modules on a daily basis for a HOWTO that I could follow to set up a new cluster with the Grizzly OpenStack release. Such a HOWTO is not needed for people who develop the modules or deploy OpenStack for a living. It is however very helpful for the casual system administrator who wants to get it running in a few hours, all by herself/himself.
Packstack seems to be exactly that: a walkthrough of a well tested procedure that anyone with a basic understanding of what OpenStack is can rely on. It requires an RPM based distribution, however, which may mean a significant effort for someone used to DEB based operating systems.
For Ubuntu users, the kickstack project was started in the summer of 2013 and targets hands-on sessions, with the declared goal of making it easy for people new to both OpenStack and puppet. Later on, it inspired Dan Bode to use a new approach based on dependency injection to implement openstack-installer for Cisco.
The proposed HOWTO uses openstack-installer to deploy OpenStack against an existing Ceph cluster and provides the following components ( a configuration sketch follows the list ):

  • keystone
  • nova ( kvm )
  • quantum ( openvswitch + gre )
  • cinder ( Ceph backend )
  • horizon
  • glance ( Ceph backend )
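
For instance, the Cinder Ceph backend boils down to a few lines of configuration. Here is a minimal sketch of what the deployment is expected to write for Cinder in the Grizzly era; the pool, user and secret UUID below are examples that must match the actual Ceph cluster:

$ sudo tee -a /etc/cinder/cinder.conf <<'EOF'
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes
rbd_user=cinder
rbd_secret_uuid=00000000-0000-0000-0000-000000000000
EOF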

Continue reading “HOWTO OpenStack Grizzly and Ceph with Puppet on Ubuntu 12.04”

OpenStack + Ceph on junk hardware and hazardous hosting

OpenStack is installed with puppet and configured to use a Ceph cluster installed with ceph-deploy. The hardware is composed of about 25 heterogeneous HP machines that are over three years old. Five of them have been racked in a basement that is not as dry as it should be. The 100Mb/s Internet connection uses a 30m category 5 cable going through two holes in the walls before reaching the rack.
The cluster will be connected to other similar clusters to reduce the risk of losing data.

Junk hardware, in a basement

Continue reading “OpenStack + Ceph on junk hardware and hazardous hosting”

setting up an openstack-installer test environment

openstack-installer is a data-oriented replacement for puppet-openstack. The following HOWTO runs some basic tests on vagrant virtual machines that are preserved for introspection with:

# vagrant status
control_basevm           running
# vagrant ssh control_basevm
vagrant@control-server:~$ ps -ax | grep keystone
15020 ?        Ss     0:01 /usr/bin/python /usr/bin/keystone-all

The control_basevm runs the horizon dashboard.
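A quick way to check that it responds from within the VM; a minimal sketch, assuming horizon is served by apache under the default /horizon path:

$ vagrant ssh control_basevm
vagrant@control-server:~$ curl -sL http://localhost/horizon | grep -i '<title>'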

Continue reading “setting up an openstack-installer test environment”

HOWTO install Ceph teuthology on OpenStack

Teuthology is used to run Ceph integration tests. It is installed from source and will use newly created OpenStack instances as targets:

$ cat targets.yaml
targets:
  ubuntu@target1.novalocal: ssh-rsa AAAAB3NzaC1yc2...
  ubuntu@target2.novalocal: ssh-rsa AAAAB3NzaC1yc2...
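
The host key values can be collected with ssh-keyscan once the instances are up and appended under the targets: key; a minimal sketch, assuming the hypothetical target names above resolve from the teuthology machine:

$ for t in target1 target2 ; do \
    echo "  ubuntu@$t.novalocal: $(ssh-keyscan -t rsa $t.novalocal 2>/dev/null | cut -d' ' -f2-)" ; \
  done >> targets.yaml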

They allow password-free ssh connections as the ubuntu user, with full sudo privileges, from the machine running teuthology. An Ubuntu precise 12.04.2 target must be configured with:

$ wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | \
  sudo apt-key add -
$ echo '    ubuntu hard nofile 16384' | \
  sudo tee /etc/security/limits.d/ubuntu.conf

It can then be tried with a configuration file that does nothing but install Ceph and run the daemons:

$ cat noop.yaml
check-locks: false
roles:
- - mon.a
  - osd.0
- - osd.1
  - client.0
tasks:
- install:
   project: ceph
   branch: stable
- ceph:

The output should look like this:

$ ./virtualenv/bin/teuthology targets.yaml noop.yaml
INFO:teuthology.run_tasks:Running task internal.save_config...
INFO:teuthology.task.internal:Saving configuration
INFO:teuthology.run_tasks:Running task internal.check_lock...
INFO:teuthology.task.internal:Lock checking disabled.
INFO:teuthology.run_tasks:Running task internal.connect...
INFO:teuthology.task.internal:Opening connections...
DEBUG:teuthology.task.internal:connecting to ubuntu@teuthology2.novalocal
DEBUG:teuthology.task.internal:connecting to ubuntu@teuthology1.novalocal
...
INFO:teuthology.run:Summary data:
{duration: 363.5891010761261, flavor: basic, owner: ubuntu@teuthology, success: true}
INFO:teuthology.run:pass

Continue reading “HOWTO install Ceph teuthology on OpenStack”

Ceph early adopter: Université de Nantes

Loire Chantrerie Lombarderie
The Université de Nantes started using Ceph for backups early in 2012, before Bobtail was released or Inktank was founded. The IRTS department, under the lead of Yann Dupont, created a twelve-node Ceph cluster to store backups. It contains the data generated by 35,000 students and 4,500 employees, totaling 100 million inodes and 25TB of data ( out of 40TB ). The hardware is spread across three geographical locations ( Loire, Chantrerie and Lombarderie ) and Ceph is configured to keep working transparently even when one of them is down. The backup pool has two replicas and the crushmap states that each replica must be stored in a different geographical location. For instance, when Lombarderie is unreachable, which happened this week because of a planned power outage combined with an unplanned UPS failure, Ceph keeps serving the objects from the replicas located in Loire and Chantrerie.
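A crush rule enforcing that placement could look like the sketch below, assuming the three sites are declared as buckets of type datacenter in the crushmap; the rule name and ruleset number are hypothetical:

$ cat crushmap.txt
rule backup {
        ruleset 3
        type replicated
        min_size 2
        max_size 2
        step take default
        step chooseleaf firstn 0 type datacenter
        step emit
}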
Continue reading “Ceph early adopter: Université de Nantes”

GLOCK is my favorite Cloud stack

GLOCK stands for GNU, Linux, OpenStack, Ceph and KVM. GNU is the free Operating System that guarantees my freedom and independence, Linux is versatile enough to accommodate the heterogeneous hardware I’m using, OpenStack allows me to cooperatively run an IaaS with my friends and the non-profits I volunteer for, Ceph gives eternal life to my data and KVM will be maintained for as long as I live.

Using the largest OpenStack tenant to define an architecture that scales out

The service offering of public cloud providers is designed to match many potential customers. It would be impossible to design the underlying architecture ( hardware and software ) to accommodate the casual individual as well as the needs of CERN. No matter how big the cloud provider, there are users for whom it is both more cost effective and more efficient to design a private cloud.

The size of the largest user could be used to simplify the architecture and resolve bottlenecks when it scales. The hardware and software are designed to create a production unit that is N times the largest user. For instance, if the largest user requires 1PB of storage, 1,000 virtual machines and 10Gb/s of sustained internet transit, the production unit could be designed to accommodate a maximum of N = 10 users of this size. If the user base grows but the size of the largest user does not change, independent production units are built. All production units have the same size and can be multiplied.
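
With N = 10 and the figures above, each production unit is sized at:

10 × 1PB     = 10PB of storage
10 × 1,000   = 10,000 virtual machines
10 × 10Gb/s  = 100Gb/s of sustained internet transit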

Each production unit is independent from the others and can operate standalone. A user confined to a production unit does not require interactions, direct or indirect, with other production units. While this is true most of the time, a live migration path must be opened temporarily between production units to balance their load. For instance, when the existing production units are too full and a new production unit becomes operational, some users are migrated to it.

Although block and instance live migration are supported within an OpenStack cluster, this architecture would require the ability to live migrate blocks and instances between unrelated OpenStack clusters. In the meantime, cells and aggregates can be used. The user expects this migration to happen transparently when the provider does it behind the scenes. But if she/he is willing to move from one OpenStack provider to another, the same mechanism could eventually be used. Once the user ( that is, the tenant in OpenStack parlance ) is migrated, the credentials can be removed from the original production unit.

Virtualizing legacy hardware in OpenStack

A five-year-old machine being decommissioned hosts fourteen vservers on Debian GNU/Linux lenny running a 2.6.26-2-vserver-686-bigmem Linux kernel. The April non-profit relies on these services ( mediawiki, pad, mumble, etc. ) for the benefit of its 5,000 members and many working groups. Instead of migrating each vserver individually to an OpenStack instance, it was decided that the vserver host would be copied over to an OpenStack instance.
The old hardware has 8GB of RAM, a 150GB disk and a dual Xeon totaling 8 cores. The munin statistics show that no additional memory is needed, the disk is half full and an average of one core is used at all times. An OpenStack instance with 8GB of RAM, a 150GB disk and two cores is prepared. The instance will be booted from a 150GB volume placed on the same hardware to get maximum disk I/O speed.
After the volume is created, it is mounted from the OpenStack node and the disk of the old machine is rsync’ed to it. The instance is then booted after modifying a few files such as fstab. The OpenStack node is in the same rack and on the same switch as the old hardware. The IP is removed from the interface of the old hardware and bound to the OpenStack instance. Because the cluster runs nova-network with multi-host activated, the IP is bound to the interface of the OpenStack node, which can take over immediately. The public interface of the node is set as an ARP proxy to advertise the bridge where the instance is connected. The security groups of the instance are disabled ( by opening all protocols and ports ) because a firewall is running in the instance.
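The sequence could look like the sketch below; the volume name, LV path, remote host and interface are hypothetical stand-ins for the real ones:

$ cinder create --display-name legacy-vserver 150
# on the OpenStack node, format and mount the LV backing the new volume
$ mkfs.ext4 /dev/cinder-volumes/volume-0000000a
$ mount /dev/cinder-volumes/volume-0000000a /mnt
# copy the old host, preserving hard links, ACLs and extended attributes
$ rsync -aHAX --numeric-ids old-host:/ /mnt/
# adjust the root device in fstab before booting the instance from the volume
$ vi /mnt/etc/fstab
$ umount /mnt
# let the node public interface answer ARP for the instance address
$ sysctl -w net.ipv4.conf.eth0.proxy_arp=1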
Continue reading “Virtualizing legacy hardware in OpenStack”

OpenStack Upstream University training

Upstream University training for OpenStack contributors includes a live session where students contribute to a Lego town. They have to comply with the coding standards imposed by the existing buildings. More than fifteen participants created an impressive city within a few hours during the session held in May 2013. The images speak for themselves. The next sessions will be in Paris in June and in Portland in July.

Continue reading “OpenStack Upstream University training”

Disaster recovery on host failure in OpenStack

The host bm0002.the.re becomes unavailable because of a partial disk failure on an Essex based OpenStack cluster using LVM based volumes and multi-host nova-network. The host had daily backups made with rsync of /, and each LV was copied and compressed. Although the disk is failing badly, the host is not down and some reads can still be done. The nova services are shut down, the host is disabled using nova-manage, and an attempt is made to recover from the partially damaged disk and LVs when it leads to better results than reverting to yesterday’s backup.
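The salvage could start along the lines of the sketch below; the exact nova-manage arguments vary between releases and the LV name is hypothetical:

# stop scheduling instances and volumes on the failing host
$ nova-manage service disable --host=bm0002.the.re --service=nova-compute
# copy a damaged LV, skipping unreadable sectors instead of aborting
$ ddrescue /dev/nova-volumes/volume-0000002a volume-0000002a.img rescue.log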
Continue reading “Disaster recovery on host failure in OpenStack”