OpenStack Upstream Training in Atlanta

The OpenStack Foundation is delivering a training program to help new OpenStack developers integrate their own roadmap into that of the OpenStack project more quickly. If you’re a new OpenStack contributor or plan on becoming one soon, you should sign up for the next OpenStack Upstream Training in Atlanta, May 10-11. Participation is also strongly advised for first-time attendees of the OpenStack Design Summit.

Continue reading “OpenStack Upstream Training in Atlanta”

Becoming a Core Contributor: the fast track

Anyone who wants to become a better Free Software contributor is invited to attend the next session of Upstream University, held in advance of FOSDEM. The training starts on the morning of January 30th, 2014, within walking distance of the Grand Place in Brussels.

Participating in Free Software projects is not just about technical skills: there will be informal follow-ups in bars and restaurants afterwards 🙂 This session will be the first to focus on Core Contributors and what it takes to become one, based on lessons learnt from OpenStack and Ceph.
Continue reading “Becoming a Core Contributor: the fast track”

wget on an OpenStack instance hangs? Try lowering the MTU

Why would some OpenStack instances fail to wget a URL that others fetch perfectly? For instance:

$ wget -O - 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/autobuild.asc'
Connecting to ceph.com (ceph.com)|208.113.241.137|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: `STDOUT'

    [<=>                                                                           ] 0           --.-K/s              ^

If the hang can be fixed by lowering the MTU from the default of 1500 to 1400 with:

$ sudo ip link set mtu 1400 dev eth0
$ sudo ip link show dev eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:85:ee:a5 brd ff:ff:ff:ff:ff:ff

then the underlying OpenStack DHCP server should be fixed to hand out an MTU of 1400.
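
A minimal sketch of one way to do it with neutron’s dnsmasq-based DHCP agent (standard default paths, not taken from this particular setup): point the agent at a custom dnsmasq configuration in /etc/neutron/dhcp_agent.ini

[DEFAULT]
dnsmasq_config_file = /etc/neutron/dnsmasq-neutron.conf

and have dnsmasq push DHCP option 26 (the interface MTU) in /etc/neutron/dnsmasq-neutron.conf:

# option 26 is the interface MTU
dhcp-option-force=26,1400

Instances pick up the lower MTU after neutron-dhcp-agent is restarted and their leases are renewed.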
Continue reading “wget on an OpenStack instance hangs? Try lowering the MTU”

Mixing Ceph and LVM volumes in OpenStack

Ceph pools are defined to collocate volumes and instances in OpenStack Havana. For volumes that do not need the resilience provided by Ceph, an LVM cinder backend is defined in /etc/cinder/cinder.conf:

[lvm]
volume_group=cinder-volumes
volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver
volume_backend_name=LVM

and appended to the list of existing backends:

enabled_backends=rbd-default,rbd-ovh,rbd-hetzner,rbd-cloudwatt,lvm
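
This assumes a cinder-volumes volume group already exists on the machine. If it does not, it can be created from a spare block device (here /dev/sdb, which is an assumption):

# pvcreate /dev/sdb
# vgcreate cinder-volumes /dev/sdb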

A cinder volume type is created and associated with it:

# cinder type-create lvm
+--------------------------------------+------+
|                  ID                  | Name |
+--------------------------------------+------+
| c77552ff-e513-4851-a5e6-2c83d0acb998 | lvm  |
+--------------------------------------+------+
# cinder type-key lvm set volume_backend_name=LVM
# cinder extra-specs-list
+--------------------------------------+-----------+--------------------------------------------+
|                  ID                  |    Name   |                extra_specs                 |
+--------------------------------------+-----------+--------------------------------------------+
...
| c77552ff-e513-4851-a5e6-2c83d0acb998 |    lvm    |      {u'volume_backend_name': u'LVM'}      |
...
+--------------------------------------+-----------+--------------------------------------------+

To reduce network overhead, a backend availability zone is defined for each bare metal machine by adding to /etc/cinder/cinder.conf:

storage_availability_zone=bm0015

and restarting cinder-volume:

# restart cinder-volume
# sleep 5
# cinder-manage host list
host                            zone
...
bm0015.the.re@lvm               bm0015
...

where bm0015 is the hostname of the machine. To create an LVM-backed volume located on bm0015:

# cinder create --availability-zone bm0015 --volume-type lvm --display-name test 1

In order for the allocation of RBD volumes to keep working without specifying an availability zone, there must be at least one cinder-volume service running in the default availability zone (presumably nova) and configured with the expected RBD backends. This can be checked with:

# cinder-manage host list | grep nova
...
bm0017.the.re@rbd-cloudwatt     nova
bm0017.the.re@rbd-ovh           nova
bm0017.the.re@lvm               nova
bm0017.the.re@rbd-default       nova
bm0017.the.re@rbd-hetzner       nova
...

In the above, the lvm volume type is also available in the nova availability zone and is used as a catch-all when an LVM volume is preferred but collocating it on the same machine as the instance does not matter.
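
For instance, a volume created without an availability zone is scheduled in the default nova zone, where bm0017 provides the catch-all lvm backend (the display name is illustrative):

# cinder create --volume-type lvm --display-name scratch 1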

Migrating from ganeti to OpenStack via Ceph

On ganeti, shut down the instance and activate its disks:

z2-8:~# gnt-instance shutdown nerrant
Waiting for job 1089813 for nerrant...
z2-8:~# gnt-instance activate-disks nerrant
z2-8.host.gnt:disk/0:/dev/drbd10

On an OpenStack Havana installation using a Ceph cinder backend, create a volume with the same size:

# cinder create --volume-type ovh --display-name nerrant 10
+---------------------+--------------------------------------+
|       Property      |                Value                 |
+---------------------+--------------------------------------+
|     attachments     |                  []                  |
|  availability_zone  |                 nova                 |
|       bootable      |                false                 |
|      created_at     |      2013-11-12T13:00:39.614541      |
| display_description |                 None                 |
|     display_name    |              nerrant                 |
|          id         | 3ec2035e-ff76-43a9-bbb3-6c003c1c0e16 |
|       metadata      |                  {}                  |
|         size        |                  10                  |
|     snapshot_id     |                 None                 |
|     source_volid    |                 None                 |
|        status       |               creating               |
|     volume_type     |                 ovh                  |
+---------------------+--------------------------------------+
# rbd --pool ovh info volume-3ec2035e-ff76-43a9-bbb3-6c003c1c0e16
rbd image 'volume-3ec2035e-ff76-43a9-bbb3-6c003c1c0e16':
        size 10240 MB in 2560 objects
        order 22 (4096 KB objects)
        block_name_prefix: rbd_data.90f0417089fa
        format: 2
        features: layering

On a host connected to the Ceph cluster and running a Linux kernel newer than 3.8 (because of the format: 2 above), map the volume to a block device with:

# rbd map --pool ovh volume-3ec2035e-ff76-43a9-bbb3-6c003c1c0e16
# rbd showmapped
id pool image                                       snap device
1  ovh  volume-3ec2035e-ff76-43a9-bbb3-6c003c1c0e16 -    /dev/rbd1

Copy the ganeti volume with:

z2-8:~# pv < /dev/drbd10 | ssh bm0014 dd of=/dev/rbd1
2,29GB 0:09:14 [4,23MB/s] [==========================>      ] 22% ETA 0:31:09
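
Once the transfer completes, the copy can optionally be verified before unmapping by comparing checksums of the two devices (a sketch; this reads both devices in full and can take a while):

z2-8:~# md5sum /dev/drbd10
bm0014:~# md5sum /dev/rbd1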

Then unmap the device:

# rbd unmap /dev/rbd1

The volume is ready to boot.
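
The instance can then be recreated in OpenStack by booting from that volume, for instance (flavor and instance name are illustrative):

# nova boot --flavor m1.small --block-device-mapping vda=3ec2035e-ff76-43a9-bbb3-6c003c1c0e16:::0 nerrant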

Collocating Ceph volumes and instances in a multi-datacenter setup

OpenStack Havana is installed on machines rented from OVH and Hetzner. An aggregate is created for machines hosted at OVH and another for machines hosted at Hetzner. A Ceph cluster is created with a pool using disks from OVH and another pool using disks from Hetzner. A cinder backend is created for each Ceph pool. From the dashboard, an instance can be created in the OVH availability zone using a Ceph volume provided by the matching OVH pool.
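
A minimal sketch of the nova side (aggregate and zone names are illustrative, host assignments are assumptions, and some client versions want the aggregate id rather than its name):

# nova aggregate-create ovh ovh
# nova aggregate-add-host ovh bm0014.the.re
# nova aggregate-create hetzner hetzner
# nova aggregate-add-host hetzner bm0017.the.re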

Continue reading “Collocating Ceph volumes and instances in a multi-datacenter setup”

Fragmented floating IP pools and multiple AS hack

When an OpenStack Havana cluster is deployed on hardware rented from OVH and Hetzner, IPv4 addresses are rented by the month and are either isolated (just one IP, not a proper subnet) or come as a collection of disjoint subnets of various sizes.

91.121.254.238/32
188.165.144.248/30
...

OpenStack does not provide a way to deal with this situation, so a hack involving a double NAT based on a subnet of floating IPs is proposed.
An L3 agent runs on an OVH machine and pretends that 10.88.15.0/24 is a subnet of floating IPs, although these addresses are not publicly routable. Another L3 agent is set up on a Hetzner machine and uses the 10.88.16.0/24 subnet.
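
A sketch of how such a fake external network could be declared on the OVH side (names are illustrative; the gateway matches the address given to br-ex in the script below):

# neutron net-create ext-ovh --router:external=True
# neutron subnet-create --name ext-ovh-subnet --disable-dhcp --gateway 10.88.15.1 --allocation-pool start=10.88.15.3,end=10.88.15.254 ext-ovh 10.88.15.0/24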
When an instance is created, it may choose a Hetzner private subnet, which is connected to a Hetzner router whose gateway is set to a network providing the Hetzner floating IPs. The same is done for OVH.
A few floating IPs are rented from OVH and Hetzner. On the host running the L3 agent dedicated to the OVH AS, a one-to-one NAT is established between each IP in the 10.88.15.0/24 subnet and the OVH floating IPs. For instance, the following /etc/init/nat.conf upstart script associates 10.88.15.3 with the floating IP 91.121.254.238.

description "OVH nat hack"

start on started neutron-l3-agent

script
  # masquerade everything leaving via eth0 by default
  iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
  # give the external bridge the gateway address of the fake floating IP subnet
  ip addr add 10.88.15.1/24 dev br-ex
  # establish a one-to-one NAT between each fake floating IP and its public counterpart
  while read private public ; do
    test "$public" || continue
    iptables -t nat -A POSTROUTING -s $private/32 -j SNAT --to-source $public
    iptables -t nat -A PREROUTING -d $public/32 -j DNAT --to-destination $private
  done <<EOF
10.88.15.3      91.121.254.238
EOF
end script
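
Once the job has run, the resulting NAT rules can be checked with something like:

# iptables -t nat -S | grep 10.88.15.3
-A PREROUTING -d 91.121.254.238/32 -j DNAT --to-destination 10.88.15.3
-A POSTROUTING -s 10.88.15.3/32 -j SNAT --to-source 91.121.254.238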

Continue reading “Fragmented floating IP pools and multiple AS hack”