The host bm0002.the.re becomes unavailable because of a partial disk failure, on an Essex-based OpenStack cluster using LVM-based volumes and multi-host nova-network. The host had daily backups: / was saved with rsync and each LV was copied and compressed. Although the disk is failing badly, the host is not down and some reads can still be done. The nova services are shut down, the host is disabled using nova-manage, and an attempt is made to recover from the partially damaged disks and LVs, when that leads to better results than reverting to yesterday’s backup.
Continue reading “Disaster recovery on host failure in OpenStack”
Minimal DNS spoofing daemon
When running tests in a controlled environment, it should be possible to spoof the domain names. For instance foo.com could be mapped to slow.novalocal, an OpenStack instance responding very slowly to simulate timeouts. A Twisted-based spoofing DNS reverse proxy is implemented to transparently resolve domain names with the IP addresses of other domain names, using a python hash table such as:
fqdn2fqdn = {
    'foo.com': 'foo.me',
    'bar.com': 'bar.me',
}
It will map foo.com to foo.me as follows:
$ sudo python dns_spoof.py 8.8.8.8 &
$ ping -c 1 foo.me
PING foo.me (91.185.200.115) 56(84) bytes of data.
64 bytes from 91.185.200.115: icmp_req=1 ttl=47 time=42.2 ms

--- foo.me ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 42.268/42.268/42.268/0.000 ms
$ ping -c 1 foo.com
PING foo.com (91.185.200.115) 56(84) bytes of data.
64 bytes from 91.185.200.115: icmp_req=1 ttl=47 time=42.2 ms

--- foo.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 42.290/42.290/42.290/0.000 ms
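The name-mapping logic at the heart of such a proxy can be sketched as follows (hypothetical helper names, not the actual dns_spoof.py): the query name is translated before asking the upstream resolver, and the upstream answers are re-labeled with the name the client asked for.

```python
# Sketch of the proxy's rewrite logic, independent of any DNS library.

fqdn2fqdn = {'foo.com': 'foo.me', 'bar.com': 'bar.me'}

def rewrite_query(qname, mapping):
    """Name to actually resolve upstream (unchanged if not mapped)."""
    return mapping.get(qname, qname)

def rewrite_answers(original_qname, answers):
    """Re-label (name, ip) answers with the original query name."""
    return [(original_qname, ip) for (_name, ip) in answers]

upstream = rewrite_query('foo.com', fqdn2fqdn)   # resolves as foo.me
answers = rewrite_answers('foo.com', [(upstream, '91.185.200.115')])
# the client sees foo.me's address under the name foo.com
```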
Update May 10, 2013: an easier solution is to configure your BIND resolvers to lie using Response Policy Zones (RPZ). Thanks to S. Bortzmeyer for pointing in the right direction.
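The RPZ approach can be sketched as a named.conf fragment (the zone name and file name below are examples, not from the original post):

```
options {
    // queries matching the policy zone are rewritten
    response-policy { zone "rpz.example"; };
};
zone "rpz.example" {
    type master;
    file "rpz.example.db";
};
```

In rpz.example.db, a record such as foo.com CNAME foo.me. makes the resolver answer queries for foo.com with the data of foo.me.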
Continue reading “Minimal DNS spoofing daemon”
nova-network debugging tips
A single machine is installed with OpenStack Folsom on Debian GNU/Linux. Four instances are created, and it turns out that nova-network is configured with the wrong public interface. It can be fixed without shutting down the instances:
nova suspend target1
The instance is suspended to disk (as if it were a laptop) and the corresponding KVM process is killed. While the instance is suspended, nova-network can be stopped:
/etc/init.d/nova-network stop
The source of the problem was a typo in the public interface, leading to an incorrect VLAN interface:
13: vlan100@eth2: mtu 1500 qdisc noqueue state DOWN mode DEFAULT
    link/ether fa:16:3e:54:5b:57 brd ff:ff:ff:ff:ff:ff
It can be fixed in the /etc/nova/nova.conf configuration file at the line:
public_interface = eth3
The incorrect VLAN interface is manually deleted and nova-network can be restarted. The instance is then resumed with
nova resume target1
and nova-network will automatically re-create the VLAN interface.
Continue reading “nova-network debugging tips”
ceph internals : buffer lists
The ceph buffers are used to process data in memory. For instance, when a FileStore handles an OP_WRITE transaction, it writes a list of buffers to disk.
[Diagram: a buffer::list holds _buffers (a list of buffer::ptr), _len, append_buffer and last_p; each buffer::ptr points into a buffer::raw, and a buffer::list::iterator tracks bl, ls, p, p_off and off.]
The actual data is stored in buffer::raw opaque objects. They are accessed through a buffer::ptr. A buffer::list is a sequential list of buffer::ptr which can be used as if it was a contiguous data area although it can be spread over many buffer::raw containers, as represented by the rectangle enclosing the two buffer::raw objects in the above drawing. The buffer::list::iterator can be used to walk each character of the buffer::list as follows:
bufferlist bl;
bl.append("ABC", 3);
{
  bufferlist::iterator i(&bl);
  ++i;
  EXPECT_EQ('B', *i);
}
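The structure can be modeled with a toy Python sketch (illustration only, not ceph code): raw storage areas, pointers into them, and an iterator that walks characters across non-contiguous raws as if they were one contiguous area.

```python
class Raw:                       # plays the role of buffer::raw
    def __init__(self, data):
        self.data = bytearray(data)

class Ptr:                       # plays the role of buffer::ptr
    def __init__(self, raw, off, length):
        self.raw, self.off, self.length = raw, off, length
    def __getitem__(self, i):
        return self.raw.data[self.off + i]

class BufferList:                # plays the role of buffer::list
    def __init__(self):
        self.buffers = []        # Ptr entries, possibly over many Raws
    def append(self, data):
        raw = Raw(data)
        self.buffers.append(Ptr(raw, 0, len(raw.data)))
    def __iter__(self):          # walk each character across all ptrs
        for ptr in self.buffers:
            for i in range(ptr.length):
                yield chr(ptr[i])

bl = BufferList()
bl.append(b"AB")                 # two separate raw containers...
bl.append(b"C")
assert ''.join(bl) == "ABC"      # ...read as one contiguous area
```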
Upstream University at the OpenStack summit
What if contributing to OpenStack was made a lot easier by a few days of training? You could get this training at Upstream University, which was created shortly after the OpenStack design summit in April 2012, with the sole goal of improving developers’ contribution skills. Upstream University has since coached new contributors from eNovance and Cloudwatt for OpenStack, developers for the Linux kernel, and many others.
To celebrate its first year, Upstream University is organizing a session in advance of the next OpenStack summit, in Portland. If you can fly in two days ahead of the event to spend the weekend improving your OpenStack contribution skills, please consider submitting an application to attend the workshop. This is a one-time offer for free training.
Continue reading “Upstream University at the OpenStack summit”
Chaining extended attributes in ceph
Ceph uses extended file attributes to store file metadata as a list of key / value pairs. Some file system implementations do not allow storing more than 2048 characters in the value associated with a key. To overcome this limitation, Ceph implements chained extended attributes.
A value that is 5120 characters long will be stored in three separate attributes:
- user.key : first 2048 characters
- user.key@1 : next 2048 characters
- user.key@2 : last 1024 characters
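The chunking scheme above can be sketched in Python (hypothetical helpers, not the actual ceph implementation):

```python
CHUNK = 2048  # maximum characters per attribute value

def chain_xattrs(key, value):
    """Split value into attributes named key, key@1, key@2, ..."""
    chunks = [value[i:i + CHUNK] for i in range(0, len(value), CHUNK)] or ['']
    names = [key] + ['%s@%d' % (key, n) for n in range(1, len(chunks))]
    return dict(zip(names, chunks))

def unchain_xattrs(key, attrs):
    """Reassemble the original value from the chained attributes."""
    value, n = attrs[key], 1
    while '%s@%d' % (key, n) in attrs:
        value += attrs['%s@%d' % (key, n)]
        n += 1
    return value

attrs = chain_xattrs('user.key', 'x' * 5120)
# user.key holds 2048 characters, user.key@1 the next 2048,
# user.key@2 the last 1024
```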
The proposed unit tests may be used as documentation describing in detail how it is implemented, from the caller’s point of view.
Continue reading “Chaining extended attributes in ceph”
unit testing ceph : the Throttle.cc example
The throttle implementation for ceph can be unit tested using threads when it needs to block. The gtest framework produces coverage information that is fed to lcov, showing that 100% of the lines of code are covered.
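The technique can be sketched in Python (illustration only; ceph’s Throttle and its tests are C++): a helper thread blocks on the throttle, the test asserts it is still blocked, then unblocks it and asserts it ran.

```python
import threading
import time

class Throttle:
    """Toy throttle: get() blocks while the budget is exhausted."""
    def __init__(self, maximum):
        self.maximum = maximum
        self.count = 0
        self.cond = threading.Condition()
    def get(self, n):
        with self.cond:
            while self.count + n > self.maximum:
                self.cond.wait()
            self.count += n
    def put(self, n):
        with self.cond:
            self.count -= n
            self.cond.notify_all()

throttle = Throttle(1)
throttle.get(1)                  # fill the throttle
unblocked = []

def taker():                     # will block until the main thread puts
    throttle.get(1)
    unblocked.append(True)

t = threading.Thread(target=taker)
t.start()
time.sleep(0.1)
assert not unblocked             # the thread is still blocked
throttle.put(1)                  # release one unit: the thread wakes up
t.join()
assert unblocked                 # the blocked get() completed
```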
Continue reading “unit testing ceph : the Throttle.cc example”
ceph code coverage (part 2/2)
WARNING: teuthology has changed significantly, the instructions won’t work anymore.
When running ceph integration tests with teuthology, code coverage reports show which lines of code were involved. Adding coverage: true to the integration task and using code compiled for code coverage instrumentation with flavor: gcov collects coverage data. lcov is then used
./virtualenv/bin/teuthology-coverage -v --html-output /tmp/html ...
to create an HTML report. It shows that lines 217 and 218 of mon/Monitor.cc are not being used by the scenario.
Installing OpenStack Folsom on Debian GNU/Linux wheezy
Installing and testing OpenStack Folsom on a virgin Debian GNU/Linux wheezy takes less than one hour. A set of packages is archived to make sure it keeps working. After checking the prerequisites, such as a public and a private interface, the packages are installed and the debconf questions answered as instructed.
The networks must then be created with
nova-manage network create private --fixed_range_v4=10.20.0.0/16 \ --network_size=256 --num_networks=2 --vlan=100
/etc/nova/nova.conf is updated to set vlan_interface=dummy0, public_interface=eth0 and fixed_range=10.20.0.0/16. /etc/nova/nova-compute.conf is updated to use LibvirtBridgeDriver and an instance can be booted with:
nova boot --poll --flavor m1.tiny --image cirrOS-0.3.0-x86_64 \ --key_name loic test
Continue reading “Installing OpenStack Folsom on Debian GNU/Linux wheezy”
ceph code coverage (part 1/2)
The ceph sources are compiled with code coverage enabled
root@ceph:/srv/ceph# ./configure --with-debug CFLAGS='-g' CXXFLAGS='-g' \ --enable-coverage \ --disable-silent-rules
and the tests are run
cd src ; make check-coverage
to create the HTML report which shows where tests could improve code coverage:
Continue reading “ceph code coverage (part 1/2)”