ceph internals : buffer lists

The ceph buffers are used to process data in memory. For instance, when a FileStore handles an OP_WRITE transaction it writes a list of buffers to disk.

                                             +---------+
                                             | +-----+ |
    list              ptr                    | |     | |
 +----------+       +-----+                  | |     | |
 | append_  >------->     >-------------------->     | |
 |  buffer  |       +-----+                  | |     | |
 +----------+                        ptr     | |     | |
 |   _len   |      list            +-----+   | |     | |
 +----------+    +------+     ,--->+     >----->     | |
 | _buffers >---->      >-----     +-----+   | +-----+ |
 +----------+    +----^-+     \      ptr     |   raw   |
 |  last_p  |        /         `-->+-----+   | +-----+ |
 +--------+-+       /              +     >----->     | |
          |       ,-          ,--->+-----+   | |     | |
          |      /        ,---               | |     | |
          |     /     ,---                   | |     | |
        +-v--+-^--+--^+-------+              | |     | |
        | bl | ls | p | p_off >--------------->|     | |
        +----+----+-----+-----+              | +-----+ |
        |               | off >------------->|   raw   |
        +---------------+-----+              |         |
              iterator                       +---------+

The actual data is stored in opaque buffer::raw objects, which are accessed through a buffer::ptr. A buffer::list is a sequence of buffer::ptr that can be used as if it were one contiguous data area, although it may be spread over many buffer::raw containers, as represented by the rectangle enclosing the two buffer::raw objects in the drawing above. A buffer::list::iterator can be used to walk each character of the buffer::list as follows:

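  bufferlist bl;
  bl.append("ABC", 3);            // the list now holds the three bytes 'A', 'B', 'C'
  {
    bufferlist::iterator i(&bl);  // starts at offset 0, on 'A'
    ++i;                          // advance by one character
    EXPECT_EQ('B', *i);           // gtest assertion: the iterator is now on 'B'
  }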
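To make the roles of p, p_off and off in the drawing more concrete, here is a minimal, self-contained sketch of an iterator walking a buffer made of several segments. It is not the Ceph implementation: SegmentedBuffer and its Iterator are hypothetical names, and each std::string merely stands in for a buffer::ptr pointing into a buffer::raw.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for buffer::list (not the Ceph class): a sequence of
// segments that is read as if it were one contiguous area.
class SegmentedBuffer {
public:
  // Mirrors the iterator fields in the drawing: p selects the current
  // segment, p_off is the offset inside it, off is the offset in the list.
  class Iterator {
  public:
    explicit Iterator(const SegmentedBuffer* bl) : bl_(bl) {}

    bool end() const { return off_ == bl_->length_; }

    char operator*() const {
      assert(!end());
      return bl_->segments_[p_][p_off_];
    }

    Iterator& operator++() {
      assert(!end());
      ++off_;
      // Move on to the next segment once the current one is exhausted.
      if (++p_off_ == bl_->segments_[p_].size()) {
        ++p_;
        p_off_ = 0;
      }
      return *this;
    }

  private:
    const SegmentedBuffer* bl_;
    std::size_t p_ = 0;      // index of the current segment
    std::size_t p_off_ = 0;  // offset within the current segment
    std::size_t off_ = 0;    // offset within the whole buffer
  };

  // Each non-empty append adds one segment; nothing is copied together.
  void append(const std::string& data) {
    if (!data.empty()) {
      segments_.push_back(data);
      length_ += data.size();
    }
  }

  std::size_t length() const { return length_; }

private:
  std::vector<std::string> segments_;
  std::size_t length_ = 0;
};
```

Appending "AB" and then "C" yields two segments, yet an Iterator built on the buffer visits 'A', 'B' and 'C' in order, crossing the segment boundary transparently.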


Upstream University at the OpenStack summit

What if contributing to OpenStack was made a lot easier by a few days of training? You could get this training at Upstream University, which was created shortly after the OpenStack design summit, in April 2012, with the sole goal of improving developers’ contribution skills. Upstream University has since coached new OpenStack contributors from eNovance and Cloudwatt, as well as developers working on the Linux kernel and many other projects.
To celebrate its first year, Upstream University is organizing a session in advance of the next OpenStack summit, in Portland. If you can fly in two days ahead of the event to spend the weekend improving your OpenStack contribution skills, please consider submitting an application to attend the workshop. This is a one-time offer for free training.

Chaining extended attributes in ceph

Ceph uses extended file attributes to store file metadata as a list of key / value pairs. Some file system implementations do not allow storing more than 2048 characters in the value associated with a key. To overcome this limitation, Ceph implements chained extended attributes.
A value that is 5120 characters long will be stored in three separate attributes:

  • user.key : first 2048 characters
  • user.key@1 : next 2048 characters
  • user.key@2 : last 1024 characters

The proposed unit tests may be used as documentation describing in detail how it is implemented, from the caller's point of view.
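The splitting rule can be sketched in a few lines. This is only an illustration of the naming scheme described above, not the Ceph code: chain_attribute is a hypothetical helper, and the 2048 character limit is taken from the example rather than from any particular file system.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Ceph): split `value` into chunks of at
// most `chunk_size` characters, named key, key@1, key@2, ...
std::vector<std::pair<std::string, std::string>>
chain_attribute(const std::string& key, const std::string& value,
                std::size_t chunk_size = 2048) {
  assert(chunk_size > 0);
  std::vector<std::pair<std::string, std::string>> chunks;
  std::size_t offset = 0;
  std::size_t index = 0;
  // do/while so that an empty value still produces the base attribute.
  do {
    std::string name =
        index == 0 ? key : key + "@" + std::to_string(index);
    chunks.emplace_back(name, value.substr(offset, chunk_size));
    offset += chunk_size;
    ++index;
  } while (offset < value.size());
  return chunks;
}
```

Called with the key user.key and a 5120 character value, it returns three entries named user.key, user.key@1 and user.key@2, holding 2048, 2048 and 1024 characters respectively.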