Merging Ceph placement group logs

Ceph stores objects in pools, which are divided into placement groups. When an object is created, modified, or deleted, the placement group logs the operation. If an OSD supporting the placement group is temporarily unavailable, the logs are used for recovery when it comes back.

The left column shows the log entries, ordered oldest first. Each entry is identified by an epoch and a version: for instance, 1,4 is epoch 1, version 4. The tail, displayed on the left, points to the oldest entry of the log, and the head points to the most recent. The right column shows the entries of olog: they are authoritative and will be merged into log, which may require discarding divergent entries. Every log entry relates to an object, which is represented by its hash in the middle column: x5, x7, x9.
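The ordering described above can be sketched in a few lines of Python. This is only an illustration, not the Ceph data structure: the `Entry` type, the `op` labels, and the sample entries are assumptions made for the example. The point is that an (epoch, version) pair compares lexicographically, so (2,3) is more recent than (1,4).

```python
from typing import NamedTuple

class Entry(NamedTuple):
    epoch: int
    version: int
    obj: str   # object hash, e.g. "x5"
    op: str    # hypothetical operation label: "modify" or "delete"

def eversion(entry):
    """Return the (epoch, version) pair that orders an entry."""
    return (entry.epoch, entry.version)

def is_ordered(entries):
    """True if the entries are strictly ordered, oldest first."""
    keys = [eversion(e) for e in entries]
    return all(a < b for a, b in zip(keys, keys[1:]))

# Hypothetical log: three entries, oldest first.
log = [
    Entry(1, 1, "x5", "modify"),
    Entry(1, 2, "x7", "modify"),
    Entry(1, 4, "x9", "modify"),
]

tail = eversion(log[0])    # oldest entry
head = eversion(log[-1])   # most recent entry

# Tuples compare epoch first, then version.
newer = (2, 3) > (1, 4)
```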

The use cases implemented by PGLog::merge_log are demonstrated by individual test cases.

Extending the tail of the log

Before extending the tail

After extending the tail

See the test case for more information.
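As a rough sketch of the tail extension, assuming it amounts to prepending the olog entries that are older than the tail of log: the function name `extend_tail` and the sample entries are hypothetical and only mimic the idea, not the actual PGLog::merge_log code.

```python
def extend_tail(log, olog):
    """Prepend to log the olog entries older than log's tail.

    Both arguments are lists of (epoch, version, object_hash) tuples,
    ordered oldest first. Simplified model, not the Ceph implementation.
    """
    if not log:
        return list(olog)
    log_tail = log[0][:2]
    older = [e for e in olog if e[:2] < log_tail]
    return older + log

# Hypothetical example: olog reaches further back in time than log.
log = [(1, 4, "x7"), (1, 5, "x9")]
olog = [(1, 1, "x5"), (1, 4, "x7"), (1, 5, "x9")]

extended = extend_tail(log, olog)
```

After the call, the tail of `extended` is the tail of olog, (1,1), and the entries log already had are unchanged.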

Extending the head of the log

The log entry (1,3) deletes the object x9, but the olog entry (2,3) modifies it and is authoritative: the log entry (1,3) is therefore divergent.
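The following sketch models this case under simplifying assumptions: an entry with the same (epoch, version) in both logs is taken to be the same entry, and everything in log after the most recent shared entry is treated as divergent. The function `merge_head` and the entries other than (1,3) and (2,3) are hypothetical.

```python
def merge_head(log, olog):
    """Return (merged, divergent) for an olog that extends past log's head.

    Entries are (epoch, version, object_hash, op) tuples, oldest first.
    Simplified model, not the Ceph implementation.
    """
    log_versions = {e[:2] for e in log}
    shared = [e[:2] for e in olog if e[:2] in log_versions]
    if not shared:
        raise ValueError("logs share no entry")
    last_shared = max(shared)
    kept = [e for e in log if e[:2] <= last_shared]
    divergent = [e for e in log if e[:2] > last_shared]
    new = [e for e in olog if e[:2] > last_shared]
    return kept + new, divergent

log = [
    (1, 1, "x5", "modify"),
    (1, 2, "x7", "modify"),
    (1, 3, "x9", "delete"),   # divergent: olog disagrees about x9
]
olog = [
    (1, 1, "x5", "modify"),
    (1, 2, "x7", "modify"),
    (2, 3, "x9", "modify"),   # authoritative
]

merged, divergent = merge_head(log, olog)
```

Here `merged` ends with the authoritative entry (2,3) and `divergent` contains only the deletion (1,3).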

Extending the head with some divergent entries

See the test case for more information.

Divergent head

The head of log, entry (1,5), is divergent because it is more recent than the head of olog.
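A minimal sketch of this case, assuming that every log entry more recent than the head of olog is divergent and is removed from log. The function name `rewind_divergent` and the entries other than (1,5) are hypothetical.

```python
def rewind_divergent(log, olog_head):
    """Split log into entries to keep and divergent entries.

    log is a list of (epoch, version, object_hash) tuples, oldest first;
    olog_head is the (epoch, version) of the authoritative head.
    Simplified model, not the Ceph implementation.
    """
    kept = [e for e in log if e[:2] <= olog_head]
    divergent = [e for e in log if e[:2] > olog_head]
    return kept, divergent

log = [(1, 1, "x5"), (1, 4, "x7"), (1, 5, "x9")]
olog = [(1, 1, "x5"), (1, 4, "x7")]

olog_head = olog[-1][:2]
kept, divergent = rewind_divergent(log, olog_head)
```

After the call, the head of `kept` matches the head of olog, (1,4), and (1,5) is reported as divergent.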

Divergent head

See the test case for more information.

Disjoint logs

If the logs do not overlap, an exception is thrown.
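The overlap condition can be sketched as a check on the tails and heads of both logs. The exception class `DisjointLogs` and the function `check_overlap` are names invented for this example; the sketch only assumes that two logs overlap when neither ends before the other begins.

```python
class DisjointLogs(Exception):
    """Hypothetical exception raised when two logs share no range."""

def check_overlap(log_tail, log_head, olog_tail, olog_head):
    """Raise DisjointLogs unless the two (tail, head) ranges overlap.

    Each argument is an (epoch, version) tuple.
    """
    if log_head < olog_tail or olog_head < log_tail:
        raise DisjointLogs(
            f"log {log_tail}..{log_head} and olog {olog_tail}..{olog_head} "
            "do not overlap"
        )

# Overlapping ranges: no exception.
check_overlap((1, 1), (1, 4), (1, 3), (2, 3))

# Disjoint ranges: log ends at (1,2), olog starts at (2,1).
try:
    check_overlap((1, 1), (1, 2), (2, 1), (2, 3))
    disjoint_detected = False
except DisjointLogs:
    disjoint_detected = True
```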

Disjoint logs

See the test case for more information.