Ceph erasure code : ready for alpha testing

The addition of erasure code in Ceph started in April 2013 and was discussed during the first Ceph Developer Summit. The implementation reached an important milestone a few days ago and is now ready for alpha testing.
For the record, here is the simplest way to store and retrieve an object in an erasure coded pool as of today:

parameters="erasure-code-k=2 erasure-code-m=1"
./ceph osd crush rule create-erasure ecruleset \
  $parameters \
  erasure-code-ruleset-failure-domain=osd
./ceph osd pool create ecpool 12 12 erasure \
  crush_ruleset=ecruleset \
  $parameters
./rados --pool ecpool put SOMETHING /etc/group
./rados --pool ecpool get SOMETHING /tmp/group
tail -3 /tmp/group
postfix:x:133:
postdrop:x:134:
_cvsadmin:x:135:

The object is split into chunks that are stored as three objects, one per OSD (two data chunks and one coding chunk, since k=2 and m=1). It can be reconstructed if any one of them is lost.

find dev | grep SOMETHING
dev/osd4/current/3.7s0_head/SOMETHING__head_847441D7__3_ffffffffffffffff_0
dev/osd6/current/3.7s1_head/SOMETHING__head_847441D7__3_ffffffffffffffff_1
dev/osd9/current/3.7s2_head/SOMETHING__head_847441D7__3_ffffffffffffffff_2
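
To give an intuition for why losing any one of the three chunks is harmless, here is a minimal sketch of the simplest scheme with that property: two data chunks plus one XOR parity chunk. It is an illustration only. The chunk values are made up, the variable names are hypothetical, and it is not meant to describe what the Ceph erasure code plugin does internally.

```shell
# Illustration only: single-parity erasure coding with k=2, m=1 using XOR.
# The values below are made up; this is not Ceph's implementation.

# Two data chunks (k=2), each written as a 3-byte hex value
chunk0=$(( 0x436570 ))   # the bytes "Cep"
chunk1=$(( 0x68210a ))   # the bytes "h!\n"

# One coding chunk (m=1): the bitwise XOR of the two data chunks
chunk2=$(( $chunk0 ^ $chunk1 ))

# Pretend the OSD holding chunk1 is lost: rebuild it from the two survivors
rebuilt1=$(( $chunk0 ^ $chunk2 ))

printf 'chunk1   = %06x\n' "$chunk1"
printf 'rebuilt1 = %06x\n' "$rebuilt1"
```

Both lines print 68210a. The same XOR rebuilds chunk0 from chunk1 and chunk2, and the coding chunk itself can be recomputed from the two data chunks, so any single loss is recoverable.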

6 Replies to “Ceph erasure code : ready for alpha testing”

Will the erasure code decrease the performance of Ceph compared to the current multi-copy method? Thank you

Loic Dachary says:

March 13, 2014 at 12:09 pm

Yes, the erasure code pools are going to be significantly slower than the replicated pools. It is one of the tradeoffs: slower storage using less space versus faster storage using more space.

Why are the erasure code pools slower than replicated pools?

Loic Dachary says:

December 21, 2014 at 7:29 pm

Because they require more CPU and more cluster network bandwidth, mainly.

Per Simonsen says:

January 15, 2015 at 4:29 pm

Ok, thank you for your answer!

Will the replicated pools be faster even if the erasure code is systematic, and there are only reads (no recovery activity or writes)?

We used rados to benchmark an erasure-coded pool and a replicated pool, and there was no big performance difference in the sequential read and write tests. But when we tested erasure code + cache tier through CephFS, it was apparently slower than the replicated pools.
