SSE optimization for erasure code in Ceph

The jerasure library is the default erasure code plugin of Ceph. The gf-complete companion library supports SSE optimizations at compile time, when the compiler provides them (-msse4.2 etc.). The jerasure plugin (and gf-complete with it) is compiled multiple times, each time with a different level of SSE features:

  • jerasure_sse4 uses SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, SSE
  • jerasure_sse3 uses SSSE3, SSE3, SSE2, SSE
  • jerasure_generic uses no SSE instructions

When an OSD loads the jerasure plugin, the CPU features are probed and the appropriate plugin is selected depending on their availability.
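
To make the selection step concrete, here is a minimal sketch in C. It is not the actual Ceph loader code: the struct cpu_features type and the two function names are hypothetical, and the rule that a variant is picked only when every feature in its list is present is an assumption. The probe relies on the GCC/Clang builtins __builtin_cpu_init and __builtin_cpu_supports, which are only available on x86.

```c
#include <stdbool.h>

/* Hypothetical container for the probed features (not a Ceph type). */
struct cpu_features {
    bool sse, sse2, sse3, ssse3, sse4_1, sse4_2;
};

/* Pick the most capable variant whose whole feature list is available.
 * Assumption: a variant needs every feature it was compiled with. */
const char *select_jerasure_variant(const struct cpu_features *f)
{
    bool sse3_level = f->sse && f->sse2 && f->sse3 && f->ssse3;

    if (sse3_level && f->sse4_1 && f->sse4_2)
        return "jerasure_sse4";
    if (sse3_level)
        return "jerasure_sse3";
    return "jerasure_generic";
}

/* Probe the running CPU. On non-x86 targets every flag stays false,
 * so the generic variant is selected. */
struct cpu_features probe_cpu_features(void)
{
    struct cpu_features f = {0};
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    f.sse    = __builtin_cpu_supports("sse") != 0;
    f.sse2   = __builtin_cpu_supports("sse2") != 0;
    f.sse3   = __builtin_cpu_supports("sse3") != 0;
    f.ssse3  = __builtin_cpu_supports("ssse3") != 0;
    f.sse4_1 = __builtin_cpu_supports("sse4.1") != 0;
    f.sse4_2 = __builtin_cpu_supports("sse4.2") != 0;
#endif
    return f;
}
```

Under that assumption, a CPU missing any one feature of a level simply falls back to the next lower variant, down to jerasure_generic.
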
The gf-complete source code is cleanly divided into functions that take advantage of specific SSE features. It should be easy to use the ifunc attribute to semi-manually select each function individually, at runtime and without performance penalty (because the choice is made once, when the function is first resolved, at the latest on its first call, and recorded for later calls). With such fine-grained selection, there would be no need to compile three plugins because each function would be compiled with exactly the set of flags it needs.
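
As a rough illustration of the ifunc idea, the sketch below dispatches a single function between a generic and an SSE2 implementation. It assumes GCC or Clang on Linux/ELF with glibc. The region_xor function and its helpers are hypothetical stand-ins, not gf-complete functions. The target attribute compiles one function with SSE2 enabled without passing -msse2 for the whole file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel: XOR src into dst, one byte at a time. */
static void region_xor_generic(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
}

#if defined(__x86_64__) || defined(__i386__)
#include <emmintrin.h>

/* Same kernel, 16 bytes at a time; only this function is built with SSE2. */
__attribute__((target("sse2")))
static void region_xor_sse2(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_xor_si128(a, b));
    }
    for (; i < n; i++)          /* leftover bytes */
        dst[i] ^= src[i];
}
#endif

typedef void region_xor_fn(uint8_t *, const uint8_t *, size_t);

/* Resolver: run once by the dynamic linker to bind region_xor. */
static region_xor_fn *resolve_region_xor(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();       /* required before constructors have run */
    if (__builtin_cpu_supports("sse2"))
        return region_xor_sse2;
#endif
    return region_xor_generic;
}

/* Public symbol: callers see an ordinary function. */
void region_xor(uint8_t *dst, const uint8_t *src, size_t n)
    __attribute__((ifunc("resolve_region_xor")));
```

Callers just call region_xor(dst, src, n). After the resolver has run, calls go straight to the chosen implementation with no per-call feature test.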

2 Replies to “SSE optimization for erasure code in Ceph”

  1. Hi,
    are all SSE features used, or what happens if one is missing?
    Because the FX-8350 has “sse sse2 ssse3 sse4_1 sse4_2 sse4a” but no sse3.
