Computation can be co-located on the machine where a Ceph object resides, so that it accesses the object from the local disk instead of going through the network. Noah Watkins explains this in great detail, and it can be tried out with a Hello World example that calls the hello plugin included in the Emperor release.
Continue reading “Hadoop like computing with Ceph”
Organization mapping and Reviewed-by statistics with git
git shortlog is convenient for printing a leaderboard of contributions. For instance, to display the top ten committers of Ceph over the past year:
$ git shortlog --since='1 year' --no-merges -nes | nl | head -10
 1  1890  Sage Weil <sage@inktank.com>
 2   805  Danny Al-Gaaf <danny.al-gaaf@bisect.de>
 3   491  Samuel Just <sam.just@inktank.com>
 4   462  Yehuda Sadeh <yehuda@inktank.com>
 5   443  John Wilkins <john.wilkins@inktank.com>
 6   303  Greg Farnum <greg@inktank.com>
 7   288  Dan Mick <dan.mick@inktank.com>
 8   274  Loic Dachary <loic@dachary.org>
 9   219  Yan, Zheng <zheng.z.yan@intel.com>
10   214  João Eduardo Luís <joao.luis@inktank.com>
To get the same output for reviewers over the past year, assuming the Reviewed-by is set consistently in the commit messages, the following can be used:
$ git log --since='1 year' --pretty=%b | \
  perl -n -e 'print "$_\n" if(s/^\s*Reviewed-by:\s*(.*<.*>)\s*$/\1/)' | \
  git check-mailmap --stdin | \
  sort | uniq -c | sort -rn | nl | head -10
 1   652  Sage Weil <sage@inktank.com>
 2   265  Greg Farnum <greg@inktank.com>
 3   185  Samuel Just <sam.just@inktank.com>
 4   106  Josh Durgin <josh.durgin@inktank.com>
 5    95  João Eduardo Luís <joao.luis@inktank.com>
 6    95  Dan Mick <dan.mick@inktank.com>
 7    69  Yehuda Sadeh <yehuda@inktank.com>
 8    46  David Zafman <david.zafman@inktank.com>
 9    36  Loic Dachary <loic@dachary.org>
10    21  Gary Lowell <gary.lowell@inktank.com>
The body of each commit message ( --pretty=%b ) is displayed for commits from the past year ( --since='1 year' ). perl reads each line and prints nothing ( -n ) unless it finds a Reviewed-by: string followed by what looks like First Last <mail@dot.com> ( ^\s*Reviewed-by:\s*(.*<.*>)\s*$ ). The names found are then remapped using the .mailmap file to fix typos ( git check-mailmap --stdin ).
The authors can further be remapped to the organization with which they are affiliated using the .organizationmap file. It has the same format as the .mailmap file but maps normalized author names to organization names, and is applied with git -c mailmap.file=.organizationmap check-mailmap --stdin:
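The filter step can be tried in isolation on sample text. The following is a minimal sketch that swaps sed in for perl (an alternative, not the command used above) and wraps it in a function; the name extract_reviewers is hypothetical.

```shell
# Print only the "Name <email>" part of Reviewed-by: lines read from stdin.
# Lines that do not match the pattern are dropped (-n plus the p flag).
extract_reviewers() {
    sed -n 's/^[[:space:]]*Reviewed-by:[[:space:]]*\(.*<.*>\)[[:space:]]*$/\1/p'
}

# Count occurrences of each reviewer, most frequent first.
count_reviewers() {
    extract_reviewers | sort | uniq -c | sort -rn
}
```

For example, git log --since='1 year' --pretty=%b | count_reviewers gives the raw counts before any mailmap normalization.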
$ git log --since='1 year' --pretty=%b | \
  perl -n -e 'print "$_\n" if(s/^\s*Reviewed-by:\s*(.*<.*>)\s*$/\1/)' | \
  git check-mailmap --stdin | \
  git -c mailmap.file=.organizationmap check-mailmap --stdin | \
  sort | uniq -c | sort -rn | nl | head -10
 1  1572  Inktank <contact@inktank.com>
 2    39  Cloudwatt <libre.licensing@cloudwatt.com>
 3     7  Intel <contact@intel.com>
 4     4  University of California, Santa Cruz <contact@cs.ucsc.edu>
 5     4  Roald van Loon Consultancy <roald@roaldvanloon.nl>
 6     2  CERN <contact@cern.ch>
 7     1  SUSE <contact@suse.com>
 8     1  Mark Kirkwood <mark.kirkwood@catalyst.net.nz>
 9     1  IWeb <contact@iweb.com>
10     1  Gaudenz Steinlin <gaudenz@debian.org>
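To make the file format concrete, here is a sketch that writes a small sample .organizationmap. The entries are illustrative only: they are built from names that appear in the output above and are not copied from the Ceph repository, and the function name is hypothetical. Each line follows the .mailmap convention of the replacement identity first, then the identity to match.

```shell
# Write an illustrative organization map to the given path
# (defaults to .organizationmap in the current directory).
# Format, as in .mailmap: Organization <org-email> Author Name <author-email>
write_sample_organizationmap() {
    cat > "${1:-.organizationmap}" <<'EOF'
Inktank <contact@inktank.com> Sage Weil <sage@inktank.com>
Inktank <contact@inktank.com> Greg Farnum <greg@inktank.com>
Intel <contact@intel.com> Yan, Zheng <zheng.z.yan@intel.com>
EOF
}
```

Authors with no entry in the file pass through unchanged, which is presumably why individual names still appear in the organization ranking.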
Becoming a Core Contributor : the fast track
Anyone willing to become a better Free Software contributor is invited to attend the next session of Upstream University in advance of FOSDEM. The training starts on the morning of January 30th, 2014, within walking distance of the Grand Place in Brussels.
- Registration is free and requires picking a contribution to work on from the bug tracker of a Free Software project (it can be any Free Software project)
Participating in Free Software projects is not just about technical skills: there will be informal follow-ups in bars and restaurants afterwards 🙂 This session will be the first to focus on Core Contributors and what it takes to become one, based on lessons learnt from OpenStack and Ceph.
Exploring Ceph cache pool implementation
The presentation given by Sage Weil and Greg Farnum during the Firefly Ceph Developer Summit in 2013 serves as an introduction to the cache pool that is being implemented for the upcoming Firefly release.
CEPH_OSD_OP_COPY_FROM and related rados operations were introduced in Emperor. They are tested by ceph_test_rados, which teuthology uses for integration tests and which issues COPY_FROM and COPY_GET at random.
After a cache pool has been defined using the osd tier commands, objects can be promoted to the cache pool ( see the corresponding test case ).
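As a rough sketch of what defining a cache pool with the osd tier commands can look like, the function below chains the three subcommands documented for cache tiering. The function name and the choice of writeback mode are assumptions made for illustration; both pools must already exist, and the exact syntax should be checked against the Ceph version in use.

```shell
# Sketch: put an existing cache pool in front of an existing base pool.
# $1 = base (backing) pool name, $2 = cache pool name.
setup_cache_tier() {
    base_pool=$1
    cache_pool=$2
    # Declare cache_pool as a tier of base_pool.
    ceph osd tier add "$base_pool" "$cache_pool"
    # Choose how the cache behaves; writeback is assumed here.
    ceph osd tier cache-mode "$cache_pool" writeback
    # Redirect client traffic for base_pool through cache_pool.
    ceph osd tier set-overlay "$base_pool" "$cache_pool"
}
```

It would be called as setup_cache_tier base cache, where base and cache are placeholder pool names.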
The HitSets keep track of which objects have been read or written ( using bloom filters ).