Contribution to paste3d.net

A draft freemium business plan was written for paste3d, assuming users would be willing to pay for more resources. A backup infrastructure was set up to automate daily backups and send warnings when the disk is almost full. A Debian package was drafted to ease deployment, and tests were added to the primary backend script to prevent regressions.

Backups

The rsnapshot utility creates rsync-based backups that are verbatim copies of the machine (excluding /proc and /sys). A 50GB LVM volume was created on a 2TB USB drive to hold periodic backups of paste3d.net. The backup machine (ssh://tothere.tld/) and the paste3d.net virtual machine (ssh://paste3d.dachary.vm.gnt/) are connected through an OpenVPN network. Cedric Pinson and Loic Dachary have access to it and can run the backups manually. The backup volume is mounted as follows:

mount /dev/backup-2010-03-24/paste3d.dachary.vm.gnt /mnt/paste3d.dachary.vm.gnt

In order to automate the backups, rsnapshot is run from cron:

CONF=/mnt/paste3d.dachary.vm.gnt/rsnapshot.conf

# 0 */4         * * *           root    /usr/bin/rsnapshot hourly
30 3    * * *           root    /usr/bin/rsnapshot -c $CONF daily
0  3    * * 1           root    /usr/bin/rsnapshot -c $CONF weekly
30 2    1 * *           root    /usr/bin/rsnapshot -c $CONF monthly

The rsnapshot.conf file defines how many backups are kept and for how long:

interval        daily   7
interval        weekly  4
interval        monthly 12

This means there will be daily backups for the past seven days. Older backups are kept once per week, up to four of them. Beyond four weeks, only one backup per month is kept, for a full year. After four months, the /mnt/paste3d.dachary.vm.gnt/ directory will contain these directories:

daily.0  daily.1  daily.2  daily.3  daily.4  daily.5  daily.6
weekly.0  weekly.1  weekly.2
monthly.0  monthly.1  monthly.2

Yesterday's backup is in daily.0, the two-month-old backup is in monthly.1, and so on. Each directory contains the root of the file system:

ls /mnt/paste3d.dachary.vm.gnt/daily.0/
bin  boot  etc  home  initrd.img  lib  lib64  lost+found  media  mnt  opt  root  sbin  selinux  srv  tmp  usr  var  vmlinuz
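
Since each snapshot is a plain copy of the file system, restoring a file amounts to an ordinary copy out of the relevant directory. The following is a minimal sketch of that idea; the restore_from_snapshot function and its arguments are hypothetical helpers written for illustration and are not part of rsnapshot:

```shell
# Hypothetical helper: copy one path out of a snapshot directory.
# usage: restore_from_snapshot SNAPSHOT_ROOT SNAPSHOT PATH DEST_DIR
restore_from_snapshot() {
    snapshot_root=$1   # e.g. /mnt/paste3d.dachary.vm.gnt
    snapshot=$2        # e.g. daily.0 or monthly.1
    path=$3            # path relative to the file system root, e.g. etc/hosts
    dest=$4            # existing directory receiving the restored copy

    src="$snapshot_root/$snapshot/$path"
    if [ ! -e "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    # -a preserves permissions and timestamps (and ownership when run as root)
    cp -a "$src" "$dest/"
}
```

For instance, restore_from_snapshot /mnt/paste3d.dachary.vm.gnt daily.0 etc/hosts /tmp would copy the most recent snapshot of /etc/hosts into /tmp.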

A nagios daemon is available on the VPN at http://nagios.pokersource.vm.gnt/ and has been configured to monitor the space left on the paste3d partition using the following command:

define command{
        command_name    check_drbd
        command_line    /usr/lib/nagios/plugins/check_by_ssh -H $HOSTADDRESS$ -l root -C "/usr/local/bin/check_drbd -d $ARG1$"
}

in the following stanza:

define service {
  host_name             tothere.tld
  service_description   paste3d.dachary.vm.gnt backup
  check_command         check_backup!/mnt/paste3d.dachary.vm.gnt
  use                   generic-service
  contacts              mornifle, loic
}

There should also be a check on the sanity of the backup: if rsnapshot fails for some reason, it won’t raise a visible alert. Installing munin would provide additional resource checks as well as graphs of their evolution, but this will be done at a later time.
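
One possible shape for such a sanity check is a freshness test on the newest snapshot. The sketch below is not something that was deployed: check_backup_age is a hypothetical function, and it assumes that the modification time of daily.0 reflects the last successful rsnapshot run, which should be verified against rsnapshot's actual behavior before relying on it. The exit codes follow the usual Nagios plugin convention (0 for OK, 2 for CRITICAL).

```shell
# Hypothetical Nagios-style check: is the newest snapshot recent enough?
# usage: check_backup_age SNAPSHOT_DIR MAX_AGE_MINUTES
check_backup_age() {
    dir=$1
    max_age_minutes=$2

    if [ ! -d "$dir" ]; then
        echo "CRITICAL: $dir does not exist"
        return 2
    fi
    # find prints the directory only if it was last modified more than
    # max_age_minutes ago (-maxdepth and -mmin are GNU/BSD find extensions)
    if [ -n "$(find "$dir" -maxdepth 0 -mmin "+$max_age_minutes")" ]; then
        echo "CRITICAL: $dir is older than $max_age_minutes minutes"
        return 2
    fi
    echo "OK: $dir is less than $max_age_minutes minutes old"
    return 0
}
```

Calling check_backup_age /mnt/paste3d.dachary.vm.gnt/daily.0 1500 would raise an alert when the latest daily snapshot is more than 25 hours old; the threshold is an arbitrary example.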

Debian package

paste3d is at a very early stage, and a Debian package has been created for the sole purpose of archiving the files from /var/www/pasted. It is a native Debian package, built with:

cd /var/www/debian
dpkg-buildpackage -uc -us

and the result is published at http://paste3d.dachary.org/debian/. To make this directory usable as a repository in /etc/apt/sources.list, a Makefile was created with the following default rule:

all:
	dpkg-scanpackages . /dev/null | gzip > Packages.gz
	dpkg-scansources . /dev/null | gzip > Sources.gz

which allows for:

deb http://paste3d.dachary.org/debian ./
deb-src http://paste3d.dachary.org/debian ./

to be added to /etc/apt/sources.list so that:

apt-get update
apt-get install paste3d

installs the package.

Business Model

If a freemium business model were to be applied to paste3d, it could be modeled using the Giff Constable template. The template has been modified as follows (references are sheet name / cell coordinates):

  • Assumptions.D6: $15 monthly subscription.
  • Assumptions.H6: 50% affiliate revenue share. This cell was added, assuming the marketing model relies purely on affiliation: when a user purchases a $15 subscription, 50% goes to the affiliate. This is probably a good fit for any online site developed by someone whose primary skill is technical rather than marketing.
  • Assumptions.D102: there is a need for customer support for every 10,000 active users.
  • Staff.D16: the only salary is that of one person, paid $40k until the business becomes profitable and $80k afterwards.
  • Staff.F55: the technical person is the only full-time employee.
  • Model.D192: the customer acquisition cost is subtracted from the user revenue.
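
To make the effect of the affiliate share concrete, the arithmetic can be sketched as follows. This is only an illustration of the two assumptions above ($15 subscription, 50% share); it assumes the share applies to every monthly payment and ignores every other cost in the template. The function and variable names are hypothetical.

```shell
# Net monthly subscription revenue, in cents, after the affiliate share.
# Integer cents are used because POSIX shell arithmetic has no floating point.
PRICE_CENTS=1500          # $15 monthly subscription (Assumptions.D6)
AFFILIATE_SHARE_PCT=50    # 50% affiliate revenue share (Assumptions.H6)

# usage: net_monthly_revenue_cents PAYING_SUBSCRIBERS
net_monthly_revenue_cents() {
    subscribers=$1
    echo $(( subscribers * PRICE_CENTS * (100 - AFFILIATE_SHARE_PCT) / 100 ))
}
```

With 200 paying subscribers, net_monthly_revenue_cents 200 prints 150000, that is $1,500 per month left after the affiliates are paid.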

Shell script testing

Apart from shunit, there is no test framework for shell scripts. Testing can nonetheless be achieved simply by encapsulating the commands in functions, as was done for main.sh. When the script is called with four arguments instead of three, the test suite is run. It begins by creating temporary directories to be used for test purposes. When a shell script is properly written (as main.sh was), paths are not hard-coded but stored in variables, so the test function can override the default values to use the temporary directories. main.sh only had to be modified to use return 1 instead of exit 1 so that the tests do not exit when simulating an error. Test data was already available to simulate all sorts of imports. The tests could be improved to cover more error conditions (corrupted files or resource exhaustion, for instance).
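
The pattern can be sketched as follows. This is not the actual main.sh: every function, variable and path below is a hypothetical stand-in chosen to show the three ingredients described above (commands wrapped in functions, paths held in variables, return instead of exit).

```shell
#!/bin/sh
# Hypothetical stand-in for main.sh: every name below is illustrative.

# Paths live in variables so that the tests can redirect them.
DATA_DIR=${DATA_DIR:-/var/lib/example}

# Import one file into DATA_DIR.
import_file() {
    src=$1
    if [ ! -f "$src" ]; then
        echo "cannot import $src: no such file" >&2
        return 1    # return rather than exit, so a test run survives the error
    fi
    cp "$src" "$DATA_DIR/"
}

# Run the test suite against temporary directories.
run_tests() {
    tmp=$(mktemp -d) || return 1
    DATA_DIR="$tmp/data"    # override the default path
    mkdir -p "$DATA_DIR"
    failures=0

    # A valid file must be imported.
    echo sample > "$tmp/input"
    if ! import_file "$tmp/input" || [ ! -f "$DATA_DIR/input" ]; then
        echo "FAIL: valid import"
        failures=$((failures + 1))
    fi

    # A missing file must be reported as an error without ending the run.
    if import_file "$tmp/missing" 2>/dev/null; then
        echo "FAIL: missing file accepted"
        failures=$((failures + 1))
    fi

    rm -rf "$tmp"
    [ "$failures" -eq 0 ]
}

# Three arguments: normal operation. Four arguments: run the tests.
main() {
    if [ $# -eq 4 ]; then
        run_tests
    else
        import_file "$1"
    fi
}
```

A real script would end with a call to main "$@"; it is left out here so that the functions can be sourced without side effects. Had import_file used exit 1, the second test would have terminated the whole script instead of being counted.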