I decided that, while my old rsync/unison backup solution was working fine, I needed something else.
I recently got a netbook and wanted to be able to keep my documents between my laptop and the netbook in sync.
While rsync/unison worked ok, there were some encoding errors in filenames and the "manual" start of the backup always felt kinda strange.
I already had dropbox (ref link) on my laptop and was pretty happy with it. It worked fine but the space is limited and the pro account was too expensive for my taste. I also didn't want to store my files unencrypted in the cloud.

While Dropbox offers a headless install (e.g. for my v-server), it is pretty annoying that Dropbox can only back up ONE folder, and that this folder always syncs completely.

I then stumbled upon SpiderOak (ref link). The SpiderOak team immediately appealed to my nerdy side by having an engineering matters section on their site, complete with crypto details (2048-bit RSA and 256-bit AES, using a key created by the key derivation/strengthening algorithm PBKDF2 (using SHA-256), with 16384 rounds and 32 bytes of salt).
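
To make those numbers a bit more concrete, here is a minimal Python sketch of that kind of key derivation, using the standard library's hashlib.pbkdf2_hmac. Only the parameters (SHA-256, 16384 rounds, 32 bytes of salt, a 256-bit key) come from the description above. The function names, the example passphrase, and the way SpiderOak actually feeds passwords and salts into its key hierarchy are my assumptions, so treat this as an illustration and not as their implementation.

```python
import hashlib
import os

# Parameters taken from the crypto details quoted above.
ROUNDS = 16384      # PBKDF2 iteration count
SALT_BYTES = 32     # length of the random salt
KEY_BYTES = 32      # 32 bytes = 256 bits, the size of an AES-256 key


def new_salt() -> bytes:
    """Return a fresh random salt (hypothetical helper)."""
    return os.urandom(SALT_BYTES)


def derive_key(password: str, salt: bytes) -> bytes:
    """Stretch a password into a 256-bit key with PBKDF2-HMAC-SHA256.

    Hypothetical helper: how SpiderOak encodes the password is not
    documented in this post, UTF-8 is assumed here.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ROUNDS, dklen=KEY_BYTES
    )


if __name__ == "__main__":
    salt = new_salt()
    key = derive_key("example passphrase", salt)
    print(len(key) * 8, "bit key:", key.hex())
```

The same password and salt always give the same key, while a different salt gives a completely different one, which is the point of salting.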
The service is completely cross platform (I'm currently running it on Ubuntu 9.10 x86_64, 9.04 x86 server and 32bit Windows 7).
They allow people to select which folders they want to back up and which of those they want to sync with other computers.
They offer a commandline mode which has a whole bunch of options:


# SpiderOak --help
Usage: SpiderOak basic command line usage:

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -v, --verbose         be verbose: show detailed status information as it
                        happens
  -d NUMBER, --device=NUMBER
                        operate on specified device id (default is the local
                        device)
  -o DIR, --output=DIR  Target dir to restore items to (default is normal
                        download dir)
  --enable-schedule     honor the set activity schedule, even in batchmode
                        (normally the schedule is ignored in batchmode)
  --disable-schedule    disable activity scheduling

  Operational Modes and Commands:
    --backup=TARGET     ad hoc operation: backup whatever exists at TARGET in
                        the filesystem and exit (ignores existing backup
                        selection.)
    --restore=item      Restore a folder, file, or version.
                        Run "--restore help" for more info
    --headless          run in headless mode (without the graphical interface)
    --batchmode         like headless, but will exit when all available work
                        is done
    --sync              like batchmode, but only backup/update synced folders
    --scan, --scan-only
                        scan the filesystem for changes and report a summary
    --build, --scan-and-build-only
                        scan the filesystem, and build all possible file
                        system changes as shelved upload transactions, and
                        exit without uploading them
    --merge             merge and restore the contents of multiple paths from
                        arbitrary devices:  dev1:path1 .. devN:pathN
    --purge=item        purge a folder, file (including historical versions)

  Information Commands:
    --userinfo, --user-info
                        Show user and device info
    --space             Show space usage information by category and by device
    --tree              Show the hierarchy of stored backup folders
    --tree-changelog    Show a log of how the hierarchy of stored backup
                        folders has changed over time
    --journal-changelog=folder_or_journal
                        Show the changelog of a given folder
    --shelved-x, --print-shelved-x
                        Show information about each shelved upload transaction
    --fulllist          Show all folders and files stored on device

  Backup Selection Manipulation Commands:
    --selection, --print-selection
                        Show a list of selected and excluded backup items
    --reset-selection   Reset selection (but preserve excluded files)
    --exclude-file=EXCLUDE_FILE
                        Exclude the given file from the selection
    --exclude-dir=EXCLUDE_DIR
                        Exclude the given directory from the selection
    --include-dir=INCLUDE_DIR
                        Include the given directory in the selection
    --force             Do in/exclusion even if the path doesn't exist

  Maintenance Commands:
    --vacuum            Vacuum SpiderOak's local database (rebuilds indexs and
                        reclaims local disk space)
    --destroy-shelved-x
                        destroy each shelved upload transaction already in
                        progress.
    --repair            repair a local SpiderOak installation
    --rebuild-reference-database
                        rebuild the SpiderOak reference database (can take
                        awhile)
    --billing           print a secure web auto-login URL for billing info

  Dangerous/Support Commands:
    Caution: Do not use these commands unless advised by SpiderOak
    support.  They can damage your installation if used improperly.

    --empty-garbage-bin
                        purge all deleted items on the current device
    --apply-subscription-xact
                        apply all transactions previously received from remote
                        devices -- (not intended for general use -- this
                        normally happens automatically)

  Account Commands:
    --bootstrap=ACCOUNT_DEFINITION_FILE
                        Read a json definition file and use contents to create
                        a new account

There are a lot of tutorials and an FAQ that also answers technical questions.
The best thing: thanks to an education discount (you can ask me for the voucher code if you don't have an .edu email address from your university), I only pay 5 usd/month for 100 GB of backup space!

To install it on a headless webserver, just download the regular installer package for your distribution and launch SpiderOak once using X forwarding:

  1. be sure you've got the "xauth" package installed
  2. ssh -X your.server.com
  3. /usr/bin/SpiderOak
  4. set up what you want to
  5. finally, add this entry to your crontab:

    @reboot /usr/bin/SpiderOak --headless &

This will start SpiderOak on every reboot and back up files according to your wishes.
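
If you would rather script the backup selection than click through it in the GUI, the selection flags from the help output above (--include-dir, --exclude-dir, --selection) look like the way to go. The following Python sketch only builds and prints the command lines by default. The helper names and example paths are made up, I have not checked whether several selection flags can be combined in one call (so it uses one call per directory), and the help text does not say whether these commands work while a headless instance is already running.

```python
import shlex
import subprocess

SPIDEROAK = "/usr/bin/SpiderOak"  # path used in the crontab entry above


def selection_commands(include_dirs, exclude_dirs=()):
    """Build one SpiderOak call per directory, then one to print the result.

    Hypothetical helper; only the flag names come from `SpiderOak --help`.
    """
    cmds = [[SPIDEROAK, f"--include-dir={d}"] for d in include_dirs]
    cmds += [[SPIDEROAK, f"--exclude-dir={d}"] for d in exclude_dirs]
    cmds.append([SPIDEROAK, "--selection"])
    return cmds


def run_all(cmds, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in cmds:
        print(" ".join(shlex.quote(part) for part in cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Example paths are placeholders, not taken from the post.
    run_all(selection_commands(["/home/me/documents"], ["/home/me/documents/tmp"]))
```

Running it as is just prints the three commands, so you can check them before calling run_all with dry_run=False on the server.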
Some other random facts about SpiderOak:

  • you can select an FTP/SFTP server where SpiderOak should store a copy of the encrypted blocks. This can e.g. be a local NAS, and it will speed up restore procedures (I don't use it, but it's nice to have)
  • SpiderOak allows you to share files over the web, including picture galleries (all of that password protected)
  • you can stop uploads and resume them
