I decided that, while my old rsync/unison backup solution was working fine, I needed something else.
I recently got a netbook and wanted to keep my documents in sync between my laptop and the netbook.
While rsync/unison worked ok, there were some encoding errors in filenames and the "manual" start of the backup always felt kinda strange.
I already had Dropbox (ref link) on my laptop and was pretty happy with it. It worked fine, but the space is limited and the pro account was too expensive for my taste. I also didn't want to store my files unencrypted in the cloud.
While Dropbox offers a headless install for e.g. my v-server, it is pretty annoying that Dropbox only has ONE folder that it is able to back up, and that this folder always syncs completely.
I then stumbled upon SpiderOak (ref link). The SpiderOak team immediately appealed to my nerdy side by having an engineering matters section on their site, complete with crypto details (2048-bit RSA and 256-bit AES, using a key created by the key derivation/strengthening algorithm PBKDF2 (using SHA-256), with 16384 rounds and 32 bytes of salt).
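To make the key derivation part a bit more concrete, here is a minimal Python sketch of PBKDF2 with SHA-256, 16384 rounds and a 32 byte salt, producing a 256 bit key. It only shows the generic PBKDF2 operation from Python's standard library; how SpiderOak actually wires this into its client is not something I know, and the names derive_key and new_salt are made up for the example.

```python
import hashlib
import os

ROUNDS = 16384    # iteration count quoted on the engineering page
SALT_BYTES = 32   # salt length quoted on the engineering page
KEY_BYTES = 32    # 256 bits, matching the AES key size (assumption for this sketch)


def new_salt() -> bytes:
    """Return a fresh random salt from the OS random source."""
    return os.urandom(SALT_BYTES)


def derive_key(password: str, salt: bytes) -> bytes:
    """Stretch a password into a 256 bit key using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ROUNDS, dklen=KEY_BYTES
    )


if __name__ == "__main__":
    salt = new_salt()
    key = derive_key("correct horse battery staple", salt)
    print(len(key) * 8, "bit key:", key.hex())
```

The same password and salt always give the same key, while a different salt gives a completely different one, which is why the salt has to be stored alongside the encrypted data.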
The service is completely cross-platform (I'm currently running it on Ubuntu 9.10 x86_64, 9.04 x86 server and 32-bit Windows 7).
They allow people to select which folders they want to back up and which of those they want to sync with other computers.
They offer a commandline mode which has a whole bunch of options:
# SpiderOak --help
Usage: SpiderOak
basic command line usage:

Options:
  --version
      show program's version number and exit
  -h, --help
      show this help message and exit
  -v, --verbose
      be verbose: show detailed status information as it happens
  -d NUMBER, --device=NUMBER
      operate on specified device id (default is the local device)
  -o DIR, --output=DIR
      Target dir to restore items to (default is normal download dir)
  --enable-schedule
      honor the set activity schedule, even in batchmode (normally the schedule is ignored in batchmode)
  --disable-schedule
      disable activity scheduling

Operational Modes and Commands:
  --backup=TARGET
      ad hoc operation: backup whatever exists at TARGET in the filesystem and exit (ignores existing backup selection.)
  --restore=item
      Restore a folder, file, or version. Run "--restore help" for more info
  --headless
      run in headless mode (without the graphical interface)
  --batchmode
      like headless, but will exit when all available work is done
  --sync
      like batchmode, but only backup/update synced folders
  --scan, --scan-only
      scan the filesystem for changes and report a summary
  --build, --scan-and-build-only
      scan the filesystem, and build all possible file system changes as shelved upload transactions, and exit without uploading them
  --merge
      merge and restore the contents of multiple paths from arbitrary devices: dev1:path1 .. devN:pathN
  --purge=item
      purge a folder, file (including historical versions)

Information Commands:
  --userinfo, --user-info
      Show user and device info
  --space
      Show space usage information by category and by device
  --tree
      Show the hierarchy of stored backup folders
  --tree-changelog
      Show a log of how the hierarchy of stored backup folders has changed over time
  --journal-changelog=folder_or_journal
      Show the changelog of a given folder
  --shelved-x, --print-shelved-x
      Show information about each shelved upload transaction
  --fulllist
      Show all folders and files stored on device

Backup Selection Manipulation Commands:
  --selection, --print-selection
      Show a list of selected and excluded backup items
  --reset-selection
      Reset selection (but preserve excluded files)
  --exclude-file=EXCLUDE_FILE
      Exclude the given file from the selection
  --exclude-dir=EXCLUDE_DIR
      Exclude the given directory from the selection
  --include-dir=INCLUDE_DIR
      Include the given directory in the selection
  --force
      Do in/exclusion even if the path doesn't exist

Maintenance Commands:
  --vacuum
      Vacuum SpiderOak's local database (rebuilds indexs and reclaims local disk space)
  --destroy-shelved-x
      destroy each shelved upload transaction already in progress.
  --repair
      repair a local SpiderOak installation
  --rebuild-reference-database
      rebuild the SpiderOak reference database (can take awhile)
  --billing
      print a secure web auto-login URL for billing info

Dangerous/Support Commands:
  Caution: Do not use these commands unless advised by SpiderOak support. They can damage your installation if used improperly.
  --empty-garbage-bin
      purge all deleted items on the current device
  --apply-subscription-xact
      apply all transactions previously received from remote devices -- (not intended for general use -- this normally happens automatically)

Account Commands:
  --bootstrap=ACCOUNT_DEFINITION_FILE
      Read a json definition file and use contents to create a new account
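If you want to script these options, something like the following Python sketch could work. It only uses flags that appear in the help output above (--include-dir, --force, --batchmode, --verbose); that they can be combined this way is my assumption, the helper names are made up, and by default the script only prints the command instead of running it.

```python
import shlex
import subprocess
from typing import List

SPIDEROAK = "SpiderOak"  # binary name as shown in the help output


def include_dir_cmd(path: str, force: bool = False) -> List[str]:
    """Build the command that adds a directory to the backup selection."""
    cmd = [SPIDEROAK, "--include-dir=" + path]
    if force:
        # --force: do the inclusion even if the path doesn't exist
        cmd.append("--force")
    return cmd


def batch_backup_cmd(verbose: bool = False) -> List[str]:
    """Build the command for a one-shot backup run that exits when done."""
    cmd = [SPIDEROAK, "--batchmode"]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run(cmd: List[str], dry_run: bool = True) -> int:
    """Print the command; only execute it when dry_run is False."""
    print(shlex.join(cmd))
    if dry_run:
        return 0
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical directory, replace with your own.
    run(include_dir_cmd("/home/me/Documents"))
    run(batch_backup_cmd(verbose=True))
```

Passing the arguments as a list (instead of one big string) avoids quoting trouble with spaces in directory names.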
There are a lot of tutorials and an FAQ that also answers technical questions.
The best thing: thanks to an education discount (you can ask me for the voucher code if you don't have an .edu email address from your university), I only pay 5 USD/month for 100 GB of backup space!
To install it on a headless webserver, just download the regular installer package for your distribution and launch SpiderOak once using X forwarding:
- be sure you've got the "xauth" package installed
- ssh -X your.server.com
- launch SpiderOak and set up what you want to back up
- finally, add this entry to your crontab:
@reboot /usr/bin/SpiderOak --headless &
This will start SpiderOak on every reboot and back up files according to your wishes.
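If you set up several servers, adding the crontab line by hand gets old. Here is a rough Python sketch that appends the entry to a crontab text only if it isn't already there. The function name ensure_entry is made up, and the sketch works on plain strings; getting the text from "crontab -l" and installing the result with "crontab -" is left to you.

```python
CRON_ENTRY = "@reboot /usr/bin/SpiderOak --headless &"  # the entry from above


def ensure_entry(crontab_text: str, entry: str = CRON_ENTRY) -> str:
    """Return crontab_text with entry appended, unless it is already present."""
    lines = crontab_text.splitlines()
    if any(line.strip() == entry for line in lines):
        return crontab_text  # nothing to do, keep the text unchanged
    lines.append(entry)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up existing crontab content for the demo.
    existing = "0 3 * * * /bin/true\n"
    print(ensure_entry(existing), end="")
```

Running it twice on its own output changes nothing, so it is safe to call from a setup script.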
Some other random facts about SpiderOak:
- you can select an FTP/SFTP server where SpiderOak should store a copy of the encrypted blocks. This can e.g. be a local NAS and will speed up restore procedures (I don't use it, but it's nice to have)
- SpiderOak lets you share files over the web, including picture galleries (all of that password protected)
- you can stop uploads and resume them