02 February 2009

Amazon S3 and the "Cloud"

I've recently started using Amazon S3 to back up my files off-site at low cost.

If you have no idea what Amazon S3 is, you can read about it here.

Once you've set up your S3 account, you need to be able to connect to it. There are many S3 client products to help you do this. I've tried out the following three:
  • S3fs for Linux is free and lets you connect to your S3 "buckets" from the command line.
  • SubCloud is essentially S3fs with more functionality, most importantly encryption, https transfer and rsync support. It costs $129 for a one-off licence.
  • JungleDisk has most, if not all, of the features SubCloud offers, but it comes in Windows, OS X and Linux versions with JungleDisk Monitor, a GUI application to help you configure your connection to Amazon S3. Most importantly, it only costs $20 per licence.


When you use these S3 clients, your Amazon S3 "buckets" are mounted on your file system like any other drive (or folder, depending on your operating system). The buckets act like virtual folders, as their contents are probably distributed over many servers.

Without doubt, JungleDisk is the cheapest and easiest to set up on Windows and OS X. However, I couldn't work out how to get it working from the command line on Linux machines: it kept reporting a missing domain in the config file.

SubCloud boasts faster upload speeds and was easier to set up from the Linux command line, but at $129 I've not seen anything that would tempt me away from JungleDisk on Windows or OS X.

In summary: use JungleDisk for Windows / OS X and SubCloud for command line Linux.

The Importance of Encryption
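Requests sent by the connection software are signed with your Amazon S3 secret key (an HMAC-SHA1 signature, not MD5 encryption), and an MD5 checksum can be sent along so Amazon can verify the upload arrived intact. Neither step hides the file contents in transit, so you also need to be using https to make the connection to your buckets.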
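To make the signing step a little more concrete, here is a minimal sketch of an HMAC-SHA1 signature built with the openssl command line tool. The function name and the key are placeholders of mine, and the exact layout of the string that gets signed is defined by Amazon's documentation, so treat this as an illustration rather than a working S3 client.

```shell
# Sign a string with HMAC-SHA1 and Base64-encode the 20-byte result.
# Note that this proves who sent the request; it does not encrypt anything.
sign_request() {
    secret_key=$1        # placeholder: your S3 secret access key
    string_to_sign=$2    # placeholder: layout is defined by Amazon's docs
    printf '%s' "$string_to_sign" |
        openssl dgst -sha1 -hmac "$secret_key" -binary |
        openssl base64
}
```

Calling something like sign_request 'EXAMPLE-SECRET-KEY' 'some request description' prints a 28-character Base64 string. The same key and input always give the same signature, which is how the server can check it.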

If you're backing up important and/or sensitive information (which is why you'd want this service in the first place, right?) then you wouldn't want anyone to hack into Amazon or have a dodgy Amazon employee looking at your unencrypted information on their servers. That's why JungleDisk and SubCloud allow you to further encrypt your files using a password that only you (or your organisation) will know. This means the files stored on Amazon's cloud can only be read by people who know that password.

This password encryption / decryption happens transparently when using the JungleDisk or SubCloud client software, so once you've entered that password then the file system acts as normal.
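I don't know exactly how JungleDisk or SubCloud implement this internally, but the general idea of encrypting a file with a password before it leaves your machine can be sketched with openssl. The function names and the choice of AES-256-CBC are my own assumptions for illustration.

```shell
# Encrypt a file with a password before uploading it.
# usage: encrypt_file PASSWORD INPUT OUTPUT
encrypt_file() {
    openssl enc -aes-256-cbc -salt -pass "pass:$1" -in "$2" -out "$3"
}

# Decrypt a file that was produced by encrypt_file.
# usage: decrypt_file PASSWORD INPUT OUTPUT
decrypt_file() {
    openssl enc -d -aes-256-cbc -pass "pass:$1" -in "$2" -out "$3"
}
```

Passing the password on the command line keeps the sketch short, but it is visible to other users in the process list, so a real setup would read it from a file or a prompt. Newer openssl versions may also print a warning about the key derivation used here.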

Now, the problem with that is: what if you want to change that password? As far as I can tell, the answer is "DON'T"! That could be a problem if you've just fired a disgruntled IT employee. (Please feel free to correct me on this.)

Using the "Cloud"
Amazon S3 looks cheap, very cheap, but there are some things to consider before you dive in.
Firstly, let's deal with a few plus points:
  1. Amazon's infrastructure is vast, and the SLAs Amazon provides mean your data is about as safe as it can be.
  2. It's incredibly cheap compared to other backup / hosting options for most home / small business usage.
  3. File access should also be fast from most places, since Amazon runs server farms in more than one part of the world.
But beware:
  1. It is cheap initially, but over time you will be storing more and more data and transferring more of it to and from the service. It could become very expensive very quickly.
  2. Over reliance on this service may suck you in. Cheap prices are good for now, but down the line, you'll be at the mercy of Amazon or whoever holds all your information.
  3. Once you're hooked by the pencil pushers, this will become yet another monthly bill like your cell phone or electricity. In fact when Steve Ballmer says the future of computing is in the Cloud, this is exactly what he sees: a monthly billed reliance on centralised data services that you can't easily get out of!
  4. It also raises the question: what happens to your data if you don't pay up, or just miss a monthly payment?
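The first warning is easy to put rough numbers on. Here's a small sketch that estimates a monthly bill from the amount stored and the amount transferred. The function name is mine and the per-GB rates in the example are placeholders, not Amazon's actual prices, so plug in the current figures from their pricing page.

```shell
# Estimate a monthly bill in dollars.
# usage: monthly_cost STORED_GB TRANSFER_GB STORAGE_RATE TRANSFER_RATE
# Rates are dollars per GB per month; the values you pass are up to you.
monthly_cost() {
    awk -v s="$1" -v t="$2" -v sr="$3" -v xr="$4" \
        'BEGIN { printf "%.2f\n", s * sr + t * xr }'
}
```

With placeholder rates of 0.15 for storage and 0.10 for transfer, monthly_cost 100 20 0.15 0.10 prints 17.00. Run it again with ten times the data and you can see how the bill scales.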


The Cloud is the future, whether we like it or not. At the moment I like it, but ask me again in a few years when I'm struggling to pay my "data" bills!

19 January 2009

OS X Leopard Virtual Hosts and Symbolic Links

I'll keep this quick.

  • Say you have an SVN working copy in your Documents Folder: Documents/SVN/MyProject/
  • There is a webroot in there: Documents/SVN/MyProject/webroot/
  • You've got the following in /etc/hosts: 127.0.0.1 localhost dev
  • You've created a Virtual Host for "dev" in /private/etc/apache2/extra/httpd-vhosts.conf
  • You included FollowSymLinks in the virtual host Directory settings for Documents/SVN/MyProject/webroot
  • You get a "Forbidden" error when you try to view http://dev
  • You did a chmod -R 777 on the Documents/SVN/MyProject/webroot/ folder and you still get "Forbidden"


If this sounds familiar to you, then here's the solution:

Use: chmod a+x

Open a terminal window and run the chmod command on each folder in the path to the webroot. For the example path, that would be:
chmod a+x ~/Documents
chmod a+x ~/Documents/SVN
chmod a+x ~/Documents/SVN/MyProject
chmod a+x ~/Documents/SVN/MyProject/webroot

As solutions go, this is quite a bad one, as it lets every user on the machine pass through your Documents folder (though not list its contents, unless read permission is also set). But after hours of trying, that's all I could do to get this to work.
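One way to limit the damage is to check which folders in the path are actually blocking Apache, and only change those. Below is a rough sketch that walks from a folder up to / and prints each one that is missing the execute (search) bit for "others". The function name is my own, and it expects an absolute path.

```shell
# Print every directory from the given absolute path up to / that
# other users (such as the Apache user) are not allowed to traverse.
find_untraversable() {
    dir=$1
    while :; do
        # ls -ld prints the mode string first, e.g. drwxr-x--x.
        # Character 10 is the execute bit for "others".
        mode=$(ls -ld "$dir" | cut -c1-10)
        case $mode in
            *[xt]) ;;                                # others can traverse
            *) printf '%s %s\n' "$mode" "$dir" ;;    # blocked here
        esac
        # Stop at the root (or at "." if a relative path was given).
        case $dir in /|.) break ;; esac
        dir=$(dirname "$dir")
    done
}
```

For example, find_untraversable "$HOME/Documents/SVN/MyProject/webroot" lists only the folders that still need chmod a+x, so you can leave the rest alone.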

Has anyone got a better way?