Build your own distributed file system

Posted on 10 Nov 2009 at 15:13

Simon Brock marvels at Google's resilient distributed file system, and builds his own from simple, cheap hardware and open-source software

One of the crucial components that makes Google so effective is its distributed file system, which underlies apps such as Mail, Documents and the Picasa photo service.

The Google File System divides user files into fixed-size chunks and distributes them across a collection of servers, which are themselves distributed across many hosting centres.
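To make the idea concrete, here is a minimal Python sketch of splitting a file into pieces and spreading them over several servers. It is not Google's placement algorithm: the four-byte chunk size, the server names and the simple round-robin placement are all assumptions chosen to keep the example readable.

```python
from itertools import cycle

CHUNK_SIZE = 4  # bytes; deliberately tiny for illustration (assumed value)
SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical server names


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut the data into fixed-size pieces; the last piece may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_chunks(chunks, servers=SERVERS):
    """Assign chunks to servers round-robin, remembering each chunk's index."""
    placement = {server: [] for server in servers}
    for (index, chunk), server in zip(enumerate(chunks), cycle(servers)):
        placement[server].append((index, chunk))
    return placement


def reassemble(placement):
    """Gather every chunk from every server and restore the original order."""
    indexed = [item for items in placement.values() for item in items]
    return b"".join(chunk for _, chunk in sorted(indexed))


data = b"hello distributed world"
placement = place_chunks(split_into_chunks(data))
print(reassemble(placement) == data)
```

Note that this sketch stores each chunk only once, so losing a server loses data; a real system would also keep extra copies of every chunk.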

Many people would like such a reliable file system for their own apps, but Google doesn’t currently allow direct API access (although Amazon’s S3 service is quite similar).

Until recently, most reliable file systems centralised the data onto storage nodes built from expensive dedicated hardware – for example storage area networks with duplicated controllers, power supplies and disk drives, using fibre channel interconnect running at gigabit speeds – whereas Google’s setup employs cheap, commodity hardware as building blocks.

Recent advances in technology are making such dedicated nodes feel obsolete. Where you might once have built a 1TB resilient node from a dozen 250GB disks (five drives as a RAID5 array plus one hot spare, with the whole set mirrored onto an identical second set), this year 1TB single drives are on sale and bigger ones will soon be available.
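The arithmetic behind that dozen-disk node can be checked with a short Python sketch. The function names are mine, and the only rule assumed is the standard one that a RAID5 array gives up one drive's worth of space to parity.

```python
def raid5_usable_gb(drives, drive_gb):
    """RAID5 spends the equivalent of one drive on parity."""
    if drives < 3:
        raise ValueError("RAID5 needs at least three drives")
    return (drives - 1) * drive_gb


def mirrored_node(drives_per_array, hot_spares, drive_gb):
    """Return (usable GB, total drives, raw GB) for two mirrored RAID5 sets."""
    usable = raid5_usable_gb(drives_per_array, drive_gb)
    total_drives = 2 * (drives_per_array + hot_spares)
    raw = total_drives * drive_gb
    return usable, total_drives, raw


usable, total, raw = mirrored_node(drives_per_array=5, hot_spares=1, drive_gb=250)
print(usable, total, raw)  # 1000 12 3000
```

In other words, twelve drives holding 3TB of raw capacity deliver 1TB of usable space, which is the capacity a single drive now offers.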

We can now look to distribute storage across very cheap servers, which probably only have one processor socket but a terabyte of disk space.

Most modern applications don’t actually use a great deal of storage: we may have multi-terabyte disks, but we rarely fill them. For example, I started writing these 3,000-word columns a few years ago and have used three new versions of Word in that time, but the document size has barely increased at all, while drive capacity has grown far faster.

So here’s our new model for building resilience – do a Google, using simple, cheap hardware and open-source software to bind it into a file system that can be used by existing applications. We don’t want to have to rewrite our applications to use this file system, as is the case with some massively parallel file systems.

Light the FUSE

FUSE (Filesystem in Userspace) is an open-source package for creating new file systems, which is happiest on Linux but can be run under Mac OS X and, in some cases, even Windows.

File systems are normally created and managed via the OS in “kernel” or “privileged” memory space, but while FUSE itself employs a module that runs under control of the OS kernel, it enables ordinary user applications to create file systems that can be used by other applications.

We’ve mentioned FUSE before in this column as the basis of various useful file system extensions: for example, for Unix systems there are FUSE-based utilities to enable encryption (useful for safely managing USB sticks); access to archive files (treat a zip as a set of directories); and foreign file systems such as NTFS images inside Unix.

By implementing a particular interface to the OS kernel, FUSE enables applications to appear and behave as file systems, while monitoring what those applications do to the file system.
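The interception idea can be modelled in a few lines of plain Python. This is a toy stand-in, not the FUSE API: the class and method names are hypothetical, and a temporary directory plays the part of the real storage behind the file system. Each operation is recorded before being passed through to the backing directory, which is the kind of hook a user-space file system gets.

```python
import tempfile
from pathlib import Path


class MonitoredPassthrough:
    """Toy model of a user-space handler: log each call, then pass it on."""

    def __init__(self, backing_dir):
        self.root = Path(backing_dir)
        self.events = []  # record of everything applications did

    def write(self, name, data):
        self.events.append(("write", name, len(data)))
        (self.root / name).write_bytes(data)

    def read(self, name):
        self.events.append(("read", name))
        return (self.root / name).read_bytes()


def demo():
    """Write and read one file, returning the log of observed operations."""
    with tempfile.TemporaryDirectory() as backing:
        fs = MonitoredPassthrough(backing)
        fs.write("inbox.mbox", b"hello")
        fs.read("inbox.mbox")
        return fs.events


print(demo())
```

In a real FUSE file system the kernel module would deliver these calls on behalf of any application using the mount point, so the applications themselves need no changes.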

This last aspect is going to be very important to our implementation, as it will be the key to distributing the file data. Say our application is a mail server that’s accessing a collection of files on our resilient file system – to ensure resilience we need to make sure each file is duplicated at least once, and that the duplicate is stored on a separate machine.
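A minimal Python sketch of that duplication rule follows, with in-memory objects standing in for machines. The node names, the hash-based choice of machines and the function names are all assumptions for illustration rather than the method used by any particular file system.

```python
import hashlib


class Node:
    """Stand-in for one storage machine (hypothetical, in-memory only)."""

    def __init__(self, name):
        self.name = name
        self.files = {}


def pick_replicas(path, nodes, copies=2):
    """Choose `copies` distinct nodes, starting from a hash of the path."""
    if copies > len(nodes):
        raise ValueError("not enough machines for the requested copies")
    start = int(hashlib.sha256(path.encode()).hexdigest(), 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def replicated_write(path, data, nodes, copies=2):
    """Store the file on every chosen node and report which ones hold it."""
    targets = pick_replicas(path, nodes, copies)
    for node in targets:
        node.files[path] = data
    return [node.name for node in targets]


def read_with_failover(path, nodes, failed=frozenset()):
    """Return the file from the first working node that has a copy."""
    for node in nodes:
        if node.name not in failed and path in node.files:
            return node.files[path]
    raise FileNotFoundError(path)


nodes = [Node(name) for name in ("node-a", "node-b", "node-c")]
holders = replicated_write("mail/inbox", b"message", nodes)
print(read_with_failover("mail/inbox", nodes, failed={holders[0]}))
```

Because the two copies always land on different nodes, the file remains readable when either one of its holders fails.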

User comments

Excellent overview..

Now that's what I call a technical article, well written and not completely over my head :)
This one sent me scurrying to wikipedia for half an hour looking at file systems (which I know zip about). We need more like this guys..

By pinero50 on 18 Nov 2009

Agreed, but

I agree, but at the same time, it would be useful to have a follow-on article on how to set up GlusterFS, for example on Ubuntu.

Especially since Simon seems to be suggesting GlusterFS is the 'winner' out of all the distributed file systems he's written about in this article.

I'd also be interested to know if it would be possible to run this on PCs that are used as desktop computers to provide some kind of cloud backup...

By GAZZAT5 on 22 Nov 2009

anthonysjones

Way Hay! An article that doesn't involve some Windows feature or bug, some laptop with a shiny case or a graphics card. Finally something I can tinker with myself and say "hay I built that".

How about some articles on setting up and connecting databases to PHP sites? I'm on chapter one of Oracle 10g.

By anthonysjones on 24 Nov 2009

Simon Brock

Simon runs UK-based Wide Area Communications, the company behind websites such as The Spectator. He's a contributing editor to PC Pro and a fervent believer in open-source technologies.

PC Pro: Computing in the Real World. Printed from www.pcpro.co.uk