Build your own distributed file system
Posted on 10 Nov 2009 at 15:13
Simon Brock marvels at Google's resilient distributed file system, and builds his own from simple, cheap hardware and open-source software
One of the crucial components that make Google so effective is its distributed file system, which underlies all its apps, such as Mail, Documents and the Picasa photo service.
The Google File System divides user files into chunks and distributes them across a collection of servers, which are themselves distributed across many hosting centres.
Many people would like such a reliable file system for their own apps, but Google doesn’t currently allow direct API access (although Amazon’s S3 service is quite similar).
Until recently, most reliable file systems centralised the data onto storage nodes built from expensive dedicated hardware – for example storage area networks with duplicated controllers, power supplies and disk drives, using fibre channel interconnect running at gigabit speeds – whereas Google’s setup employs cheap, commodity hardware as building blocks.
Recent advances in technology are making such dedicated nodes feel obsolete. Where you might have made up a 1TB resilient node from a dozen 250GB disks (five drives as a RAID5 array plus one hot spare, with the whole set then mirrored), this year 1TB single drives are on sale and bigger ones will soon be available.
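The drive count in that example is easy to check with a few lines of Python. This is only a sketch of the arithmetic: the function and constant names are made up for illustration, and it assumes the usual RAID5 rule that one drive's worth of space goes to parity.

```python
def raid5_usable_gb(drives: int, drive_gb: int) -> int:
    """Usable space of a RAID5 set: one drive's worth is lost to parity."""
    if drives < 3:
        raise ValueError("RAID5 needs at least three drives")
    return (drives - 1) * drive_gb


# Figures taken from the example in the text.
DRIVE_GB = 250
RAID5_DRIVES = 5
HOT_SPARES = 1
MIRROR_COPIES = 2  # the whole set is duplicated

usable_gb = raid5_usable_gb(RAID5_DRIVES, DRIVE_GB)
total_drives = (RAID5_DRIVES + HOT_SPARES) * MIRROR_COPIES
raw_gb = total_drives * DRIVE_GB

print(f"usable: {usable_gb}GB from {total_drives} drives ({raw_gb}GB raw)")
```

In other words, the dedicated node spends three times its usable capacity on raw disk, which is the overhead a single cheap 1TB drive per server avoids.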
We can now look to distribute storage across very cheap servers, which probably only have one processor socket but a terabyte of disk space.
Most modern applications don’t actually use a great deal of storage: we may have multi-terabyte disks, but we rarely use all that space. For example, I started writing these 3,000-word columns a few years ago and have used three new versions of Word in that time, but the document size has barely increased at all, growing far more slowly than drive capacity.
So here’s our new model for building resilience: do a Google, using simple, cheap hardware and open-source software that binds it into a file system existing applications can use. We don’t want to have to rewrite our applications to use this file system, as is the case with some massively parallel file systems.
Light the FUSE
FUSE (Filesystem in Userspace) is an open-source package for creating new file systems, which is happiest on Linux but can be run under Mac OS X and, in some cases, even Windows.
File systems are normally created and managed via the OS in “kernel” or “privileged” memory space, but while FUSE itself employs a module that runs under control of the OS kernel, it enables ordinary user applications to create file systems that can be used by other applications.
We’ve mentioned FUSE before in this column as the basis of various useful file system extensions: for example, for Unix systems there are FUSE-based utilities to enable encryption (useful for safely managing USB sticks); access to archive files (treat a zip as a set of directories); and foreign file systems such as NTFS images inside Unix.
By implementing a particular interface to the OS kernel, FUSE enables applications to appear and behave as file systems, while monitoring what those applications do to the file system.
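To make the idea concrete, here is a minimal sketch in plain Python of an object that behaves like a file system while recording every operation performed on it. It is a stand-in, not the real FUSE API: nothing here talks to the kernel, and the class name MonitoredFS and its methods are hypothetical, chosen only to echo the kind of create, read and write callbacks a user-space file system has to supply.

```python
class MonitoredFS:
    """In-memory stand-in for a user-space file system.

    This is NOT the FUSE API and involves no kernel module; it only
    illustrates serving file operations while watching what callers do.
    """

    def __init__(self):
        self.files = {}  # path -> bytearray holding the file contents
        self.log = []    # (operation, path) tuples, in call order

    def create(self, path):
        self.log.append(("create", path))
        self.files[path] = bytearray()

    def write(self, path, data, offset=0):
        self.log.append(("write", path))
        buf = self.files[path]
        if len(buf) < offset:
            # Pad with zero bytes when writing past the current end.
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        return len(data)

    def read(self, path, size, offset=0):
        self.log.append(("read", path))
        return bytes(self.files[path][offset:offset + size])


fs = MonitoredFS()
fs.create("/mail/inbox")
fs.write("/mail/inbox", b"hello")
contents = fs.read("/mail/inbox", 5)
print(contents, fs.log)
```

The log is the interesting part: because every call passes through the file system object, it knows exactly which files have been touched and when.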
This last aspect is going to be very important to our implementation, as it will be the key to distributing the file data. Say our application is a mail server that’s accessing a collection of files on our resilient file system – to ensure resilience we need to make sure each file is duplicated at least once, and that the copy lands on a separate machine.
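The duplication rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than how any particular distributed file system places data: the Node class, pick_nodes and replicated_write are hypothetical names, machines are simulated as in-memory dictionaries, and the placement rule (hash the path, then take consecutive machines) is just one simple way of guaranteeing the copies land on different nodes.

```python
import hashlib


class Node:
    """Stand-in for one cheap storage server (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.files = {}  # path -> bytes stored on this machine


def pick_nodes(path, nodes, copies=2):
    """Choose `copies` distinct machines for a file, based on its path."""
    if copies > len(nodes):
        raise ValueError("not enough machines for the requested copies")
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    # Consecutive positions modulo len(nodes) are distinct machines.
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def replicated_write(path, data, nodes, copies=2):
    """Store the file on every chosen machine and report where it went."""
    targets = pick_nodes(path, nodes, copies)
    for node in targets:
        node.files[path] = data
    return [node.name for node in targets]


cluster = [Node("server-a"), Node("server-b"), Node("server-c")]
placed = replicated_write("/mail/inbox", b"message body", cluster)
print("copies stored on:", placed)
```

Losing any one machine still leaves a complete copy of the file on another, which is the property the mail server example relies on.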
Excellent overview..
Now that's what I call a technical article, well written and not completely over my head :)
This one sent me scurrying to wikipedia for half an hour looking at file systems (which I know zip about). We need more like this guys..
By pinero50 on 18 Nov 2009
Agreed, but
I agree, but at the same time, it would be useful to have a follow-on article on how to set up GlusterFS, for example on Ubuntu.
Especially since Simon seems to be suggesting GlusterFS is the 'winner' out of all the distributed file systems he's written about in this article.
I'd also be interested to know if it would be possible to run this on PCs that are used as desktop computers to provide some kind of cloud backup...
By GAZZAT5 on 22 Nov 2009
anthonysjones
Way Hay! An article that doesn't involve some Windows feature or bug, some laptop with a shiny case or a graphics card. Finally something I can tinker with myself and say "hey, I built that".
How about some articles on setting up and connecting databases to PHP sites? I'm on chapter one of Oracle 10g.
By anthonysjones on 24 Nov 2009
Simon Brock
Simon runs UK-based Wide Area Communications, the company behind websites such as The Spectator. He's a contributing editor to PC Pro and a fervent believer in open-source technologies.

