Build your own distributed file system
Posted on 10 Nov 2009 at 15:13
Simon Brock marvels at Google's resilient distributed file system, and builds his own from simple, cheap hardware and open-source software
When you use Gluster for replication you can have any number of nodes in your file system, and all file operations become “atomic” across that file system – in other words, if your application changes a file via one node then all other nodes see that change immediately.
What’s very clever about the implementation of this in Gluster is that there’s no requirement for a node to be running to see the changes: if a node has been down for a while, when it rejoins the cluster it will heal its own version of the file system.
As an application tries to access a file the cluster will decide which files it should see, and if a particular brick doesn’t have the correct files they will be sent from other nodes.
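To make the heal-on-access idea concrete, here is a minimal Python sketch. It is a toy model, not GlusterFS code: the `Brick` class, the per-file version counter and the `write`/`read` helpers are all assumptions made for illustration, and the real system tracks changes differently.

```python
# Toy model of heal-on-access; NOT GlusterFS's real algorithm or API.
from dataclasses import dataclass, field


@dataclass
class Brick:
    name: str
    online: bool = True
    # path -> (version counter, contents); the counter is an assumption
    files: dict = field(default_factory=dict)


def write(bricks, path, data):
    """Write to every online brick, bumping the version counter."""
    latest = max(
        (b.files.get(path, (0, b""))[0] for b in bricks if b.online),
        default=0,
    )
    for b in bricks:
        if b.online:
            b.files[path] = (latest + 1, data)


def read(bricks, path):
    """Return the newest copy and repair any online brick that is stale."""
    online = [b for b in bricks if b.online]
    copies = [b.files[path] for b in online if path in b.files]
    if not copies:
        raise FileNotFoundError(path)
    newest = max(copies, key=lambda copy: copy[0])
    for b in online:
        if b.files.get(path) != newest:
            b.files[path] = newest  # heal the stale or missing copy
    return newest[1]
```

A brick that missed a write while it was down keeps its old copy until the file is next read, at which point the newer copy is pushed to it.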
This is particularly important because each individual node can act as both a client and a server/brick. An application will always try to read files from its local copy first, rather than going out to the other nodes in the network.
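The local-first read preference can be sketched in a few lines of Python. This is an illustration only: `choose_source` and the dictionaries standing in for bricks are hypothetical names, not part of GlusterFS.

```python
# Illustrative only: dictionaries stand in for bricks (path -> contents).
def choose_source(path, local, remotes):
    """Prefer the local brick; fall back to the first remote holding the file."""
    if path in local:
        return "local", local[path]
    for name, files in remotes.items():
        if path in files:
            return name, files[path]
    raise FileNotFoundError(path)
```

Only when the local copy is missing does the read cross the network to another node.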
The GlusterFS architecture is layered, allowing different translators to be inserted at different points – so, for example, caches can be introduced on both the client and server to speed up accesses.
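One way to picture the layering is as objects that share a single interface and wrap one another. The Python sketch below is an assumption-laden illustration, not the GlusterFS translator API: `Storage` and `ReadCache` are hypothetical classes, and the cache never invalidates its entries, which a real one would have to do.

```python
# Illustrative layering: each "translator" exposes read() and wraps a child.
class Storage:
    """Bottom layer: holds the files and counts how often it is asked."""

    def __init__(self, files):
        self.files = dict(files)
        self.reads = 0

    def read(self, path):
        self.reads += 1
        return self.files[path]


class ReadCache:
    """Caching layer: answers repeat reads without calling the child."""

    def __init__(self, child):
        self.child = child
        self.cache = {}

    def read(self, path):
        if path not in self.cache:
            self.cache[path] = self.child.read(path)
        return self.cache[path]
```

Because both layers offer the same `read` method, the cache can be slotted in or left out without the caller changing.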
A striping translator can be introduced to stripe files across a cluster of bricks, which can be useful for building high-performance clusters, and similarly it’s possible to deploy different network fabrics to improve performance: GlusterFS has native support for 10Gb Ethernet but can fall back to slower Gigabit.
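Striping itself is easy to sketch: a file is cut into fixed-size blocks that are dealt out to the bricks in turn, so that large reads and writes are spread across several machines. The Python below is a toy illustration; the function names and the four-byte block size are assumptions, and it ignores how GlusterFS actually lays out stripes on disk.

```python
# Toy round-robin striping; block size and names are illustrative assumptions.
def stripe(data: bytes, n_bricks: int, block: int = 4):
    """Deal fixed-size blocks of data out to n_bricks lists in turn."""
    bricks = [[] for _ in range(n_bricks)]
    for offset in range(0, len(data), block):
        bricks[(offset // block) % n_bricks].append(data[offset:offset + block])
    return bricks


def unstripe(bricks):
    """Reassemble the original bytes by reading the bricks row by row."""
    out = []
    longest = max((len(brick) for brick in bricks), default=0)
    for row in range(longest):
        for brick in bricks:
            if row < len(brick):
                out.append(brick[row])
    return b"".join(out)
```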
Gluster is simple to install too, and can either be compiled from source or installed from packages. We’ve set up a number of different GlusterFS systems now and have found them to be very reliable once configured.
We did have some problems, but these were mostly caused by our own experiments with the configuration. We'd recommend that you decide before you start what you want to achieve and then configure for that specific setup, rather than setting up one configuration and then trying to change it to suit. For example, adding more nodes to a working GlusterFS cluster proved unreliable for us: it's better to start with as many as you need.
Gluster is certainly the most powerful of the products we looked at, but with any one of them it's now possible to build a reliable clustered file system for your applications for just a few hundred pounds. Just a few years ago, the hardware alone would have cost many thousands.
Excellent overview..
Now that's what I call a technical article, well written and not completely over my head :)
This one sent me scurrying to wikipedia for half an hour looking at file systems (which I know zip about). We need more like this guys..
By pinero50 on 18 Nov 2009
Agreed, but
I agree, but at the same time, it would be useful to have a follow-on article on how to set up GlusterFS, for example on Ubuntu.
Especially since Simon seems to be suggesting GlusterFS is the 'winner' out of all the distributed file systems he's written about in this article.
I'd also be interested to know if it would be possible to run this on PCs that are used as desktop computers to provide some kind of cloud backup...
By GAZZAT5 on 22 Nov 2009
anthonysjones
Way Hay! An article that doesn't involve some Windows feature or bug, some laptop with a shiny case or a graphics card. Finally something I can tinker with myself and say "hey, I built that".
How about some articles on setting up and connecting databases to PHP sites? I'm on chapter one of Oracle 10g.
By anthonysjones on 24 Nov 2009
Simon Brock
Simon runs UK-based Wide Area Communications, the company behind websites such as The Spectator. He's a contributing editor to PC Pro and a fervent believer in open-source technologies.

