Build your own distributed file system
Posted on 10 Nov 2009 at 15:13
Simon Brock marvels at Google's resilient distributed file system, and builds his own from simple, cheap hardware and open-source software
Whenever our application talks to the resilient file system, this duplication could be managed in one of two ways: either it could talk to one node that then passes on any changes to the other nodes, or it could talk to all nodes at the same time.
The advantage of talking to a single node is simplicity, but there are a number of disadvantages. For example, if our application is talking to a node that suddenly goes down, how do we know whether the changes we made to that node were ever propagated to the others?
The multinode method avoids such problems, guaranteeing that if a file is on one node then it must be on some others too, and any node that doesn’t have a copy of the file knows where to find it.
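To make the difference concrete, here is a minimal in-memory sketch of the two approaches. The `Node` class and the `write_single` and `write_all` helpers are hypothetical names invented for this illustration; they don't come from any real distributed file system, and a simulated crash stands in for a genuine node failure.

```python
class Node:
    """A hypothetical storage node that keeps its files in memory."""

    def __init__(self, name):
        self.name = name
        self.files = {}
        self.up = True

    def store(self, path, data):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.files[path] = data


def write_single(primary, replicas, path, data, crash_before_propagation=False):
    """Talk to one node, which then passes the change on to the others."""
    primary.store(path, data)
    if crash_before_propagation:
        # Simulate the primary dying after accepting the write
        # but before it has told anyone else about it.
        primary.up = False
        return
    for node in replicas:
        node.store(path, data)


def write_all(nodes, path, data):
    """Talk to every node directly; return the names of nodes that stored the file."""
    stored = []
    for node in nodes:
        try:
            node.store(path, data)
            stored.append(node.name)
        except ConnectionError:
            pass  # the caller can see this node is missing from the result
    return stored


# Single-node route: the primary fails before propagating the change.
a, b, c = Node("a"), Node("b"), Node("c")
write_single(a, [b, c], "/index.html", b"hello", crash_before_propagation=True)
single_copies = [n.name for n in (b, c) if "/index.html" in n.files]

# Multinode route: one node is down, but the application knows which ones succeeded.
x, y, z = Node("x"), Node("y"), Node("z")
y.up = False
multi_copies = write_all([x, y, z], "/index.html", b"hello")
```

In the first case the application has no way of telling that `b` and `c` never received the file; in the second it gets back an explicit list of the nodes that hold a copy.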
While that might tempt you to dismiss the single-node route as flaky, many applications – particularly web servers – hold files that rarely change: they are written only once but read many times.
That means that read resilience is more important than write resilience: even when a file does change, various levels of caching between the user’s web browser and the server may mean it's some time before that change is visible to all users.
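Read resilience of this kind can be sketched in a few lines. The `read_with_failover` function below is a hypothetical helper written for this example: each replica is represented by a plain dictionary, and `None` stands in for a node that is down, in place of real network calls.

```python
def read_with_failover(replicas, path):
    """Try each replica in turn and return the first copy found.

    `replicas` is a list of (name, files) pairs, where `files` is a dict
    of path -> contents, or None if that node is unreachable.
    """
    for name, files in replicas:
        if files is None:
            continue  # node is down, try the next one
        if path in files:
            return name, files[path]
    raise FileNotFoundError(path)


replicas = [
    ("web1", None),                    # failed node
    ("web2", {"/logo.png": b"PNG"}),   # healthy node holding a copy
]
served_by, data = read_with_failover(replicas, "/logo.png")
```

As long as one replica with a copy is still up, the read succeeds, which is all a mostly static website needs.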
Some database administrators will already recognise the scenarios just described: MySQL, for example, supports two ways of replicating changes across a collection of database servers. The simpler is to use binary log file replication, where changes to a table on one server are sent to the other databases in the cluster.
The important point is that the application making a change will think it’s succeeded once it has finished on one node, which works well enough but can lead to small replication delays and problems when a server fails.
The more resilient mode is to use MySQL Cluster, but that needs more resources and requires every change to succeed on all nodes before the application is told it has completed.
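The difference in when the application is told "done" can be shown with a toy model. This is only a sketch of the two acknowledgement styles described above, not of how MySQL implements either mode: `AsyncPrimary` and `sync_write` are hypothetical names, and plain dictionaries stand in for database servers.

```python
from collections import deque


class AsyncPrimary:
    """Acknowledges a change as soon as it has been applied locally."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas  # dicts standing in for the other servers
        self.log = deque()        # changes not yet sent to the replicas

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))
        return True  # the application is told the change succeeded right away

    def ship_log(self):
        """Send pending changes on; a real system would do this in the background."""
        while self.log:
            key, value = self.log.popleft()
            for replica in self.replicas:
                replica[key] = value


def sync_write(nodes, key, value):
    """Report success only if every node is reachable, then apply everywhere."""
    if any(node is None for node in nodes):  # None stands in for a failed server
        return False
    for node in nodes:
        node[key] = value
    return True


replica = {}
primary = AsyncPrimary([replica])
acked = primary.write("row1", "v")
lagging = "row1" not in replica      # acknowledged, but not yet replicated
primary.ship_log()
caught_up = replica.get("row1") == "v"

sync_ok = sync_write([{}, {}], "row2", "v")
sync_blocked = sync_write([{}, None], "row2", "v")
```

Between `write` and `ship_log` the replica is behind the primary; if the primary were lost in that window, the acknowledged change would go with it. The synchronous version never has that window, at the price of refusing writes whenever a node is unavailable.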
MySQL FS and SeznamFS
If the replication methods employed by a resilient file system are so similar to those employed by a database, why not use a database as the back-end to our file system?
This is the basic idea behind MySQL FS, which uses a MySQL database to hold all the file system information. If we ignore the replication aspect, then at first glance using a database to underpin a file system sounds downright stupid, as after all file systems need to be fast while databases are slow. However, on reflection it may not be as stupid as you think, for several reasons.
First, various operating systems over the years have attempted such a fusion between file system and database. In the dinosaur world of IBM mini-computers, System/38 led the way, while more recently Microsoft hinted at combining SQL Server with the file system for Windows Server, although in the end it was never released.
MySQL FS isn’t as sophisticated as either of those systems, but it does work. A modern file system has to implement a very similar storage architecture to that employed by a database – it supports directories organised in a tree-like hierarchy, while a database has tables linked together via relations.
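To show how naturally a directory tree maps onto related tables, here is a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server, and the `entries` table, its columns and the helper functions are all assumptions made for this example; this is not the schema MySQL FS actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE entries (
           id        INTEGER PRIMARY KEY,
           parent_id INTEGER REFERENCES entries(id),
           name      TEXT NOT NULL,
           data      BLOB,              -- NULL for directories
           UNIQUE (parent_id, name)
       )"""
)
# Row 1 is the root directory.
conn.execute("INSERT INTO entries (id, parent_id, name) VALUES (1, NULL, '')")


def add_entry(parent_id, name, data=None):
    """Create a directory (data=None) or a file under the given parent."""
    cur = conn.execute(
        "INSERT INTO entries (parent_id, name, data) VALUES (?, ?, ?)",
        (parent_id, name, data),
    )
    return cur.lastrowid


def lookup(path):
    """Walk the tree one path component at a time, as a file system would."""
    current = 1
    for part in (p for p in path.split("/") if p):
        row = conn.execute(
            "SELECT id FROM entries WHERE parent_id = ? AND name = ?",
            (current, part),
        ).fetchone()
        if row is None:
            raise FileNotFoundError(path)
        current = row[0]
    return current


def read_file(path):
    """Return the stored contents for a file path."""
    row = conn.execute(
        "SELECT data FROM entries WHERE id = ?", (lookup(path),)
    ).fetchone()
    return row[0]


www = add_entry(1, "www")
add_entry(www, "index.html", b"<h1>hi</h1>")
contents = read_file("/www/index.html")
```

Each directory level costs one indexed query here, which is the database equivalent of a file system reading one directory block per path component.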
There’s also a similarity between the way an application must walk over this directory structure and the way it accesses data held in linked tables. In both cases, remembering the results of expensive operations can lead to substantially improved performance – a database caches the results of querying table indices, while an OS caches the results of querying directory structures.
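The caching idea is the same in both worlds, and can be sketched with Python's standard `functools.lru_cache`. The `TREE` dictionary and the `resolve` function are hypothetical stand-ins for a real directory structure and a real path lookup; the `walks` counter exists only to show how often the expensive walk actually happens.

```python
from functools import lru_cache

# A hypothetical directory tree: nested dicts are directories, bytes are files.
TREE = {"www": {"img": {"logo.png": b"PNG"}}}
walks = 0  # counts how many times the tree is really traversed


@lru_cache(maxsize=1024)
def resolve(path):
    """Walk the tree component by component; results are remembered per path."""
    global walks
    walks += 1
    node = TREE
    for part in (p for p in path.split("/") if p):
        node = node[part]
    return node


first = resolve("/www/img/logo.png")   # walks the tree
second = resolve("/www/img/logo.png")  # served from the cache
```

The second call never touches the tree. The catch, for a file system as for a database, is invalidation: whenever the underlying structure changes, stale entries must be discarded, which in this sketch would mean calling `resolve.cache_clear()`.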
Excellent overview..
Now that's what I call a technical article, well written and not completely over my head :)
This one sent me scurrying to wikipedia for half an hour looking at file systems (which I know zip about). We need more like this guys..
By pinero50 on 18 Nov 2009
Agreed, but
I agree, but at the same time, it would be useful to have a follow-on article on how to set up GlusterFS, for example on Ubuntu.
Especially since Simon seems to be suggesting GlusterFS is the 'winner' out of all the distributed file systems he's written about in this article.
I'd also be interested to know if it would be possible to run this on PCs that are used as desktop computers to provide some kind of cloud backup...
By GAZZAT5 on 22 Nov 2009
anthonysjones
Way Hay! An article that doesn't involve some Windows feature or bug, some laptop with a shiny case or a graphics card. Finally something I can tinker with myself and say "hay I built that".
How about some articles on setting up and connecting databases to PHP sites? I'm on chapter one of Oracle 10g.
By anthonysjones on 24 Nov 2009
Simon Brock
Simon runs UK-based Wide Area Communications, the company behind websites such as The Spectator. He's a contributing editor to PC Pro and a fervent believer in open-source technologies.