Build your own distributed file system
Posted on 10 Nov 2009 at 15:13
Simon Brock marvels at Google's resilient distributed file system, and builds his own from simple, cheap hardware and open-source software
In operation, MySQL FS is simple enough to get going: compile the code, set up the tables, change the configuration files and you’re off. With appropriate use of the caches in the database management system, you can get very good performance.
The reliance on MySQL replication for file system replication isn’t the impediment one might fear, because the underlying replication mechanism in MySQL is very good. The trouble with MySQL FS is that it places a complex layer of abstraction between itself and the underlying file system, so you end up implementing a file system on top of a database on top of a file system.
While you should still be able to make that work reasonably efficiently, there’s always going to be a question mark over the number of levels of mapping required between the file system the application sees and the primitive operations that write and read data to and from disk.
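To make that layering concrete, here is a minimal sketch of a file stored as rows in a database table. It is not the MySQL FS schema: SQLite stands in for MySQL so the example is self-contained, and the `blocks` table, the 4,096-byte `BLOCK_SIZE` and the function names are all assumptions made for illustration.

```python
import sqlite3

BLOCK_SIZE = 4096  # assumed block size, chosen only for this illustration


def open_store(db_path=":memory:"):
    """Open the database that plays the role of the 'disk'."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blocks ("
        "path TEXT, seq INTEGER, data BLOB, PRIMARY KEY (path, seq))"
    )
    return conn


def write_file(conn, path, content):
    """Mapping level 1: one file becomes a sequence of table rows."""
    chunks = [content[i:i + BLOCK_SIZE] for i in range(0, len(content), BLOCK_SIZE)]
    with conn:  # one transaction: replace any old blocks with the new ones
        conn.execute("DELETE FROM blocks WHERE path = ?", (path,))
        conn.executemany(
            "INSERT INTO blocks (path, seq, data) VALUES (?, ?, ?)",
            [(path, n, chunk) for n, chunk in enumerate(chunks)],
        )


def read_file(conn, path):
    """Reassemble the file from its rows, in block order."""
    rows = conn.execute(
        "SELECT data FROM blocks WHERE path = ? ORDER BY seq", (path,)
    ).fetchall()
    return b"".join(row[0] for row in rows)
```

Every `write_file` call is turned into SQL, which the database engine then turns into reads and writes against its own data files on the real file system. Those are the extra levels of mapping in question.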
Hence a different solution, SeznamFS, which uses a replication style similar to MySQL’s but applies it to basic file system operations.
In SeznamFS a file system is presented to an application, and whenever that application changes a file or its metadata, the operation is recorded in a log file. (I haven’t mentioned metadata yet, but it’s basically the information about each file in the file system, such as when it was last accessed, how many applications have it open and so on.)
Another node in the file system cluster can then read this log file and apply those same changes to its own files. You can now build a hierarchy of nodes that allows changes made on one system to be reflected on all the others, and also implement circular replication that puts machines “at the same level”, so that a change on one machine shows up on the others in that cluster.
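The log-and-replay idea can be sketched in a few lines. This is a toy model, not SeznamFS code: the `Node` class, its in-memory log and the origin tag on each entry are assumptions for illustration. The origin tag borrows the trick MySQL uses with server IDs, so that in a circular setup a node ignores its own changes when they come back around.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy file-system node: file contents plus an append-only operation log."""
    name: str
    files: dict = field(default_factory=dict)
    log: list = field(default_factory=list)       # entries: (origin, kind, path, data)
    position: dict = field(default_factory=dict)  # upstream name -> entries already read

    def write(self, path, data):
        """Local change: apply it, then record that it happened."""
        op = (self.name, "write", path, data)
        self._apply(op)
        self.log.append(op)

    def delete(self, path):
        op = (self.name, "delete", path, None)
        self._apply(op)
        self.log.append(op)

    def _apply(self, op):
        _origin, kind, path, data = op
        if kind == "write":
            self.files[path] = data
        elif kind == "delete":
            self.files.pop(path, None)

    def pull_from(self, upstream):
        """Read upstream's unseen log entries and replay them locally.

        Returns the number of operations actually applied.
        """
        start = self.position.get(upstream.name, 0)
        applied = 0
        for op in upstream.log[start:]:
            if op[0] == self.name:
                continue  # our own change has come full circle: drop it
            self._apply(op)
            self.log.append(op)  # re-log so nodes further down the chain see it
            applied += 1
        self.position[upstream.name] = len(upstream.log)
        return applied
```

Chaining `pull_from` calls from node to node gives the hierarchy, and closing the chain into a ring gives circular replication. Note that the sketch makes no attempt to resolve conflicting writes to the same path on two nodes; the last operation replayed simply wins.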
The result is equivalent to MySQL replication, and has all the advantages and disadvantages discussed earlier.
Setting up SeznamFS was also very easy – again download, compile and install. Performance was good and there were no concerns about permitting replication over long distances, such as between hosting centres separated by a relatively slow communications link.
If your distributed application can live with updates that arrive slightly later across a cluster, then SeznamFS works well. To some extent the logging system used by SeznamFS is a potential source of unreliability, as the client polls for changes and then applies them (which is how replication used to work in MySQL).
In current MySQL systems the client polls for changes and stores them before applying them, so if the master server were to go down before the client had applied all the updates, the latter can still synchronise properly as it already has all the updates stored locally.
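The difference between the two approaches is easy to show in miniature. The sketch below is an assumption-laden model rather than MySQL’s actual implementation: `Master`, `Replica` and the in-memory `relay` list are hypothetical names, and a real system would write the stored updates to disk rather than hold them in a Python list.

```python
class Master:
    """Holds the authoritative log; can be switched off to simulate a crash."""

    def __init__(self):
        self.log = []
        self.up = True

    def append(self, path, data):
        self.log.append((path, data))

    def read_from(self, pos):
        if not self.up:
            raise ConnectionError("master unreachable")
        return self.log[pos:]


class Replica:
    """Fetches updates into a local store first, and applies them separately."""

    def __init__(self):
        self.relay = []   # updates copied from the master, not necessarily applied yet
        self.fetched = 0  # how far into the master's log we have copied
        self.applied = 0  # how far into the local store we have replayed
        self.files = {}

    def fetch(self, master):
        """Step 1: copy new entries from the master and store them locally."""
        new = master.read_from(self.fetched)
        self.relay.extend(new)
        self.fetched += len(new)

    def apply_pending(self):
        """Step 2: replay stored entries; this needs no contact with the master."""
        while self.applied < len(self.relay):
            path, data = self.relay[self.applied]
            self.files[path] = data
            self.applied += 1
```

Because `apply_pending` works only from the local store, a replica that has already fetched its updates can finish synchronising even if the master disappears in the meantime. A design that reads and applies in one step would be left waiting on the dead master.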
The Gluster cluster
While both MySQL FS and SeznamFS are simple systems, GlusterFS is far more complex. In Gluster, a resilient file system is built out of simple “brick” systems, with more complex file system behaviour layered on top of them. You start by creating these bricks, which contain the underlying file systems where your real files will be stored.
These bricks can be built on virtually any Unix OS and any file system, and since the files themselves are stored as ordinary files in the file system, you can use the underlying file system’s own tools on them. For example, you could build a brick on a Linux system using the Logical Volume Manager (LVM), or on Solaris using ZFS, and then use snapshots to create backups.
Excellent overview..
Now that's what I call a technical article, well written and not completely over my head :)
This one sent me scurrying to wikipedia for half an hour looking at file systems (which I know zip about). We need more like this guys..
By pinero50 on 18 Nov 2009
Agreed, but
I agree, but at the same time, it would be useful to have a follow-on article on how to set up GlusterFS, for example on Ubuntu.
Especially since Simon seems to be suggesting GlusterFS is the 'winner' out of all the distributed file systems he's written about in this article.
I'd also be interested to know if it would be possible to run this on PCs that are used as desktop computers to provide some kind of cloud backup...
By GAZZAT5 on 22 Nov 2009
anthonysjones
Way Hay! An article that doesn't involve some Windows feature or bug, some laptop with a shiny case or a graphics card. Finally something I can tinker with myself and say "hay I built that".
How about some articles on setting up and connecting databases to PHP sites? I'm on chapter one of Oracle 10g.
By anthonysjones on 24 Nov 2009
Simon Brock
Simon runs UK-based Wide Area Communications, the company behind websites such as The Spectator. He's a contributing editor to PC Pro and a fervent believer in open-source technologies.

