SAN on the cheap
Simon Brock and Ian Wrigley use off-the-peg open-source software to build a basic implementation of a SAN using iSCSI
The Storage Area Network, or SAN, is one of those topics on which a lot of people make a lot of noise, but very few really know what it is or why they might want one. One of the main reasons is that once you sit down and add up the cost of what you need to build a SAN, things start to look really expensive. So we're going to take a look at what a SAN is, why you'd want one and, of course, how open source can help you build one on the cheap.
Depending on your IT background, a SAN is either a great new idea or a great old idea, but let's skip the history lesson and pretend the SAN is new. In a conventional PC, everything is installed in one box: the processor, the memory and the disk drives. As you add more disk drives, it gets harder to fit them into the same box, so they end up being pushed out into an external box. This box of disks may have some abilities of its own; for example, it may implement some form of resilience based on striping, mirroring or both via a RAID controller. What becomes important is how this box of disks is attached to the box containing the processor.
SCSI is one technology for doing this, and it works well over short distances but has problems with long cables. Fibre Channel was invented to remedy this: put simply, it's a way of sending SCSI commands over optical fibre rather than electrical cable. Fibre Channel lets you do some other neat things too, such as switching - similar to Ethernet switching - to get many PCs talking to the same box of disks, or a single PC talking to many boxes of disks. Moreover, these machines can have alternate paths to their data on the boxes of disks. That, basically, is a SAN: a collection of machines connected to boxes of disks via a fibre-optic network. It runs very quickly because of the fibre - when first introduced, Fibre Channel ran at 1Gb/sec and it will soon be available at 10Gb/sec.
At first glance, it may not be obvious why a SAN is such a good idea. If you have a lot of data that needs to be accessed from a lot of different places, with the option of redundant paths to it, a SAN is one way to achieve this. It also lets you spend your storage budget centrally rather than spreading it around a lot of machines.
However, some manufacturers recently figured out that 1Gb/sec could be provided more cheaply than via an expensive fibre infrastructure - 1000BaseT Ethernet runs at 1Gb/sec over cheap copper cable. But to make a SAN work over 1000BaseT, they had to find a way of sending SCSI commands over Ethernet. Enter iSCSI, which is all about sending SCSI commands in IP packets. What this means is that if you have a machine and a box of disks that both support iSCSI, you can build a SAN over your existing Ethernet. That lets you use all the neat features that Ethernet and IP provide for sorting out routes and alternate paths, without worrying about how a Fibre Channel switch would do it. The iSCSI SAN was born and is now supported by a large number of manufacturers. In the rest of this article, we're going to show you how to put together a reliable and resilient iSCSI SAN, based (for the most part) on open-source software.
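To make "SCSI commands in IP packets" a little more concrete, here is a minimal Python sketch of how a single SCSI READ(10) command might be wrapped in the 48-byte header of an iSCSI SCSI Command PDU. It's an illustration rather than a working initiator: the function names are our own invention, the LUN encoding is simplified to a single-level LUN below 256, and the login phase, digests and the TCP connection (iSCSI targets normally listen on port 3260) are all left out.

```python
import struct

ISCSI_OP_SCSI_COMMAND = 0x01   # iSCSI opcode for a SCSI Command PDU
FLAG_FINAL = 0x80              # this is the final PDU for the command
FLAG_READ = 0x40               # the initiator expects data back
SCSI_READ_10 = 0x28            # SCSI READ(10) opcode


def build_read10_cdb(lba: int, blocks: int) -> bytes:
    """Build a 10-byte SCSI READ(10) command descriptor block (CDB)."""
    # opcode, flags, 4-byte logical block address, group number,
    # 2-byte transfer length in blocks, control byte
    return struct.pack(">BBIBHB", SCSI_READ_10, 0, lba, 0, blocks, 0)


def build_scsi_command_header(cdb: bytes, lun: int, task_tag: int,
                              expected_len: int, cmd_sn: int,
                              exp_stat_sn: int) -> bytes:
    """Wrap a CDB in a 48-byte iSCSI header (hypothetical helper)."""
    if len(cdb) > 16:
        raise ValueError("longer CDBs need an additional header segment")
    if not 0 <= lun < 256:
        raise ValueError("this sketch only handles simple LUNs below 256")
    lun_field = struct.pack(">H6x", lun)  # simplified 8-byte LUN field
    return struct.pack(
        ">BB2xB3s8sIIII16s",
        ISCSI_OP_SCSI_COMMAND,      # byte 0: opcode
        FLAG_FINAL | FLAG_READ,     # byte 1: flags
        0,                          # byte 4: no additional header segments
        b"\x00\x00\x00",            # bytes 5-7: no data segment on a read
        lun_field,                  # bytes 8-15: logical unit number
        task_tag,                   # bytes 16-19: initiator task tag
        expected_len,               # bytes 20-23: bytes we expect back
        cmd_sn,                     # bytes 24-27: command sequence number
        exp_stat_sn,                # bytes 28-31: expected status sequence number
        cdb.ljust(16, b"\x00"),     # bytes 32-47: the SCSI CDB, zero-padded
    )


# Ask for 8 blocks starting at block 2048, assuming 512-byte blocks.
cdb = build_read10_cdb(lba=2048, blocks=8)
header = build_scsi_command_header(cdb, lun=0, task_tag=1,
                                   expected_len=8 * 512,
                                   cmd_sn=1, exp_stat_sn=1)
print(len(header), header.hex())
```

The point to take away is that the SCSI command itself travels unchanged inside the header; iSCSI simply adds the bookkeeping (task tags and sequence numbers) needed to carry it reliably over an ordinary TCP/IP connection.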
Initiators and targets
Before going on, we need to know a bit of iSCSI terminology. Under iSCSI, there are two parties: the initiator (the machine that wants the data) and the target (the machine providing the data). In a Fibre Channel SAN, the initiator is typically a computer and the target a RAID array, but in our example we're going to build an iSCSI target out of two PCs in an active-passive pair. The idea is that one machine will act as the target and handle iSCSI requests from the initiators, mirroring its disks to another machine that acts as a hot standby. If the active machine fails, the standby machine will take over the iSCSI service, and when the failed machine returns it will resynchronise with the standby machine and then take back the service. This is ambitious stuff, and as we write this article we don't know if it's going to work; it should, but we'll see.
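The failover behaviour we're aiming for can be sketched as a toy model in Python. None of this is real iSCSI or mirroring code: the `Node` and `ActivePassivePair` classes are hypothetical, disks are modelled as dictionaries of blocks, and resynchronisation is a full copy, which a real setup would want to avoid. It's only meant to pin down the sequence of events: mirror while both machines are up, fail over when the active one dies, then resynchronise and fail back.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One PC in the pair; 'blocks' stands in for its disk."""
    name: str
    up: bool = True
    blocks: dict = field(default_factory=dict)


class ActivePassivePair:
    """Toy model of a two-machine iSCSI target with a hot standby."""

    def __init__(self, primary: Node, standby: Node):
        self.primary = primary
        self.standby = standby
        self.active = primary          # the machine currently serving requests

    def _peer(self, node: Node) -> Node:
        return self.standby if node is self.primary else self.primary

    def write(self, lba: int, data: bytes) -> None:
        # Write locally, then mirror to the other machine if it is up.
        self.active.blocks[lba] = data
        peer = self._peer(self.active)
        if peer.up:
            peer.blocks[lba] = data

    def read(self, lba: int) -> bytes:
        return self.active.blocks[lba]

    def fail(self, node: Node) -> None:
        # Mark a machine as down; if it was serving, the peer takes over.
        node.up = False
        if node is self.active:
            peer = self._peer(node)
            if not peer.up:
                raise RuntimeError("both machines are down")
            self.active = peer

    def recover(self, node: Node) -> None:
        # Resynchronise from whichever machine is serving, then let the
        # preferred primary take the service back.
        node.blocks = dict(self.active.blocks)
        node.up = True
        if node is self.primary:
            self.active = node


pair = ActivePassivePair(Node("san1"), Node("san2"))
pair.write(0, b"before failure")
pair.fail(pair.primary)                # san2 takes over
pair.write(1, b"written during outage")
pair.recover(pair.primary)             # san1 resyncs and takes back the service
print(pair.active.name, pair.read(1))
```

Note that the write made during the outage is still readable after failback, because the returning machine copies the standby's data before it resumes service. Getting that ordering right is the crux of the whole design.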