Raiding your Hardware

February 1997

We are all aware of how the cost of computers keeps falling and falling. A major part of all that is the cost of the memory in our computers.

Memory comes in two forms: the random access memory attached to the processor, and the secondary memory (in the form of disk drives, generally) that is a peripheral to the processor. Both forms of memory have gotten larger over the years, and in the memory business, making something larger reduces your cost per byte, since it costs roughly the same to make a chip, or a drive, almost regardless of the size of that chip or drive. At any rate, the costs do not go up anything like linearly.

The first computer I worked on professionally was an IBM 1800, which maxed out at 16K of memory, and contained four 4K core planes. One of these core planes failed, and I was told that the cost of a replacement plane was $25,000. This was in about 1968. Core memory was quite expensive to make, because the memory consisted of little iron doughnuts that were hand strung into these core planes. My first boss had worked for Control Data Corporation, and even then they were sending these units overseas where labor was much cheaper for the final fabrication step. CDC referred to their Asian memory factory as the Hong Kong Core House.

Core memories were slow by today's standards, and very expensive, but the good news was they could retain their memory even when the power was turned off. A fellow I know at Union Carbide told me that one of their chemical plants burned down, including the process control computer room. But they were able to salvage the core planes from the rubble, and even after a fire and water and everything else, still read out the memory to see what the computers were doing in the last few seconds before the plant exploded. Actually, they did it to see if they could get some information to cover their cabooses when the blame for the incident started being handed out.

Disk drives have been around in one form or another since sometime in the mid 50's. The early mainframe disk drives were very impressive to behold. When I first joined World Wide Widgets, one of the mainframe computers we used then was an IBM 7010, and it had a disk drive that stood about eight feet high, and had a glass front so you could see the drives spinning and the arms going in and out. When those arms moved, the drive just about danced around the raised floor, and the whole building shook. This thing probably held all of a few megabytes.

There have been several different disk drive technologies developed over the years, from fixed head drums, to cartridge disks, to removable pack drives, to finally the technology that we all more or less use today, the Winchester technology developed by IBM in the mid 70's. This technology consists of several platters, with the arms and actuators and everything else all contained inside the hermetically sealed drive. When this first came out, everybody thought it was just another restraint of trade gimmick, because while there were disk pack manufacturers, and disk drive manufacturers, only IBM at that time did both, and the Winchester drive contains both inside the device. However, because of the sealing, and the advance of clean rooms, the manufacturers were able to spin the drive faster (now up to 10,000 rpm), and put the flying heads closer to the surface, without worrying about the head crashes from dust and grime that always plagued the disk pack technology. The closer the head could be brought to the surface, the higher the bit density that the drive could contain, and therefore the more storage you could squeeze in for about the same price as whatever was around the day before. Eventually other manufacturers developed the technology, and today companies like Conner and Maxtor have a significant piece of the disk pie.

In 1980, a 20MB drive stood a foot high and 19 inches wide, and cost about $1000/MB. Now in the mid 90's, a 1GB drive stands two inches high, is 3.5 inches wide, and costs $0.10/MB. I would suspect that you can find mainframe drives that hold two digit Gigabytes of storage per spindle, but I also suspect that they would be physically large, and expensive, and either slow or full of high tech things like multiple actuators and huge cache memories, all of which make these things even more expensive.

In 1988, a few grad students from Cal Bizerkly published a paper titled "A Case for Redundant Arrays of Inexpensive Disks", or Raids. Their seminal thought was that it would be a lot cheaper to tie together a whole bunch of small PC-like disks, than to continue developing mainframe size disk drives of the same storage capability. And, you could make it more reliable, and possibly faster.

What, they asked, is the significant difference between a cheap PC disk and a high performance, high capacity mainframe drive? One is larger cylinder size, the number of bytes that you can access without moving the head. The second is improved reliability, which costs lots of dinero. And finally, generally increased speed because of higher rotation speeds, and more sophisticated controllers (channels in IBM terminology).

The Raid specification enumerates six different types of disk configurations, and the industry has come up with at least one more since then. Three of the six original configurations were never really developed. Each configuration has a number: the original six are Raid 0 through Raid 5, and the later addition is Raid 10. A Raid system will give you impressive reliability, and generally adequate performance, and has the potential of disk sizes large enough for anything but enterprise wide situations.

Raid 0 gangs together N disk drives (like four or five, maybe more) in such a manner that each track on each drive is considered an extension of the same track on the previous drive. That is, a single disk will write all the sectors of track 0, then track 1, and so on. With Raid 0 and four disks, you write track 0 disk 0, then track 0 disk 1, t0 d2, and t0 d3, before moving to track 1 disk 0. Thus, the head only moves a fourth as much, and with a somewhat more sophisticated controller with full raid track buffers, all four disk drives can be written to simultaneously. Studies have shown that it takes only 10% more time to write to all the drives than to a single large one.
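The track ordering described above can be sketched in a few lines of Python. This is only an illustration of the addressing idea, not how any particular controller does it; the function name stripe_location and the notion of a "logical track" number are assumptions made for the example.

```python
def stripe_location(logical_track, num_disks):
    """Map a logical track number to (disk, track_on_disk).

    Hypothetical helper: assumes striping one whole track at a time,
    as in the description above.
    """
    if num_disks < 1:
        raise ValueError("need at least one disk")
    if logical_track < 0:
        raise ValueError("logical track cannot be negative")
    disk = logical_track % num_disks            # rotate across the drives
    track_on_disk = logical_track // num_disks  # advance only after a full pass
    return disk, track_on_disk


# With four disks, the first six logical tracks land here:
layout = [stripe_location(n, 4) for n in range(6)]
for n, (disk, track) in enumerate(layout):
    print(f"logical track {n}: disk {disk}, track {track}")
```

Logical tracks 0 through 3 all sit on track 0 of the four drives, and only logical track 4 forces a head to move to track 1.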

The downside of this is reliability. You are using cheap drives here, and even though the Mean Time Between Failure is quite high, say 100,000 hours, it is nothing compared to a mainframe drive. And if you have N drives in your Raid 0, any one failure takes down the whole array, so the MTBF is 1/N of any one disk, or only 25,000 hours with four drives. The more drives in your Raid, the fewer hours of MTBF.
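The arithmetic is easy to check. The sketch below assumes the drives fail independently at a constant rate and that losing any one drive loses the array; the function name array_mtbf_hours is made up for the example.

```python
HOURS_PER_YEAR = 24 * 365


def array_mtbf_hours(drive_mtbf_hours, num_drives):
    """MTBF of an array that dies when any single drive dies.

    Assumption: independent drives with constant failure rates,
    so the failure rates simply add up.
    """
    if num_drives < 1:
        raise ValueError("need at least one drive")
    return drive_mtbf_hours / num_drives


mtbf = array_mtbf_hours(100_000, 4)
years = mtbf / HOURS_PER_YEAR
print(f"{mtbf:.0f} hours, or about {years:.1f} years")
```

Four 100,000 hour drives give an array figure of 25,000 hours, a bit under three years of round the clock running.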

Raid 1 is a mirrored backup. Each disk is mirrored by its controller to another identical disk. This does cost you in that you need twice as many drives, but then again, they are all cheap. The technology today is such that if a drive fails, the system continues on with the non failed drive, and if you have a hot swap drive capability, you simply pull out the failed drive, plug in a spare one, and the system automatically resynchronizes the new disk, and then puts the mirrored system back fully online.

Raid 10 is rather like Raid 1, except rather than mirror individual disks, you have two full Raid 0 configurations, each mirroring the other. This way you get the performance of Raid 0, with a much higher reliability.

Raid 2, 3 and 4 turned out to be academic exercises. Raid 5 is the most exciting. This configuration requires one extra disk, so if you started with a 4 drive Raid 0 system, under Raid 5 you would require five drives. One of the disks is a parity disk, and is built such that if any one disk fails, all the data can be recomputed from the remaining disks plus the parity disk. The first semiconductor memory came out this way, using error correction bits. This just extends the concept to a disk. Actually, Raid 5 is slightly more complicated than my above statement, because in fact the parity is spread across all the available drives rather than kept on one of them, but I think you get the general idea.
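Here is a small Python sketch of the parity idea, using the simplified "one parity disk" picture from the paragraph above. It assumes the parity is a plain byte by byte exclusive-or of the data blocks; the function name xor_blocks and the sample data are invented for the illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data "disks" and one parity "disk" (sample contents are made up).
data = [b"RAID", b"five", b"demo", b"1997"]
parity = xor_blocks(data)

# Pretend disk 2 has failed: rebuild it from the survivors plus parity.
survivors = data[:2] + data[3:]
rebuilt = xor_blocks(survivors + [parity])
print(rebuilt == data[2])
```

The same trick works whichever single disk is lost, which is why one extra drive is enough. Lose two at once and there is nothing to recompute from.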

Raid 5 gives you the advantages of Raid 0, at least for reading, with the reliability of Raid 1, but without the need for all the extra disk drives. However, Raid 5 slows down considerably when you are writing, and really slows down if you are running with one busted drive. It can also get quite expensive, because the controller for a Raid 5 configuration is quite expensive, compared to the simple controllers needed by Raid 0 and 1 systems.
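To put numbers on "all the extra disk drives", the following sketch counts the drives each configuration needs for a given amount of data, using only the rules stated above: Raid 0 adds nothing, mirroring doubles the count, and Raid 5 adds one. The function name drives_needed is an assumption for the example.

```python
def drives_needed(data_drives, level):
    """Total drives required to hold data_drives worth of data.

    Hypothetical helper covering only the levels discussed here.
    """
    if data_drives < 1:
        raise ValueError("need at least one data drive")
    if level == 0:
        return data_drives          # no redundancy at all
    if level in (1, 10):
        return 2 * data_drives      # every drive is mirrored
    if level == 5:
        return data_drives + 1      # one drive's worth of parity
    raise ValueError("level not covered in this sketch")


for level in (0, 1, 5, 10):
    print(f"Raid {level}: {drives_needed(4, level)} drives")
```

For four drives of data, mirroring costs eight drives where Raid 5 costs five.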

So, who cares? Probably not the home computer user. If that user is careful about backing up his system (Ha!) to tape, when his system fails in 100,000 hours (11 years) he can just go buy a new drive (actually, by then, probably a whole new system) and restore his drive from his backup tapes (Har! Har!). But the industrial user really needs full 24 X 7 system availability, and cannot have his factory wait around idle for three hours while a backup tape is reloading 8GB of process data. Consider the effect on an airline reservation system that crashes for several hours. The cost can be measured in seven and eight figures.

At WWW, we have been installing our process control systems with dual everything: processors, I/O devices, disks, everything. We will probably continue to do that, but we are starting to buy Raid 5 systems, and already we have seen their benefit. One multiuser Unix system that was recently installed with a three drive Raid 5 system did have a drive go bad after only a few weeks of use. The vendor did not have a spare drive in Spokane, the weekend was coming up, and he asked if he could come out the next Tuesday. The system performed quite normally for the several days before we got our spare drive delivered and installed, even though one drive was totally down.

Available now is the concept of a spinning spare, or hot spare. It is a drive that sits in the system but is not attached to the Raid. When an operating drive does fail, the system itself detaches the bad drive, attaches the hot spare, resynchronizes it, and you are back running probably not even knowing that a problem occurred, except for the E-mail that the Operating System will send you to tell you to replace the busted drive.


