We are all aware of how the cost of computers keeps falling and falling. A major part of all that is
the cost of the memory in our computers.
Memory comes in two forms, the random access memory attached to the processor, and the
secondary memory (in the form of disk drives, generally) that is a peripheral to the processor.
Both forms of memory have gotten larger over the years, and in the memory business, making
something larger reduces your cost per byte, since it costs roughly the same to make a chip, or a
drive, almost regardless of the size of that chip or drive. At any rate, the costs do not go up
anything like linearly.
The first computer I worked on professionally was an IBM 1800, which maxed out at 16K of
memory, and contained four 4K core planes. One of these core planes failed, and I was told that
the cost of a replacement plane was $25,000. This was in about 1968. Core memory was quite
expensive to make, because the memory consisted of little iron doughnuts that were hand strung
into these core planes. My first boss had worked for Control Data Corporation, and even then
they were sending these units overseas where labor was much cheaper for the final fabrication
step. CDC referred to their Asian memory factory as the Hong Kong Core House.
Core memories were slow by today's standards, and very expensive, but the good news was they
could retain their memory even when the power was turned off. A fellow I know at Union
Carbide told me that one of their chemical plants burned down, including the process control
computer room. But they were able to salvage the core planes from the rubble, and even after a
fire and water and everything else, still read out the memory to see what the computers were
doing in the last few seconds before the plant exploded. Actually, they did it to see if they could
get some information to cover their cabooses when the blame for the incident started being
handed out.
Disk drives have been around in one form or another since sometime in the mid 50's. The early
mainframe disk drives were very impressive to behold. When I first joined World Wide
Widgets, one of the mainframe computers we used then was an IBM 7010, and it had a disk drive
that stood about eight feet high, and had a glass front so you could see the drives spinning and
the arms going in and out. When those arms moved, the drive just about danced around the
raised floor, and the whole building shook. This thing probably held all of a few megabytes.
There have been several different disk drive technologies developed over the years, from fixed
head drums, to cartridge disks, to removable pack drives, to finally the technology that we all
more or less use today, the Winchester technology developed by IBM in the mid 70's. This
technology consists of several platters, and the arms and actuators and everything all contained
inside the hermetically sealed drive. When this first came out, everybody thought it was
just another restraint of trade gimmick, because while there were disk pack manufacturers, and
disk drive manufacturers, only IBM at that time did both, and the Winchester drive contains both
inside the device. However, because of the sealing, and the advance of clean rooms, the
manufacturers were able to spin the drive faster (now up to 10,000 rpm), and put the flying heads
closer to the surface, without worrying about head crashes from dust and grime that always
plagued the disk pack technology. The closer the head could be brought to the surface, the
higher the bit density that the drive could contain, and therefore the more memory you
could squeeze in for about the same price as whatever was around the day before. Eventually
other manufacturers developed the technology, and today companies like Conner and Maxtor
have a significant piece of the disk pie.
In 1980, a 20MB drive stood a foot high and 19 inches wide, and cost about $1000/MB. Now in
the mid 90's, a 1GB drive stands two inches high, is 3.5 inches wide, and costs $0.10/MB. I
would suspect that you can find mainframe drives that hold two digit Gigabytes of storage per
spindle, but I also suspect that they would be physically large, and expensive, and either slow or
full of high tech things like multiple actuators and huge cache memories, all of which makes these
things even more expensive.
In 1988, a few grad students from Cal Bizerkly published a paper titled "A Case for Redundant
Arrays of Inexpensive Disks", or Raids. Their seminal thought was that it would be a lot cheaper
to tie together a whole bunch of small PC-like disks, than to continue developing mainframe size
disk drives of the same storage capability. And, you could make it more reliable, and possibly
faster.
What, they asked, is the significant difference between a cheap PC disk and a high performance,
high capacity mainframe drive? One is larger cylinder size, the number of bytes that you can
access without moving the head. The second is improved reliability, which costs lots of dinero.
And finally, generally increased speed because of higher rotation speeds, and more sophisticated
controllers (channels in IBM terminology).
The Raid paper enumerates five different types of disk configurations, numbered 1 through 5, and
the industry has added more since then, including Raid 0 and Raid 10. Three of the five original
configurations were never really developed. A Raid system will give you
impressive reliability, and generally adequate performance, and has the potential of disk sizes
large enough for anything but enterprise wide situations.
Raid 0 gangs together N disk drives (like four or five, maybe more) in such a manner that each
track on each drive is considered an extension of the same track on the previous drive. That is, a
single disk will write all the sectors of track 0, then track 1, and so on. With Raid 0 and four
disks, you write track 0 disk 0, then track 0 disk 1, t0 d2, and t0 d3, before moving to track 1 disk
0. Thus, the head only moves a fourth as much, and with a somewhat more sophisticated
controller with full raid track buffers, all four disk drives can be written to simultaneously.
Studies have shown that it takes only 10% more time to write to all the drives than to a single
large one.
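The track ordering described above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming striping at the track level exactly as in the paragraph; the function name locate_track is made up for the example, and a real controller would typically stripe in blocks of some configurable size rather than whole tracks.

```python
from typing import Tuple


def locate_track(logical_track: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical track number to (disk, physical track) under track striping.

    Hypothetical helper for illustration; not taken from any real controller.
    """
    if num_disks < 1:
        raise ValueError("need at least one disk")
    if logical_track < 0:
        raise ValueError("track numbers start at zero")
    disk = logical_track % num_disks    # rotate across the disks first
    track = logical_track // num_disks  # then step the heads in by one track
    return disk, track


# With four disks, the first six logical tracks land as follows:
layout = [locate_track(n, 4) for n in range(6)]
print(layout)  # [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1)]
```

Note that the physical track number only advances once every fourth logical track, which is the "head only moves a fourth as much" effect.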
The downside of this is reliability. You are using cheap drives here, and even though the Mean
Time Between Failure is quite high, say 100,000 hours, it is nothing compared to a mainframe
drive. And if you have N drives in your Raid, the MTBF is 1/N of any one disk, or only 25,000
hours for our four drive example. More drives in your Raid results in fewer hours MTBF.
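The arithmetic is simple enough to check for yourself. The sketch below assumes that drives fail independently at a constant rate and that losing any one drive loses the whole Raid 0 array, which is the simplification behind the 1/N rule; the function name array_mtbf is just for this example.

```python
HOURS_PER_YEAR = 24 * 365


def array_mtbf(drive_mtbf_hours: float, num_drives: int) -> float:
    """MTBF of an unprotected array, assuming independent, constant-rate failures."""
    if num_drives < 1:
        raise ValueError("need at least one drive")
    return drive_mtbf_hours / num_drives


for n in (1, 4, 8):
    hours = array_mtbf(100_000, n)
    print(f"{n} drives: {hours:,.0f} hours, about {hours / HOURS_PER_YEAR:.1f} years")
```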
Raid 1 is a mirrored backup. Each disk is mirrored by its controller to another identical disk.
This does cost you in that you need twice as many drives, but then again, they are all cheap. The
technology today is such that if a drive fails, the system continues on with the non failed drive,
and if you have a hot swap drive capability, you simply pull out the failed drive, plug in a spare
one, and the system automatically resynchronizes the new disk, and then puts the mirrored
system back fully online.
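As a toy model of that sequence (write to both, carry on with one, hot swap, resynchronize), consider the sketch below. Everything in it is hypothetical: each "disk" is just a Python dictionary, and the class and method names are invented for the illustration rather than taken from any real controller.

```python
class MirroredPair:
    """Toy two-disk mirror; each 'disk' is a dict of block number -> data."""

    def __init__(self):
        self.disks = [{}, {}]
        self.online = [True, True]

    def write(self, block, data):
        # Every write goes to each disk that is still working.
        for i, disk in enumerate(self.disks):
            if self.online[i]:
                disk[block] = data

    def read(self, block):
        # Any working disk can satisfy a read.
        for i, disk in enumerate(self.disks):
            if self.online[i]:
                return disk[block]
        raise IOError("both disks have failed")

    def fail(self, index):
        self.online[index] = False
        self.disks[index] = {}

    def hot_swap(self, index):
        # Plug in a blank spare and copy everything over from the survivor.
        other = 1 - index
        if not self.online[other]:
            raise IOError("no surviving disk to resynchronize from")
        self.disks[index] = dict(self.disks[other])
        self.online[index] = True


pair = MirroredPair()
pair.write(0, b"recipe")
pair.fail(0)                  # disk 0 dies
pair.write(1, b"batch log")   # the system keeps running on disk 1
print(pair.read(0))           # b'recipe'
pair.hot_swap(0)              # new drive in, resynchronized
print(pair.disks[0] == pair.disks[1])  # True
```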
Raid 10 is rather like Raid 1, except rather than mirror individual disks, you have two full Raid 0
configurations, each mirroring the other. This way you get the performance of Raid 0, with a
much higher reliability.
Raid 2, 3 and 4 turned out to be academic exercises. Raid 5 is the most exciting. This
configuration requires one extra disk, so if you started with a 4 drive Raid 0 system, under Raid 5
you would require five drives. One of the disks is a parity disk, and is built such that if any disk
fails, all the data can be recomputed from the remaining disks plus the parity disk. The first
semiconductor memory came out this way, using error correction bits. This just extends the
concept to a disk. Actually, Raid 5 is slightly more complicated than my above statement,
because in fact the parity blocks are spread across all the available drives rather than kept on one disk, but I
think you get the general idea.
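For the curious, the recomputation is usually done with exclusive-or (XOR) parity. The sketch below assumes that scheme and, to keep things short, uses one four-byte block per disk with a single parity block; the block contents and the helper name xor_blocks are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data disks, one block each (contents are arbitrary sample data).
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Pretend disk 2 has failed: rebuild its block from the survivors plus parity.
lost = 2
survivors = [block for i, block in enumerate(data) if i != lost]
rebuilt = xor_blocks(survivors + [parity])
print(rebuilt == data[lost])  # True
```

The same trick works no matter which single disk is lost, because XOR-ing a value in twice cancels it out, leaving only the missing block.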
Raid 5 gives you the advantages of Raid 0, at least for reading, with the reliability of Raid 1, but
without the need for all the extra disk drives. However, Raid 5 slows down considerably when
you are writing, and really slows down if you are running with one busted drive. It can also get
quite expensive, because the controller for a Raid 5 configuration is quite expensive, compared to
the simple controllers needed by Raid 0 and 1 systems.
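One common explanation for the write slowdown is that changing a single block also means updating the parity, so the controller reads the old data and old parity, then writes the new data and new parity: four disk operations where a plain disk needs one. The sketch below assumes XOR parity and invented sample data, and simply checks that the shortcut gives the same parity as recomputing it from the whole stripe.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks):
    """Parity computed the long way, from every data block in the stripe."""
    return reduce(xor_bytes, blocks)


def small_write_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Parity after one block changes: remove the old data, fold in the new."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0", b"\xaa\x55"]
parity = full_parity(stripe)

# Overwrite block 1: two reads (old data, old parity), two writes (new data, new parity).
new_block = b"\x7e\x7e"
shortcut = small_write_parity(parity, stripe[1], new_block)
stripe[1] = new_block
print(shortcut == full_parity(stripe))  # True
```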
So, who cares? Probably not the home computer user. If that user is careful about backing up
his system (Ha!) to tape, when his system fails in 100,000 hours (11 years) he can just go buy a
new drive (actually, by then, probably a whole new system) and restore his drive from his backup
tapes (Har! Har!). But the industrial user really needs full 24 X 7 system availability, and cannot
have his factory wait around idle for three hours while a backup tape is reloading 8GB of process
data. Consider the effect on an airline reservation system that crashes for several hours. The cost
can be measured in seven and eight figures.
At WWW, we have been installing our process control systems with dual everything: processors,
I/O devices, disks, everything. We will probably continue to do that, but we are starting to buy
Raid 5 systems, and already we have seen their benefit. One multiuser Unix system that was
recently installed with a three drive Raid 5 system did have a drive go bad after only a few weeks
of use. The vendor did not have a spare drive in Spokane, the weekend was coming up, and
he asked if he could come out the next Tuesday. The system performed quite normally for the
several days before we got our spare drive delivered and installed, even though one drive was
totally down.
Available now is the concept of a spinning spare. It is a drive that is spinning but not yet attached to the array. When an
operating drive does fail, the system itself detaches the bad drive, attaches the hot spare,
resynchronizes it, and you are back running probably not even knowing that a problem occurred,
except for the E-mail that the Operating System will send you to tell you to replace the busted
drive.