Rolling your own Worms

April, 1997



Disk drives that are installed inside a computer system have increased their capacity at a fantastic rate. For a couple of hundred dollars now, you can get a couple of gigabytes of storage in a package only three inches wide. Removable disk drives have not kept pace at all. The drives are expensive, the media is expensive, the capacity is small, and everything is incompatible with everything else, even within the same vendor.

The original disk drives in the computer industry were non removable, and were called drums because there was a head for every track. I am aware of one computer line, the LGP series, that did not have any random access memory as we know it today, but used only drum memory. Memory cycle times on that machine were measured in dozens of milliseconds. While I never programmed one of these, I did use one once, and was quite impressed at just how slow a computer could be. Just think if NT had existed back then...

About the time I got into the computer business in the mid 60's, removable disk drives were becoming commonly available, starting with the six platter IBM 1311 (2MB, 250ms access time) and the single platter 2315 drive that came with the IBM 1130 system (1MB, 100ms access time). These disks were 14 inches in diameter. They were very prone to head crashes.

In the early 70's, IBM came out with the Winchester line of disk drives that were totally sealed. This allowed the manufacturer to be sure that no harmful particulate matter (hairballs, smoke particles) would be found inside the drive. Once that was assured, the heads could be moved closer to the drive platters, the platters could spin faster, the heads could be made lighter, and things improved at almost chain reaction speed. In the last 20 years, the bit density that can be recorded on magnetic disk drives has increased by three orders of magnitude (to about 1 million bits per square millimeter).

That left the removable drives out in the cold. For maybe 10 years, with the exception of the Iomega drives, which were sort of like floppies, and of course the floppy disks themselves, about all you could find anywhere as main storage was the internal sealed hard drive. Floppies were good enough for the first few years of the PC revolution, but now they are woefully inadequate for many tasks because their capacity is so small. It is not difficult to generate a full color high definition graphic that will be well over 2MB in size, much larger than will fit on a common floppy.

In the mid 80's a technology was developed that, 10 years later, has the potential to fix all of these problems. And that is the Compact Disc. The CD was developed originally as a replacement for 33rpm audio discs. An interesting factoid was that one of the design requirements imposed by Akio Morita, the Chairman of the Sony Corporation when they were developing this technology, was that Beethoven's 9th symphony must fit on one disc, which meant it had to have 74 minutes of playing time. (Note that these audio guys cannot spell 'disK' properly for some reason.)

Over time (I don't know, maybe from the beginning), the computer guys decided that this media would be great for holding digital information for computers. After all, the audio information itself was recorded digitally on these disks, and bits is bits. And so it comes to this day, when just about every personal computer sold contains a CD_ROM drive, just as all PCs sold 10 years ago contained a floppy drive. We know these things best as containers for multimedia compositions (eg Encarta), databases (eg Delorme's maps) and more commonly, as a medium for program installation. And, they do audio too!

The problem with this is that it is Read Only (hence the name, CD Read Only Memory). This satisfies those needs that I alluded to just above. But the device could be much more useful if only the end user himself could also write to this medium. And, for a price, and a few restrictions, you can.

You can today buy CD Drives that can write, and you can today, even here in River City, buy the Write Once Read Many (WORM) media. They ain't cheap, but they are getting less dear. The drives can now be found in the $500 range, and the media (disc with a 'c') costs about $10. There are also erasable versions starting to be seen, but these are not yet as common, and are much more expensive.

The CD Recordables, or CD_R devices, can record both computer data tracks and audio tracks. These tracks have slightly different characteristics, in that the packet (sector) size is 2048 bytes for the former, and 2352 bytes (not a binary number, you observe!) for the audio tracks. These tracks, up to 99 of them, are grouped together into something called sessions, recalling the audio roots of the technology. There can be several sessions on a single disc. I think that this session concept is what allows the same disc to hold a program for both the Intel and Mac platforms. There is probably some sort of directory at the beginning of the disc that describes the characteristics of the various sessions, and something there differentiates the PC format from the Mac. All this detailed technical stuff is described in a series of standards, commonly labeled Yellow, Orange, and Red Books, that are produced by the industry group that makes these devices. These standards allow for compatibility at different levels, Red being the most strict and mostly devoted to audio.
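For the curious, the sector sizes and the 74 minute playing time can be turned into a rough capacity figure. The sketch below is only back-of-the-envelope arithmetic. The 2048 byte data sector and the 74 minutes come from this article; the 75 sectors per second and the 44.1 kHz, 16 bit, stereo audio parameters are assumptions on my part, taken from the commonly published CD figures, and the function names are made up for illustration.

```python
# Rough CD capacity arithmetic (a sketch, not a statement of the standards).
SECTORS_PER_SECOND = 75        # assumed: commonly cited CD addressing rate
DATA_BYTES_PER_SECTOR = 2048   # from the article: data track sector size
SAMPLE_RATE_HZ = 44_100        # assumed: CD audio sample rate
CHANNELS = 2                   # assumed: stereo
BYTES_PER_SAMPLE = 2           # assumed: 16 bit samples


def total_sectors(minutes):
    """Number of sectors on a disc with the given playing time."""
    return minutes * 60 * SECTORS_PER_SECOND


def data_capacity_bytes(minutes):
    """User data bytes if the whole disc were one data track."""
    return total_sectors(minutes) * DATA_BYTES_PER_SECTOR


def audio_bytes_per_sector():
    """Audio bytes that must fit in one sector at the assumed sample rate."""
    bytes_per_second = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE
    return bytes_per_second // SECTORS_PER_SECOND


if __name__ == "__main__":
    print(total_sectors(74))                  # sectors on a 74 minute disc
    print(data_capacity_bytes(74) / 2**20)    # raw data capacity in MB
    print(audio_bytes_per_sector())           # audio sector size in bytes
```

Under these assumptions the raw data figure for a 74 minute disc comes out near 650MB, before any file system or session overhead is subtracted.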

A CD will hold about 620MB of useful data, with another 40 or so MB of overhead junk, which is big enough to hold even a modern word processing suite and a couple of clip art libraries. The data sessions follow two different formats, something called ISO 9660 for the PC computers, and HFS for the Mac. The ISO standard is rather restrictive as to directory structures (no more than eight levels) and what characters can be in a file name. Unlike the data disK, which generally divides things into cylinders, tracks, and sectors, the data on a disC is a continuous spiral, and the position of data is measured in seconds within a track or session. For this and other reasons, when you write to a CD_R, you must first gather all your data on some local storage medium, possibly format it and reorganize it, and then, once you start writing, continuously write to the CD_R until all the data is on the disc. That is, you cannot write a file here, and then some time later add another file there, at least on the same track. Thus, to write to a CD_R drive, you need a suite of specialized programs to handle the formatting, organizing, and actual writing of the disc. What I am saying here is that a DOS Copy command will not work. The good news is, your drive manufacturer will generally include some form of these programs with the drive, and more advanced programs are available from third party companies like Corel.
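To give a feel for the ISO 9660 restrictions mentioned above, here is a minimal sketch of the kind of check a premastering program might run before writing. The eight level limit comes from this article. The rest is assumed: names of up to eight characters plus an optional three character extension, using only capital letters, digits, and the underscore, which is my simplified reading of the strictest interchange level, and the root directory counted as level one. The function name iso9660_problems is hypothetical, not part of any real CD_R package.

```python
import re

MAX_DEPTH = 8  # from the article: no more than eight directory levels

# Assumed, simplified naming rules: A-Z, 0-9 and underscore only,
# 8 characters for the name and an optional 3 character extension.
FILE_RE = re.compile(r"[A-Z0-9_]{1,8}(\.[A-Z0-9_]{1,3})?")
DIR_RE = re.compile(r"[A-Z0-9_]{1,8}")


def iso9660_problems(path):
    """Return a list of complaints about a '/'-separated path (empty if OK)."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        return ["empty path"]
    *dirs, filename = parts
    problems = []
    depth = len(dirs) + 1  # assumption: the root directory counts as level 1
    if depth > MAX_DEPTH:
        problems.append(f"directory depth {depth} exceeds {MAX_DEPTH}")
    for d in dirs:
        if not DIR_RE.fullmatch(d):
            problems.append(f"bad directory name: {d}")
    if not FILE_RE.fullmatch(filename):
        problems.append(f"bad file name: {filename}")
    return problems


if __name__ == "__main__":
    for candidate in ["DOCS/README.TXT", "docs/readme.txt", "LONGFILENAME.TXT"]:
        print(candidate, iso9660_problems(candidate))
```

A real writing program would more likely rename offending files than simply reject them, but the sketch shows why a plain copy of a deeply nested, long-named directory tree may not go onto the disc unchanged.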

A session is composed of one or more tracks. These sessions can contain different kinds of data, audio on one track, and picture images on another, for instance. A technique named Multisession allows you to virtually add new files to the disc, or delete existing files and directories.

Consider the use of a CD_R for a backup operation. On week 1, you copy your hard drive to a session of your CD_R, taking maybe 25% of the total space. On week 2, you want to copy those files that have changed since the last time. You can do this by generating another session on that disc. Each session generates a Table of Contents, essentially a list of pointers to where the data files are. Each subsequent session generates a new Table of Contents, which can contain the complete directory structure of your local disk, but with pointers to areas of several different sessions. So when it comes time to read the disc, the software is smart enough to read the Last Table of Contents on the disc, and from that find all the files that it needs. Where two or more identically named files exist, only the most recent one will be pointed to. The really good news is, the same program that writes this data can also read the intermediate Tables of Contents, so if you did need to get access to an earlier version of a file, the data is still there and can be retrieved.
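The backup scheme just described can be modeled in a few lines. This is a toy sketch only: it assumes each session can be represented as a dictionary of files plus a Table of Contents that maps every file name to the session holding its newest copy. The names add_session and read_file are hypothetical, and nothing here reflects the actual on-disc layout.

```python
def add_session(sessions, changed_files):
    """Append a session whose Table of Contents carries older entries forward."""
    session_no = len(sessions) + 1
    # Start from the previous session's table, if there is one.
    toc = dict(sessions[-1]["toc"]) if sessions else {}
    for name in changed_files:
        toc[name] = session_no  # the most recent copy wins
    sessions.append({"files": dict(changed_files), "toc": toc})
    return sessions


def read_file(sessions, name, as_of=None):
    """Read a file through the last table, or an earlier one if as_of is given."""
    toc_owner = sessions[(as_of or len(sessions)) - 1]
    session_no = toc_owner["toc"][name]
    return sessions[session_no - 1]["files"][name]


if __name__ == "__main__":
    disc = []
    add_session(disc, {"REPORT.DOC": "week 1 text", "BUDGET.XLS": "week 1 numbers"})
    add_session(disc, {"REPORT.DOC": "week 2 text"})
    print(read_file(disc, "REPORT.DOC"))           # newest copy, from session 2
    print(read_file(disc, "BUDGET.XLS"))           # unchanged, still in session 1
    print(read_file(disc, "REPORT.DOC", as_of=1))  # older copy via the first table
```

Nothing is ever erased in this model. An older version only becomes invisible to a reader that looks at the last table, which matches the write once nature of the media.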

And now, to make matters even more interesting, we are starting to see DVD drives. Although the DVD was designed for video applications, it will also work for data storage. It holds about 7GB now, and will eventually hold about 15GB (when it goes double sided). The better news is, the media is dimensionally the same, so the DVD drive can read the older media of the CD_ROM variety. It is likely that these drives will be standard in high end systems this year, and in all systems in a couple of years when the price of a DVD player is about the same as today's CD_ROM player.

We do have a CD_R drive at World Wide Widgets. I decided to put some web page information I am developing onto this media so that I can take it around for show and tell without having to access our intranet. We downloaded the information to the local disk of the computer that had the CD_R drive attached, did all the formatting, inserted a fresh CD_R disc that I bought locally in Spokane, and started the program going. In about a minute we got a horrible error message. We checked the manual, and there are over fifty errors defined for things that can go wrong during your recording session, and we somehow managed to find one of them. Lucky us, there was no definition or explanation of what to do about the cryptic problem. So, while I have talked at length about the process in this article, I have not yet demonstrated its effectiveness. And at $10 a pop for a trial (the disc was useless after the error was reported), doing a lot of trials is not in my near term budget. My network administrator assures me that the technology does work, and that maybe by the time this article is published, we will actually have generated one.

So, why would we use this device? Backup comes immediately to mind. Our particular device is used with a scanner to put into electronic form rooms of file cabinets of pre computer correspondence, engineering drawings, documentation, and stuff that went into the actual design of the WWW plant facilities several decades ago. My computer jock group is exploring the idea of delivering the software that we write on this media, rather than mag tape. And I still have the idea of putting some of our intranet content onto one of these discs, so that the data can be used in some location that does not have access to our intranet.


