Saving Unix one kernel at a time


In this, its 40th year of operating system life, some Unix stalwarts are trying to resurrect its past. That is, they are taking on the unenviable job of restoring old Unix software artifacts, such as early Unix kernels, compilers and other historically important pieces of source code, to their former glory.

In a paper to be presented at next week's Usenix show, Warren Toomey of the Bond School of IT is expected to detail restoration work being done on four key Unix software artifacts, all from the early 1970s: Nsys, the 1st Edition Unix kernel, 1st and 2nd Edition binaries, and early C compilers.

In his paper, Toomey states that while the history of Unix has been well documented, there was a time when the actual artifacts of early Unix development were in danger of being lost forever. But in recent years groups such as The Unix Heritage Society and its offshoot, the PDP Unix Preservation Society, have made great strides in preserving and resurrecting old systems. The Computer History Museum has also helped in keeping old Unix systems humming.

As you might imagine, restoring a software artifact to working order brings with it a wealth of difficulties, Toomey states: documentation is missing or incomplete; source code is missing, leaving only the binary executables; or, conversely, the source exists but the compilation tools needed to reconstruct the executables are missing. Restoring an operating system to working order presents its own issues, he adds, as the system needs a suitable hardware environment in which to run, a suitable file system, and a set of system executables to initialize the system and make it useful.

With that as background, a few very small excerpts from Toomey's Unix system restorations paper follow:

The UNIX Archive:

In 1995 the Unix Heritage Society (TUHS) was founded with a charter to preserve, maintain and restore historical and non-mainstream Unix systems. TUHS has been successful in unearthing artifacts from many important historical Unix systems; this includes system & application source code, system & application executables, user manuals & documentation, and images of populated file systems. The proliferation of Unix variants and the longevity of minicomputer systems such as the VAX and the PDP-11 made TUHS' task of collecting old Unix systems and their documentation relatively straightforward.

For a while, it seemed that the archaeology of Unix stopped somewhere around 1974. The source code and binaries for 5th Edition Unix existed, but not the files for the manuals; conversely, only the 4th Edition Unix manuals existed, but not the source code nor any binaries for the system. At the time, Dennis Ritchie told us that there was very little material from before 4th Edition, just some snippets of code listings. Then, around the mid-90s, Paul Vixie and Keith Bostic "unearthed a DEC tape drive and made it work", and were able to read a number of DEC tapes which had been found "under the floor of the computer room" at Bell Labs. These tapes would turn out to contain a bounty of early Unix artifacts.

The Nsys Kernel: 1973

So far as I can determine, this is the earliest version of Unix that currently exists in machine-readable form. ... What is here is just the source of the OS itself, written in the pre-K&R dialect of C. ... It is intended only for PDP-11/45, and has setup and memory-handling code that will not work on other models. I'm not sure how much work it would take to get this system to boot. Even compiling it might be a bit of a challenge. ...

1st and 2nd Edition Binaries: 1972

Having a set of early Unix executables is nice, but having them execute is much nicer. There were already a number of PDP-11 emulators available to run executables, but there was a significant catch: with no 1st or 2nd Edition Unix kernel, the executables would run up to their first system call, and then "fall off the edge of the world" and crash. Fortunately, there was a solution.

Early C Compilers: 1972

Aside from their small size, perhaps the most striking thing about these programs is their primitive construction, particularly the many constants strewn throughout. With a lot of handwork, there is probably enough material to construct a working version of the last1120c compiler, where "works" means "turns source into PDP-11 assembler". But there was a "chicken and egg" problem here: both compilers are in such a primitive dialect of C that no extant working compilers would be able to parse their source code.

1st Edition Unix kernel

The idea of restoring the listing of the 1st Edition kernel to working order seemed impossible: there was no file system on which to store the files, no suitable assembler, no bootstrap code, and no certainty that the user mode binaries on the tapes were compatible with the kernel in the listing; for a while the listing was set aside. Then early in 2008 new enthusiasm for the project was found, and a team of people began the restoration work.

Toomey notes a number of restoration gotchas at the end of his paper but none may be as prescient as this: "Never underestimate the 'packrat' nature of computer enthusiasts. Artifacts that appear to be lost are often safely tucked away in a box in someone's basement. The art is to find the individual who has that box."

