I’m going to be expanding on something I’ve talked about before. This idea of Unix’s supposed simplicity, and how Unix has deviated from it over the years, rather fascinates me.
Some years ago I remember reading The Art of Unix Programming by the vociferous Eric Raymond. I remember this book making a strong impact on how I thought about system design and the writing of new programs. TAoUP is not, by itself, a revolutionary book. Rather, it is a collection of received wisdom regarding the design of the Unix operating system and of programs intended to be run in the Unix environment. I think that the most important idea put forward in the book is the notion that Unix, rather than simply being a platform on which to run large, complicated programs, is a collection of smaller programs, unified by a few metaphors. Specifically, the notion that ‘everything is a file’ and the pipe metaphor built on top of it are the glue which holds Unix together. Unix is a collection of small programs, each of which does one thing well, and these programs can be combined, using pipes and shell scripting, to create far more complicated and functional systems. Having small, simple programs makes them easier to debug and get right, while being able to compose them gives application developers a toolkit denied to developers on other systems.
This mode of development is a fundamentally sound idea, I think. There are a few drawbacks, such as the incidental complexity of having so many little tools to work with, as well as the fact that each one used becomes yet another dependency to manage in your application.1 Generally, though, composing larger programs out of simpler parts is a fundamental principle of software development. Arguably, the majority of advancement in programming language design has been in finding new and better ways of doing just this. First we introduced the subroutine, then structured programming gave us the procedure, OOP came up with the object, and now we’re talking about using functions. Each step of the way, we ended up with improvements to our ability to compose programs out of smaller parts and to reuse old code. Concepts like polymorphism and code reuse are core to programming in general.
So, having the ability to reuse things like `grep` and `sed`, in conjunction with whatever small applications you write, using pipes and FIFOs and whatnot, is an obvious Good Thing™. Yet, as powerful as the abstractions and metaphors which Unix provides are, they aren’t nearly as powerful as the abstractions provided by ‘actual’ programming languages. Furthermore, as I mentioned above, with constant innovation in programming language design, the gap is widening.
I remember at one point attempting to work with the Scheme Shell (Scsh). I was interested in Lisp at the time and Scsh seemed like a neat idea. I ended up finding it a little impractical for my uses (I was looking for an interactive shell, and Scsh is definitely not that), but reading Olin Shivers’s whitepaper on his shell, A Scheme Shell, was illuminating, this quote especially:
The really compelling advantage of shell languages over other programming languages is the first one mentioned above. Shells provide a powerful notation for connecting processes and files together. In this respect, shell languages are extremely well-adapted to the general paradigm of the Unix operating system. In Unix, the fundamental computational agents are programs, running as processes in individual address spaces. These agents cooperate and communicate among themselves to solve a problem by communicating over directed byte streams called pipes. Viewed at this level, Unix is a data-flow architecture. From this perspective, the shell serves a critical role as the language designed to assemble the individual computational agents to solve a particular task.
As a programming language, this interprocess “glue” aspect of the shell is its key desirable feature. This leads us to a fairly obvious idea: instead of adding weak programming features to a Unix process-control language, why not add process invocation features to a strong programming language?
The key point that Prof. Shivers is getting at here is that shell scripts are programs whose fundamental units of computation are small programs rather than functions or procedures. On the one hand, shell scripting is hugely powerful, because the ability to compose applications unifies your entire system rather than just the components in your favorite language and its available libraries. On the other hand, shell scripting is fundamentally flawed, because the abstractions generally available in shell scripts are inferior to those in proper programming languages. Prof. Shivers attempted to unify these worlds by writing Scsh, which adds Unix shell features to a proper programming language, Scheme.
Ultimately, although Scsh gets some use, it hasn’t actually supplanted the Bourne Shell.2 That’s not to say, however, that shell scripting hasn’t been supplanted. Outside of a few traditional uses such as init scripts,3 most of what was once done with Bourne or Bash now tends to be done with something like Perl or one of its spiritual successors, such as Ruby or Python. Ultimately, Olin Shivers’s plan to “instead of adding weak programming features to a Unix process-control language, … add process invocation features to a strong programming language” worked. It’s just that Perl beat him to the punch. Though I think that there is more to Perl’s success than just that.
Those of you who know your history4 remember that Perl started life as an improved version of Awk. Its programmatic underpinnings were not that strong, at least at first, though they did improve over time. The thing that really pushed Perl over the edge, though, wasn’t the introduction of objects or references, but the introduction of modules. You see, the real problem with shell scripting isn’t the inadequacy of the Bash equivalent of a `for` loop, but the limitations of using pipes and other Unix forms of IO redirection to compose programs. Perl modules solved this problem.
To illustrate what I mean, try this thought experiment: How does one interact with a relational database from a script using only the normal Unix metaphors? There were a number of approaches, depending on the tools your database happened to provide, but commonly you might write a script with Expect, a DSL for driving shells and other terminal applications. You’d start up your database’s interactive shell, and your script would talk to it by watching for strings corresponding to prompts and output to appear. If this sounds error prone, it was, and so were the other options, such as sending SQL strings directly and parsing the output manually with a combination of Awk and Sed. If you’ve done any programming in a modern programming language, however, you’ll note that working with databases absolutely does not work this way anymore. Instead, your language likely has a pluggable API. Perl in particular has DBI, which defines a stable API against which a programmer can write Perl scripts. The database can be interacted with at the level of a programming language rather than at the level of a text stream, and this is a very good thing. A well-defined and stable binary API nearly always beats a poorly defined text API.5
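To make the contrast concrete, here is a minimal sketch of the DBI approach. The SQLite data source and the `users` table are hypothetical stand-ins; any DBD driver plugs into the same interface.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical data source; swap in dbi:Pg, dbi:mysql, etc. as needed.
my $dbh = DBI->connect('dbi:SQLite:dbname=example.db', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

# Placeholders keep the SQL and the data separate: no prompt-scraping,
# no quoting games with Awk and Sed.
my $sth = $dbh->prepare('SELECT name, email FROM users WHERE active = ?');
$sth->execute(1);

while (my ($name, $email) = $sth->fetchrow_array) {
    print "$name <$email>\n";
}

$dbh->disconnect;
```

The point isn’t the particular queries; it’s that errors, types, and result rows arrive as language-level values instead of text to be scraped off a prompt.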
And that’s what Perl provides that makes it so popular. In fact, it’s famous for this. CPAN allows one to spend one’s time composing modules rather than applications, without in turn cutting one off from those applications when they’re needed. Perl works fine as a tool for composing applications, and it actually integrates that use into its syntax better than nearly any other programming language, but the fact that later scripting languages de-emphasize this feature suggests to me that the computing community in general has implicitly accepted my premise: Composing modules with procedure calls is superior to composing programs with IO redirection. The adoption of Perl signified a huge shift away from the default metaphors of the Unix environment, and later scripting languages have continued that shift. Those metaphors were fine for a while, even revolutionary, but they eventually proved inadequate. Attempts to improve upon them, such as Plan 9 from Bell Labs and its successors, ultimately failed to gain traction, but the use of Perl, layered on top of those Unix abstractions, won out in the end.
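For a sense of what that syntactic integration looks like, here is a rough sketch (the log file and pattern are invented for illustration) of Perl treating external programs as just another value or filehandle:

```perl
use strict;
use warnings;

# Backticks capture a program's output as ordinary Perl data...
my @subdirs = grep { /^d/ } `ls -l`;
print scalar(@subdirs), " directories here\n";

# ...and open() will happily hand you a pipe from another process,
# so the rest of the script is plain Perl rather than Awk and Sed.
open my $log, '-|', 'grep', 'ERROR', '/var/log/app.log'
    or die "cannot run grep: $!";
while (my $line = <$log>) {
    print $line;
}
close $log;
```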
That said, there is one significant disadvantage to this approach: there is a lot of duplicated effort in the modules of different programming languages and libraries. Some of this is unavoidable, since one often wants an API adapted to the semantics of one’s chosen programming language, and some of it is illusory, seeing as many modules are really just FFI bindings to a lower-level C library, but much of it is real. As a result, we’ve lost one of the chief advantages of the old Unix way: genuine polyglot programming. Pipes don’t care in what language your application is written. This, of course, leads to an otherwise unnecessary separation of effort, as well as an increase in the complexity of a complete system in terms of the number of moving parts. It’s a small cost, I think, given the gain, but an unnecessary one if we could somehow make a number of major changes to Unix, standardizing the inputs and outputs of programs. I think I’ll ruminate on that more at a later date.
- This is somewhat mitigated by the fact that most of these tools should be available in your base system, of course. ↩
- Or Bash, really, these days. ↩
- But look at systemd and its equivalents! ↩
- Or were active programmers at the time; I don’t judge. ↩
- The one real advantage of a text API is that when it’s poorly defined, you can use standard tools such as a text editor and `telnet` to inspect it. Reverse engineering a binary API is more difficult. But make no mistake: if you have to read the text output of a program to figure out how to make a script which uses it, you are reverse engineering the API. You have no guarantee that your visual inspection of the program has resulted in a correct understanding of how it works. ↩