Eric Raymond’s speech: Linucon 2004: The Basics of the Unix Philosophy

At Linucon 2004 Eric Raymond gave a speech on the basic principles of the Unix philosophy. They are the same principles described in the “Basics of the Unix Philosophy” chapter of Raymond’s book “The Art of Unix Programming”. Since the book is available online, I put a link to each rule so you can compare what’s said in the book with what was said in the speech. He dwelled a little longer on each rule and gave more examples than he does in that chapter of the book. At the end of the speech he answered questions, some more and some less related to the Unix philosophy. He also ranted on XML (after admitting to not having an opinion about it) and expressed his opinion on Hurd.

Linucon was a joint science fiction and Linux convention. It took place in October 2004 in Austin, TX. The most prominent guests were Eric Raymond and Jay Maynard, a.k.a. The Tron Guy.

Rule of Composition: Design programs to be connected with other programs.

ESR. [You must follow] certain very general guidelines about how your programs emit output and accept input, so that in principle it’s possible to hook them up to other programs.

Here’s a good example. Let’s suppose you write a program whose job is to emit a table of some sort. Doesn’t matter what kind of table it is or what the semantics of the data are. The UNIX way says that the default for that program, or at least an option, should be to emit the raw data of the table without explanatory headers. And the reason you emit without explanatory headers is that while those explanatory headers may be very useful to a human being in reminding them of the semantics of the data, a program downstream is gonna have to throw them away, and in order to throw them away it would have to be able to parse arbitrary text. So if you want your table-emitting program to be easily composable with other programs, what you do is you make its default operation, or at least some option, a mode in which it just shoots out the raw data in a regularly parseable form and doesn’t add any noise to it. That’s the kind of guideline I’m talking about. You’ll read a lot more about it [in Chapter 7 of “Art of Unix Programming”?]
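
(A minimal sketch of that table-emitting idea, mine and not ESR’s; the column names and the --headers flag are invented for the example. The point is only that raw, regularly parseable output is the default and the human-friendly decoration is optional. — E.)

    #!/usr/bin/env python
    # report.py -- emits a table; raw, header-less output is the default,
    # so the program composes with sort, cut, awk and friends.
    import sys

    ROWS = [("alice", 42), ("bob", 17)]        # stand-in data for the example

    def main():
        if "--headers" in sys.argv[1:]:
            print("user\tcount")               # decoration for humans only
        for user, count in ROWS:
            print("%s\t%d" % (user, count))    # raw tab-separated data

    if __name__ == "__main__":
        main()

(A downstream program can then run something like "python report.py | sort -k2 -n" without having to strip any prose first. — E.)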

So, design programs to interact with other programs.

This rule of UNIX tradition has led to a widespread misconception that UNIX programmers hate GUIs. We don’t hate GUIs! GUIs are fine when you are talking to human beings. The problem is, GUIs suck for talking to other programs. It’s very hard to parse a pixel change in a GUI display.

An audience member mentions screenscrapers.

ESR. Oh, screenscrapers are horrible! Horrible! You don’t want to scrape screens. It’s particularly difficult to scrape screens when they are bitmapped. You don’t want other people to have to do that. And that’s why Unix programmers have a reputation of being averse to GUIs. That’s one reason, I’ll get to another reason shortly.

It’s very difficult to make a composable GUI. That’s why you’ll see [Unix programs outputting text streams]

Rule of Separation. Separate policy from mechanism, separate interfaces from engines.

When you look at how programs evolve under pressure from changing user requirements, interfaces tend to change more, and more rapidly, than engines do. The way people want buttons arranged, the output formats they want, the input formats they want, tends to change more rapidly than core algorithms like, oh, tree traversal, or pulling something out of a relational database.

Because interfaces and engines change at different rates, if you write a monolith that welds the interface to the engine, what’s gonna happen is that over time there’s gonna be demand for changes in the interface, and when changing the interface, you’re either gonna break the engine, or you’re gonna throw it away entirely and rebuild the whole thing. Whereas if you do the UNIX thing, which is to have your program be two cooperating pieces, one of which is the engine, with a brutally simple, often textual protocol, and the other of which is the interface, which may be a GUI that’s just a shell around that engine, and takes mouse gestures and button clicks, and translates them into simple commands back to the engine, and the engine spits back output that the GUI shell interprets — if you do that, all of a sudden it’s possible for your GUI to change independently of your engine, and now they can evolve at different rates, and that’s OK. You’ll end up having to throw away less code.

So that’s a very powerful pragmatic reason to separate interfaces from engines.
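
(Another illustration from me, not from the speech: a toy engine speaking a one-line textual protocol on stdin and stdout. The verbs ADD and GET are invented; the point is that a GUI, a test script, or a human at a terminal can all drive the same engine over a pipe. — E.)

    # engine.py -- a toy engine that speaks a one-line textual protocol,
    # so any interface (GUI, curses, test harness) can drive it over a pipe.
    import sys

    counter = 0    # the entire "engine state" of this toy

    def handle(line):
        global counter
        verb, _, arg = line.partition(" ")
        if verb == "ADD":
            counter += int(arg)
            return "OK %d" % counter
        if verb == "GET":
            return "OK %d" % counter
        return "ERR unknown verb %r" % verb

    for line in sys.stdin:
        print(handle(line.strip()))
        sys.stdout.flush()    # keep the interface on the other end responsive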

[Photo: Eric Raymond gives the speech "The Basics of the Unix Philosophy" at Linucon 2004, Austin, TX.]

And for a parallel distinction, which is separating policy from mechanism: mechanism is how you do things, policy is what you choose to do. Just as interfaces tend to change faster than engines, policy tends to change faster than mechanism. Usually, policy is associated with interface, and mechanism is associated with engine. The correspondence is not perfect, but there is a tendency to correlate that way.

In the UNIX world, probably the biggest and most instructive example of this is the way we handle interactive graphics. Most other operating systems have monolithic GUIs, in which there is a baked-in interface style that’s welded to an engine that goes all the way down to pushing bits around on the screen. Any time you want to change interface policy, it has ripples that tend to propagate through the entire system and break things. Conversely, changing the engine is also very difficult without changing the visible appearance of the system.

On the other hand, in UNIX land, what we have is a couple of distinct players. We have X-Windows, which is basically just a mechanism for pushing bits around the screen, and we have toolkits which impose a particular set of policy choices, and user appearance, and feel, and look, on X; and then on top of those toolkits we have applications. And because those layers are cleanly separated, we can do some things that a monolithic graphic system could never do.

Here’s a good example. There was a primitive toolkit, very widely used, called the Athena toolkit. It was a demonstration toolkit for X. It had a very simple, rather ugly set of two-dimensional widgets for buttons, panes and scrollbars and so forth, and so on; and there were a lot of Athena applications out there that sort of started to look old and crufty, because people were moving towards toolkits with nice 3-D burnished widgets. But instead of throwing away all that code, there was a clever hack, which some people [came up with]: they wrote a toolkit — I don’t remember the name right now — with exactly the same interface, exactly the same API, as the Athena toolkit, but with 3-D widgets. It just had a better skin on it. And suddenly we didn’t have to throw away all those old Athena applications, we could just link them with this different toolkit, and suddenly they looked nice and clean and modern, and everybody’s happy with them again.

In a monolithic system you can’t do that. That was only possible because the policy layer, the toolkit, was separated from the mechanism layer.

Rule of Simplicity. Design for simplicity, add complexity only where you must.

This sounds really like a stupid, simple and obvious thing to say, but I want you to think about the economics behind it.

Complexity is an overhead. Complex algorithms, tricky data structures, funny contingent interactions with the environment are a problem because they are difficult to maintain. Often it’s not only difficult to document what that complexity is, but it’s difficult to document what the assumptions behind it are, why you did things that way. And when the environment changes and one of these complex things breaks, it’s very difficult to fix.

Behind this there is another implication you should be catching, which is that the true UNIX programmer thinks of his code as having a lifetime measured not in months or years, but in decades! There is code that was written for version 7 back in 1979 that is still in routine use today. A proper UNIX programmer thinks of the expected life of his code as being 30, 40, 50 years. Because of that, if you’re doing things right, you almost obsess about designing for maintainability and continuity into the future, and for situations in which most of your technical assumptions will have been violated. That’s one reason why you want to design for simplicity, because everything complicated and tricky that you do with the code is an iceberg for people to crack up on, years later.

I see people nodding in the audience, and there’s an interesting pattern here: the probability that someone would be nodding at this is directly proportional to their age. (Audience chuckles.)

A very related insight: Rule of Parsimony.

Write a large program only when it is clear by demonstration that a small program would not do. It’s not enough to believe theoretically that a small program won’t do. You must try to write a small program and fail before you are justified in writing a big one.

And after the amount of time I’ve just spent ranting about complexity, I trust the reason to be obvious: all other things being equal, the internal complexity of a program is proportional to the square of its size. Why do I say that? Well, a fundamental insight about bugs in programs is that most of them consist of unanticipated interactions between different parts of the program. So as the line count goes up, the number of possible destructive interactions rises as the square of the line count.

A lot of the ways in which UNIX programmers obsess about complexity control and simplicity are aimed at dealing with that quadratic curve of rising complexity. If you keep all your line counts small and you have well-defined, simple interfaces between programs, you’re isolating the complexity, you’re keeping it in small pieces, so that the square function doesn’t have a chance to really blow up on you.
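
(To put rough numbers on that square law, my arithmetic, not ESR’s: if every pair of lines is a potential interaction, the counts work out like this. — E.)

    # Back-of-the-envelope: potential pairwise interactions ~ n*(n-1)/2
    def pairs(n):
        return n * (n - 1) // 2

    print(pairs(1000))        # one 1000-line monolith:  499,500 pairs
    print(10 * pairs(100))    # ten 100-line modules:     49,500 pairs

(One 1000-line monolith has about ten times as many potential internal interactions as ten 100-line pieces, even before counting the benefit of the simple interfaces between the pieces. — E.)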

Here’s one [rule] that’s a little more concrete: Rule of Transparency. Design for debuggability. Design for transparency upfront, so that you save debugging effort later.

The economics behind this is also very simple. Very consistently, when we work, we tend to think of the lifecycle of code in terms of “you write the code, you ship it, and you’re done.” Reality isn’t like that. If you look at the lifecycle of code, 75% of it is maintenance after it is shipped. So minimizing the maintenance or debugging that’s necessary is where the real gains are. And this implies that it’s worth spending effort to create transparent places, portholes in your code, where you can look inside and see what’s going on. You want to make the internals of your code monitorable, so that when there’s some kind of bug, you can say “Hmm, if I switch on this bit of logging and that bit of logging and the verbose switch, I will see exactly what my transactions are, and the nature of the breakage will become very obvious when I look at the logs.” However, this doesn’t happen if you can’t make logs.

So an important UNIX design principle — designing for debuggability — is that your debug switches and your verbose flags should not be afterthoughts or accidents. They should not be something that you glue on after you’ve implemented your algorithm and designed your data structures. You should be thinking about your verbose flag, about your debugging switch, while you’re designing your data structures. In fact, you should design your data structures to be debuggable. You should design them so that, for example, when you design a class to manipulate some data structures, always, always, always, always, always include a “cat” class method, whether you think it will ever be used or not. Because if you think it will never be used, you are wrong. Always design a class method that dumps the internal structure of a class in a readable ASCII format. Because the very first time you have to debug a problem that is traceable to the internals of that class, having that “dump” method is gonna save your ass. Trust me on this!
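
(A minimal sketch of such a dump method in Python; the class and its fields are invented by me for illustration. — E.)

    import sys

    class MessageQueue:
        """Toy class with a debugging dump() designed in from the start."""

        def __init__(self):
            self.items = []
            self.dropped = 0

        def push(self, item):
            self.items.append(item)

        def dump(self, out=sys.stderr):
            # Readable ASCII dump of the internal state, for bug reports.
            out.write("MessageQueue: %d queued, %d dropped\n"
                      % (len(self.items), self.dropped))
            for i, item in enumerate(self.items):
                out.write("  [%d] %r\n" % (i, item))

    q = MessageQueue()
    q.push({"to": "user@example.com", "size": 1024})
    q.dump()   # one call, and the whole internal state is in the bug report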

Same thing goes for… Actually, I have a favorite example of this that I like to point out. Some of you may know that I’m the maintainer of a program called fetchmail. It’s a program that’s used to retrieve mail from POP and IMAP sites and [feed it into] your local mail stream so your MTA can see it. The most important single switch in fetchmail is -v, the verbose switch. What this does is that when you run fetchmail with the verbose switch, as it does POP transactions with the remote mail server, and as it does SMTP transactions with your local SMTP listener, every single line of text is dumped to standard output in the proper order, so you can watch it scroll by. And I’m here to tell you that 8 out of 10 of all reported fetchmail bugs are instantly characterized when I see a log of those transactions. 8 out of 10 times it is immediately obvious what is broken and how to fix it.

There are 2 good things about the verbose switch: one, that it makes it easy for me to spot and localize breakages in the code; the other good thing is that because this information is dumped in a textual transportable format, one of the things I can say to people is that when you have a bug, mail me a session log. That’s part of the standard bug reporting procedure, it’s documented on my site: when you find a fetchmail bug, reproduce it and mail me a session log.

So a typical fetchmail debug cycle looks something like this: Somebody sends me a piece of mail, saying “my fetchmail is broken and I don’t quite know how. Here’s the log.” I look at the log, and I see: part of the log is the operating system, the fetchmail version, and an identification of the SMTP listener they’re using; and then I look at the transaction scheme. And 8 out of 10 times I can say “you’re misconfigured, and this is how”, or “Oohh! Aaah! It doesn’t parse that case of address very well, but I can fix that in about 15 minutes.”

This is the way you want to design your code. You want to design it so that it could dump enough of its internal state to make debugging easy. That would save you so much time! If you think the effort to put in extra transparency isn’t worth it, trust me when I say: “penny wise, pound foolish”.

An audience member observes that logs should be stripped of security-sensitive information before they are shipped.

ESR. I did! That’s a good point: the one thing you should not put in debug logs is security-sensitive information. So with the -d option I make a point: it knows when the thing might be emitting security-sensitive information, and it x’s it out. That means it’s safe for anyone to ship session logs around.
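
(Here is a rough sketch, mine and not fetchmail’s actual code, of verbose transaction logging with the credentials x’d out; the function name and the PASS pattern are assumptions made for the example. — E.)

    import re
    import sys

    VERBOSE = "-v" in sys.argv[1:]

    def log_line(direction, line):
        """Dump one protocol line to stdout when running verbose,
        x-ing out anything that looks like a credential."""
        if not VERBOSE:
            return
        line = re.sub(r"(PASS\s+)\S+", r"\1XXXXXXXX", line)   # redact secrets
        print("%s %s" % (direction, line))

    # How it might be used inside a (hypothetical) POP3 exchange:
    log_line("-->", "USER esr")
    log_line("-->", "PASS hunter2")     # shows up as: --> PASS XXXXXXXX
    log_line("<--", "+OK logged in")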

Let’s move on to Robustness! Robustness is the most valuable property a program can have, which is that it is resilient under unexpected stresses and inputs. You can feed it weirdness and it’ll digest it without burping.

How do you get robustness? Robustness has two parents, simplicity and transparency. If you make your code simple, if you avoid complexity and gingerbread and gewgaws in the algorithms and data structures you use, and make it transparent, debuggable, and make it easy to see the internal state, you will have robust code. This is not black magic. If you crank those inputs properly, you will get robustness out.

Next point. Ah, yes, here we go. Rule of Representation! Fold as much knowledge as possible into data, so that program logic can be stupid. Stupid programs are goooood. Stupid is good. Smart data is good. And here’s why. It is easier to reason about, visualize and debug data layouts than it is to debug code. Code is hard. Human minds are not really very well adapted to visualizing things like loops and conditional statements. And if you think about how you visualize them, what do you do? You turn them into graphs. You turn them into data.

It is much easier for human minds to reason correctly about a 50-node tree, or a 50-arc graph, than it is for us to reason about a 50-line procedure. It follows as the night the day that the smartest way to write your code is to put all the smarts in the data structures, the things you can visualize and manipulate easily, and have the code be like a simple-minded little insect that’s basically crawling over the data structure and modifying it in various simple-[…] ways.

So yes, make your data smart and your code stupid. This works much better than the other way around — dumb data, smart code.
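
(An everyday Python-flavored illustration of smart data and dumb code, mine, not ESR’s: the knowledge lives in a table, and the code is a simple loop crawling over it. — E.)

    # The knowledge lives in the data structure...
    UNIT_TABLE = {
        "KB": 1024,
        "MB": 1024 ** 2,
        "GB": 1024 ** 3,
    }

    # ...so the code can stay a simple-minded little loop crawling over it.
    def parse_size(text):
        for suffix, factor in UNIT_TABLE.items():
            if text.endswith(suffix):
                return int(text[:-len(suffix)]) * factor
        return int(text)

    print(parse_size("4MB"))    # 4194304

(Adding a new unit is one new table entry; the code does not change. The equivalent if/else ladder puts the same knowledge where it is hardest to see. — E.)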

Rule of Least Surprise: in the interface design always do the least surprising thing.

A very concrete application of this is that if you are writing a calculator program, addition should always be signified by the “+” sign. Now, this is a mistake most programmers probably wouldn’t make. The flip side of this is that when [you are designing interfaces], you always want to think very carefully about the user’s preexisting knowledge of the problem domain. What are the user’s expectations? What does the user think he or she knows? Use that tacit knowledge, don’t fight it. When you bind the addition operation in the calculator to a “+” sign, you are using the user’s expectations about what mathematical notation looks like, rather than fighting them.

The point of this is that you should resist the urge to be clever and novel. Clever and novel algorithms, clever and novel engines, are often a good idea. Clever and novel interfaces are almost always failures. The reason they’re almost always failures is that an interface that’s really clever and novel and groundbreaking doesn’t connect to anything in the user’s previous experience. And it has to do that, because an interface is an impedance matching device between the user out there and the engine right here. So a significant part of your job is making that interface comfortable for the user — that means matching the user’s expectations, that means following the Rule of Least Surprise.

Rule of Silence. This is one with an interesting history. When a program has nothing interesting to say, it should shut the fuck up. (Audience chuckles.) That’s not how I put it in the book. The way I put it in the book is “if a program has nothing interesting to say, it should say nothing”.

You should permit your user the luxury of knowing that when your program doesn’t bother him, it’s working correctly. It only bothers him when something is going wrong or some unexpected condition comes up. There is a limited exception to that: in programs with long latencies it’s nice to have progress bars. But even progress bars should be as unobtrusive as possible. They should be something that allows the user to get a handle on progress, not something that jumps up and down and screams “Look, look what I’m doing!”

Originally the reason for the rule of silence was that UNIX evolved in an environment of very low speed, low bandwidth I/O devices — 100-[something]-characters-a-minute dot-matrix printers. And you wanted programs that were operating correctly to NOT print out large volumes of text, simply because it took too long! It took too long.

The Rule of Silence survived and continues to have utility today for a different reason. Nowadays we [don’t use] 110 baud printers anymore: we look at GUI displays with all kinds of widgety things on them: graphics, progress bars, […] and so on. So what you save when you make programs quiet is the user’s bandwidth. Human attention is the scarcest resource there is. Human beings can only pay attention to one thing at a time, so your job as a programmer is NOT to graaab the user’s attention and say “here, look at my program!” It’s to help the user allocate his or her attention where the USER thinks it belongs! That’s why you don’t grab the user’s attention unless you need to! Unless there is an actual requirement that the user react to this exception condition right now. Which means, at other times, if the program operates normally and doesn’t have anything interesting or startling to say, it should shut up!

We just had a rule about when to be silent, and now we’ll have a rule about when to be noisy. The Rule of Repair. When you have some kind of fault or exception condition, cope with it and repair it if you can; if you cannot, fail as quickly and as noisily as possible. The reason for this is, if you fail early and you fail noisily, the user knows they have to do something. On the other hand, if you fail late and quietly, your program can end up doing arbitrarily large amounts of secondary damage that wouldn’t have happened if the user had had a chance to intervene.

So, under many circumstances, if you have some kind of exception condition that a program simply cannot deal with, that puts it in an internal state that’s impossible, the right thing is to say loudly (makes a gasping-retching noise) “Boss, I’ve got troubles!” and then quit. So if you have to fail, fail soon, fail loudly.
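
(The two rules side by side, in a sketch of mine: say nothing when all is well, complain early and loudly when you cannot cope. The config-file path is invented. — E.)

    import sys

    def load_config(path):
        try:
            with open(path) as f:
                return f.read()
        except OSError as err:
            # Rule of Repair: fail early, fail loudly, then stop.
            sys.stderr.write("fatal: cannot read %s: %s\n" % (path, err))
            sys.exit(1)

    config = load_config("/etc/example.conf")   # hypothetical path
    # Rule of Silence: on success, print nothing and just do the work.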

Rule of Economy. Programmer time is precious: conserve it in preference to machine time.

Human beings should not do jobs that computers can do. Human beings are too expensive for that. Now, I am just barely old enough, he said doing his doddering old fart imitation (Yes, those were ESR’s words, not mine! — E.)… I’m just old enough to remember when this rule was a bad one. It would have been a bad thing in the bad old days of expensive computers. Computer time was sufficiently scarce that it had to be allocated according to very rigorous schedules. And it made sense to throw human labor at problems in order to make it simple, so that the expensive part of the process — running it through the mainframe — would be affordable. This is the economic imperative that gave rise to horrifying monstrosities like punch card programming, which I am just old enough to have done. If anybody tries to tell you that those good old days were good old days, kick them in the teeth. They were awful.

The computers were so expensive that it made sense to throw labor at problems rather than computer time. Now we are definitely living in a world that’s mostly […] Computers are so cheap that you do not want people doing what computers do.

It has some implications that may be surprising if you haven’t thought it through. One is that in today’s environment, compiled languages like C that don’t do your memory management for you suck pretty hard. They are good things to avoid. Now, I say this as a person who is really, really good at C. I’ve known it for a long time, and not that many people are better at it than I am. Nonetheless, I’m standing in front of you and saying that all that knowledge that I’ve acquired over 25 years of programming, under modern conditions it’s a stupid thing for me to apply most of the time.

There are some exceptions. If you’re writing an operating system kernel, you need absolutely the closest-to-the-machine optimized performance you can get — that sort of thing should be written in C. So there are some things that are entailed by that: you need to write device drivers in C. There are some kinds of service libraries, like for pushing graphics pixels around, that make sense to write in C for the same reason: you really need to crank the maximum benefit out of the hardware.

So you should really be using a scripting language, like Python, Perl or Tcl, or, if you’re as old as I am, LISP.

Question from the audience: why have programmers held on to malloc for so long?

ESR. I think it’s inertia, basically. Programming culture hasn’t adjusted to […] so far. And my book is part of the process to change it. Nowadays I program in Python whenever I can. It has a big advantage: Python code tends to have the best long-term maintainability. If you write it (and I collect scripting languages), there’s a good chance it will still be readable months from now.

Another way to economize on programming time: the Rule of Generation. Avoid hand-hacking. It’s always better to write the code from some kind of declarative specification. The density of bugs per line is independent of the language. That implies that if you’re looking at the system as a whole, if you define the height of the language as the number of […] required to execute […], then with increasing height of the language you’re able to accomplish more stuff in one line, so you are able to have fewer bugs in the code overall than if you wrote it in a low-level language, because the lower-level code has hundreds of times more lines. […] You may have 1 bug maybe for 10 lines of specification, as opposed to 1 per 10 lines of hand-coded parser, where there are a hundred times more lines.

So it’s worth the time, when you have code that has some kind of mechanical pattern above the level of the language you’re using, to try to find a way to generate it: if necessary, write your own specification language, and write your own tool to compile it into code. Very, very often you’ll find that that actually gets you fewer bugs and faster time to market than if you tried to hand-code all that stuff.
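
(A toy version of generating code from a declarative spec, mine, not from the speech: the record layout is described once, and the repetitive parser code is generated from it rather than hand-hacked. The field names and widths are invented. — E.)

    # Declarative spec of a fixed-width record: field name and column width.
    SPEC = [("login", 8), ("uid", 6), ("shell", 20)]

    def generate_parser(spec):
        """Emit Python source for a parser that matches the spec."""
        lines = ["def parse_record(line):", "    rec = {}"]
        pos = 0
        for name, width in spec:
            lines.append("    rec[%r] = line[%d:%d].strip()"
                         % (name, pos, pos + width))
            pos += width
        lines.append("    return rec")
        return "\n".join(lines)

    source = generate_parser(SPEC)
    print(source)     # inspect the generated code...
    exec(source)      # ...or compile it and use it directly
    print(parse_record("esr     1000  /bin/bash           "))

(Change the spec and the parser regenerates itself; there is no hand-written slicing code to get out of sync with the format. — E.)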

Rule of Optimization. Prototype before polishing; get it working before you optimize it. There is no error that has done more to screw up designs, make them complicated and make them buggy, no error that does more to promote bad programming, than optimizing before you’ve got it working; than obsessing about the resource you think you’re gonna be bottlenecked on before you actually measure. The primary cause of elaborate, fragile data structures is people thinking “oh, I’m gonna run out of space”, or “oh, I’m gonna run out of time because this particular operation has to be fast”. That’s the main cause of elaborate data structures and elaborate code, and consequently, it’s the main cause of bugs! Premature optimization is the root of all evil! Don’t do it!

Think about your design, think about your functional requirements first, rough up a prototype in a scripting language, put it through some tests, and instead of speculating and anticipating about where the performance bottlenecks are, run it and measure. Do your first implementation in Python, Perl, Ruby or Tcl, and measure its performance. About 60% or more of the time you’ll discover your prototype is deliverable. You don’t have to go to C. You don’t have to do optimization. Part of the reason for that [for discovering that 60% of the time your prototype is deliverable] is that processor time and memory are getting cheaper on the Moore’s law curve; waiting 6 months will get you a 25% improvement in performance for free!

Think about that. Waiting 6 months gets you a 25% improvement in performance for free. The alternative is to spend 7 months coding to get that 25% improvement in performance, at which point you’re stuck with a huge amount of complexity for a performance gain that you would have gotten cheaper and faster by buying new hardware, or by upgrading your existing hardware. So: optimization: don’t do it! Don’t do it until you’ve measured and […]
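
(For completeness, a measuring sketch of mine: profile the prototype before deciding anything needs rewriting in C. — E.)

    import cProfile

    def prototype(n):
        # The naive, obviously-correct version you suspect is "too slow".
        return sum(i * i for i in range(n))

    # Measure first; only the functions the profiler actually blames
    # are candidates for optimizing or rewriting in C.
    cProfile.run("prototype(10**6)")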

The Rule of Diversity: distrust all claims of one true way. This is a particular problem in the design of computer languages, which I tend to think about a lot, because that’s what I like to do. When I get my choice, I like to design special-purpose languages for application control (?).

But this principle applies elsewhere too. I used to be a big fan… I started programming in LISP. LISP is a wonderful language. It teaches you amazingly powerful ways to think about programming, to think about algorithms. LISP is pretty much almost dead at this point, with one exception: there is a LISP embedded inside an editor called emacs that a lot of people write things for.

An audience member makes a comment that could not be heard.

ESR. Oh, I see we have a vi heretic here! Remember, you can’t spell “evil” without vi. [Audience snickers]

The reason that LISP is almost dead, outside of [emacs], is that LISP environments typically wanted to pull the entire universe into themselves and had great difficulty talking to anything else. Whenever you do a language design… another classic example of this is PROLOG. Whenever you do a language design where some programmer or some academic or some computer scientist gets obsessed with the idea that this one angle, this one approach to programming is supercool and can solve all the problems, it generally turns out to be a crock. And the reason isn’t that this one way isn’t a solution to a very wide class of problems: often it is. The problem is that the programming language or environment can’t talk to anything else. It can’t cooperate with other programs.

A current language that very much has this problem, despite the fact that it’s beautiful and brilliantly designed, is Smalltalk. Smalltalk is wonderful — as long as you stay inside the confines of the Smalltalk world. The moment you have to step outside it and talk to something that isn’t Smalltalk, you’re screwed.

So, “distrust all claims for One True Way” means, in particular, always designing your programs, your languages, your environments, so that they make good glue, so that they can talk to the rest of the world, so that they can call programs that are not inside your universe, and so that they can be called by programs that are outside your universe.

And our final design principle, before our time is completely over, is Rule of Extensibility: design for the future because it will be there sooner than you think.

Whenever you write a program, whenever you design a data structure, whenever you design a language, leave room for it to grow. Have extensibility mechanisms that are built in, part of the architecture, not kludges and afterthoughts. When you define a network protocol, design it so that more verbs can be added in the future without breaking the protocol engine.

And there will always be somebody who’s gonna stress your program, or your language, or your network protocol, in a way you never even imagined. And at that point there better be a good extensibility mechanism on it, or it’s gonna break big time.
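
(A common way to leave that room in a protocol engine, sketched by me with invented verbs: dispatch through a table, so a new verb is one new entry and an unknown verb gets a defined, non-fatal reply instead of breaking the engine. — E.)

    def cmd_ping(args):
        return "PONG"

    def cmd_echo(args):
        return args

    # Adding a verb later is one new function plus one table entry;
    # nothing else in the engine changes.
    VERBS = {"PING": cmd_ping, "ECHO": cmd_echo}

    def dispatch(line):
        verb, _, args = line.partition(" ")
        handler = VERBS.get(verb.upper())
        if handler is None:
            return "ERR unknown verb"   # defined behavior, not a crash
        return handler(args)

    print(dispatch("PING"))          # PONG
    print(dispatch("FROB twiddle"))  # ERR unknown verb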


So I’ve come to the end of my design principles, so at this point I’ll just throw it completely open for questions.


Audience Voice 1. How do you reconcile the rule of silence with Linux boot screen controversy: “tell me when you’ve finished booting” vs “tell me about every service you start”?

ESR. I think the right thing is what Fedora Core 2 happens to do. Fedora Core 2 has a simple boot screen that says “I’m booting, I’m booting, I’m booting”. It goes to a more detailed mode, where you get to see the progress messages from the service startups, when it gets a failure, because that’s when you need to know. And that fulfills another one of the design principles, which is “when you hit an error, fail early and start dumping information about what the potential problem is.” So, yeah, it’s not hard to reconcile. You let the boot process happen out of sight until the user needs to know what’s going on. Is that a useful answer?

Audience Voice 1 mumbles agreement.

ESR. Next.

Audience Voice 2. Do you have a rant on XML, how it should or should not be used in UNIX programming? ’Cause it seems to contradict several of your principles.

ESR. No, but if you give me a minute or so, I’ll fake it. (Laughter in the audience.)

XML is clearly a good tool for some things. It is a valuable tool for data formats that are not naturally line-structured. In particular, I’m a big fan of XML-based document markups. Because that’s the kind of data that isn’t naturally line-structured. It’s naturally paragraph-structured, or naturally word-structured, but isn’t naturally line-structured. Whitespace is not really significant in those texts. So for document markup formats XML makes a lot of sense.

There is, unfortunately, a tendency for people to overuse XML in contexts where simpler formats would be more appropriate. And in fact, I talk about that in my book; I have an entire chapter on the UNIX tradition of how to design textual formats, how to use the simplest class of format that you can make fit your data semantics. So, for example, I wouldn’t use an XML format for tabular data. If we know the data has tabular structure, then XML doesn’t make any damn sense: you just use whitespace-separated textual columns. Or you do the classic UNIXy thing, which I call — I had to invent a term for this, because there aren’t many […] — those of you in Microsoft land know a format called CSV, Comma-Separated Values? The UNIX-tradition generalization of that is DSV, delimiter-separated values. In UNIX land more often the delimiter is a colon, for various historical reasons.

So when you have tabular data, XML doesn’t make a lot of sense; you are better off with something like a DSV format.

Audience Voice 2 says something inaudible.

ESR. That’s why one of the other things that you need in the UNIX tradition, which I talk about in the book, is: whenever you design a DSV format, it must have an escaping mechanism. And in fact there are strong traditions in the UNIX world about what that escaping mechanism should look like. It’s traditional in UNIX textual formats that the escape mechanism is backslash. And there are certain standard […]: \r is always a carriage return, \n is always a line feed, and it’s traditional that you […], and if you are doing a DSV format, \: is […]. And it is a sufficiently strong one that when I see a DSV format, I know what I expect the escapes to look like, and if they don’t look like that, it’s a bug. It’s a design flaw. For somebody operating within the UNIX tradition, if I go back and say “you forgot to cover the backslash case”, they’ll go “oops, my bad”.
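
(A minimal sketch of a colon-delimited DSV writer and reader with backslash escaping; the exact escape set here is my guess at the convention he describes, not a quotation of any standard. — E.)

    # Write and read colon-delimited DSV with backslash escapes,
    # so fields may themselves contain ':', '\' and newlines.
    ESCAPES = {"\\": "\\\\", ":": "\\:", "\n": "\\n", "\r": "\\r"}

    def dsv_escape(field):
        return "".join(ESCAPES.get(ch, ch) for ch in field)

    def dsv_join(fields):
        return ":".join(dsv_escape(f) for f in fields)

    def dsv_split(line):
        fields, cur, i = [], "", 0
        while i < len(line):
            ch = line[i]
            if ch == "\\" and i + 1 < len(line):
                nxt = line[i + 1]
                cur += {"n": "\n", "r": "\r"}.get(nxt, nxt)
                i += 2
            elif ch == ":":
                fields.append(cur)
                cur = ""
                i += 1
            else:
                cur += ch
                i += 1
        fields.append(cur)
        return fields

    record = ["esr", "x", "C:\\not\\unixy", "/home/esr"]
    line = dsv_join(record)
    assert dsv_split(line) == record    # round-trips, escapes and all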

Audience Voice 3. More about XML. What do you think about an observation that XML is just a poser for S-expressions?

ESR. Oh, absolutely. XML is just S-expressions with angle brackets and better marketing. Obviously true. Next question?

Audience Voice 4 (The question could not be heard.)

ESR. OK. Things UNIX has not got right. Well, one of the things that UNIX did right was unifying the notion of a stream device with a file. That was brilliant. That was wonderful. Howsomeever, we didn’t go far enough. There are classes of devices that could be unified with stream-like thingies, and often enough are not. I’m referring, of course, to Berkeley sockets, which do not live in the same namespace as files. This is a design error. It is possible to fix it. To learn how it’s possible to fix it, go to the Plan 9 talk tonight. Or was that last night? OK. So get into your time machine and go to the Plan 9 talk last night.

So, we didn’t go far enough in unifying all of the namespaces in the operating system. Sockets should not live in any different namespace from ordinary files and devices. That’s a major blunder.

For a classic example of a minor blunder, possibly the single most egregious user interface mistake in the history of UNIX is the fact that there is a distinction between tabs and spaces in makefiles.

(The audience emits a collective “uggh”.)

ESR. I see that I don’t even need to explain that. (Audience chuckles.)

Audience Voice 4. Why isn’t Linux fixing the sockets problem?

ESR. Gradually this is happening in the Linux world. One of the things we have now in Linux (people haven’t exploited it much yet) is user-space daemons that supply the semantics for file systems. They hook into the kernel through a driver, and then the computations are actually done in user space. And one of the cute things this lets you do, which hasn’t been done yet, but it’s gotta be coming in months now, is a […] Plan 9 trick, where in order to do ftp you mount a remote site onto your local machine, and all of the ftp requests to do I/O with that site actually indirect through this userspace file system. So Linux is actually moving in that direction. We will get there. But it’s gonna take a while.

Audience Voice 6 (comment could not be heard).

ESR. That’s right. You’re absolutely right. That’s an example where UNIX gets its own principles wrong. And the designers of UNIX came back around in the early 1980s, realized it and fixed it; unfortunately, that technology has yet to percolate into the wider UNIX world. So, as I say: we will get there. I’m confident that we will get there.

Audience Voice 7. Do you have an opinion on GNU/Hurd?

ESR. Aaah, boy! GNU/Hurd! In 1996, I was on the program committee of a conference called FRS, Freely Redistributable Software. This was in Cambridge, and it was sponsored by the FSF. And I remember, I spent one morning of the conference listening to a design exposition of GNU/Hurd. Keith […], who wrote the BSD license, was a couple of seats away from me. He looked at me with a dubious expression and asked “what do you think of Hurd?” I answered: it’s beautiful, it’s elaborate, and it’s doomed. I realized I was listening to a description of a project that never focused on actually building a deliverable product. Instead they focused on building a cathedral with more and more elaborate arches and gargoyles. That’s a cautionary tale for all of us.

Audience Voice 8. What do you think of OS X?

ESR. It’s beautiful, it has some kludges in the middle, and I’ll never use it, because I’ll never be dependent on proprietary code.

Audience Voice 9. With acceptance of model-based architecture, UML and that kind of thing, do you believe there will be open-model software as opposed to open-source?

ESR. Can’t comment, I know nothing about UML.

(There were more questions and answers, but the tape recording of them was too poor, so they remain untranscribed.)

And here are more of Eric Raymond’s rules:

Rule of Modularity

Rule of Clarity