David A. Wheeler's Blog

Sat, 17 Jun 2006

Readable s-expressions and sweet-expressions: Getting the infix fix and fewer parentheses in Lisp-like languages

Lisp-based programming languages normally represent programs as s-expressions, where an operation and its parameters are surrounded by parentheses. The operation to be performed is identified first, and each parameter afterwards is separated by whitespace. So the traditional “2+3” is written as “(+ 2 3)” instead. This is regular, but most people find this hard to read. Here’s a longer example of an s-expression - notice the many parentheses and the lack of infix operations:

 (defun factorial (n)
   (if (<= n 1)
       1
       (* n (factorial (- n 1)))))

I think there’s a small resurging interest in Lisp-based systems, because Lisp is still very good at “programs that manipulate programs”. The major branches of Lisp (Common Lisp, Scheme, and Emacs Lisp) have not disappeared, after all. And I recently encountered a very cool and very new language in development, BitC. This language was created to write low-level programs (e.g., operating system kernels and real-time programs) that are easy to mathematically prove correct. I learned about this very cool idea while writing my paper High Assurance (for Security or Safety) and Free-Libre / Open Source Software (FLOSS)… with Lots on Formal Methods. BitC combines ideas from Scheme, ML, and C, but it’s represented using s-expressions because it’s easy to manipulate program fragments that way. I don’t know how well it’ll succeed, but it has a good chance; if nothing else, I don’t know of anyone who’s tried this particular approach. The program-prover ACL2 uses Common Lisp as a basis, for the same reason: program-manipulating programs are easy. The FSF backs guile (a Scheme dialect) as their recommended tool for scripting; guile gives lots of power in a small package.

But many software developers avoid Lisp-based languages, even in cases where they would be a good tool to use, because most software developers find s-expressions really hard to read. S-expressions are very regular… but so is a Turing machine. They don’t call it ‘Lots of Irritating Superfluous Parentheses’ for nothing. Even if you can read it, most developers have to work with others. Some people like s-expressions as they are - and if so, fine! But many others are not satisfied with the status quo. Lots of people have tried to create easier-to-read versions, but they generally tend to lose the advantages of s-expressions (such as powerful macro and quoting capabilities). Can something be done to make it easy to create easier-to-read code for Lisp-like languages - without spoiling their advantages?

I think something can be done, and I hope to spur a discussion about various options. To get that started, I’ve developed my own approach, “sweet-expressions”, which I think is actually a plausible solution.

A sweet-expression reader will accept the traditional s-expressions (except for some pathological cases), but it also supports various extensions that make it easier to read. Sweet-expressions are automatically translated into s-expressions, so they lose no power. Here’s how that same program above could be written using sweet-expressions:

 defun factorial (n)         ; Parameters can be indented, but need not be
   if (n <= 1)               ; Supports infix, prefix, & function <=(n 1)
       1                     ; This has no parameters, so it's an atom.
       n * factorial(n - 1)  ; Function(...) notation supported

Sweet-expressions add the following abilities:

Indentation. Indentation may be used instead of parentheses to start and end expressions: any indented line is a parameter of its parent, later terms on a line are parameters of the first term, lists of lists are marked with GROUP, and a function call with 0 parameters is surrounded or followed by a pair of parentheses [e.g., (pi) and pi()]. A “(” disables indentation until its matching “)”. Blank lines at the beginning of a new expression are ignored. A term that begins at the left edge and is immediately followed by a newline is executed right away, to make interactive use pleasant.
Name-ending. Terms of the form ‘NAME(x y…)’, with no whitespace before ‘(’, are interpreted as ‘(NAME x y…)’. Parameters are space-separated inside. If the content is an infix expression, it is treated as a single parameter instead (so f(2 + 3) passes its one parameter, 5, to f).
Infix. Optionally, expressions are automatically interpreted as infix if their second parameter is an infix operator (by matching an “infix operator” pattern of symbols), the first parameter is not an infix operator, and the expression has at least three parameters. Otherwise, expressions are interpreted as normal “function first” prefix notation. To disable infix interpretation, surround the second parameter with as(…). Infix expressions must have an odd number of parameters, with the even ones being binary infix operators. You must separate each infix operator with whitespace on both sides; precedence is supported. Use the “name-ending” form for unary operations, e.g., -(x) for “negate x”. Thus “2 + y * -(x)” is a valid expression, equivalent to (+ 2 (* y (- x))). Infix operators must match this pattern (and in Scheme cannot be =>):
```
    [+-\*/<>=&\|\p{Sm}]{1,4}|\:
```
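
As a rough illustration of the infix rule above, here is a minimal Python sketch (not the Scheme reader mentioned later in this post). It assumes the pattern's repetition count means one to four characters. Python's built-in `re` module has no `\p{Sm}` class, so the sketch checks the Unicode category with `unicodedata` instead. The post says precedence is supported but gives no table, so `ASSUMED_PRECEDENCE` is purely hypothetical, as are the function names. The as(…) escape and the Scheme `=>` exclusion are not handled, and nested terms are assumed to be converted already.

```python
import unicodedata

ASCII_OPERATOR_CHARS = set("+-*/<>=&|")

# ASSUMPTION: this table is not from the post; higher numbers bind tighter.
ASSUMED_PRECEDENCE = {"*": 3, "/": 3, "+": 2, "-": 2}
DEFAULT_PRECEDENCE = 1  # comparisons and any other operator


def is_infix_operator(token):
    """True if token is ':' or 1 to 4 operator characters (ASCII set or Unicode Sm)."""
    if not isinstance(token, str):
        return False
    if token == ":":
        return True
    if not 1 <= len(token) <= 4:
        return False
    return all(
        ch in ASCII_OPERATOR_CHARS or unicodedata.category(ch) == "Sm"
        for ch in token
    )


def looks_infix(terms):
    """Detection rule: at least 3 terms, 2nd is an operator, 1st is not."""
    return (
        len(terms) >= 3
        and is_infix_operator(terms[1])
        and not is_infix_operator(terms[0])
    )


def to_prefix(terms):
    """Convert one infix term list to nested prefix lists; leave prefix lists alone."""
    if not looks_infix(terms):
        return terms
    operators = terms[1::2]
    if len(terms) % 2 == 0 or not all(is_infix_operator(op) for op in operators):
        raise ValueError("infix needs an odd number of terms with operators in even positions")
    operands = terms[0::2]

    def precedence(op):
        return ASSUMED_PRECEDENCE.get(op, DEFAULT_PRECEDENCE)

    def parse(pos, min_prec):
        # Precedence climbing; operators[pos] follows operands[pos].
        left = operands[pos]
        while pos < len(operators) and precedence(operators[pos]) >= min_prec:
            op = operators[pos]
            right, pos = parse(pos + 1, precedence(op) + 1)
            left = [op, left, right]
        return left, pos

    tree, _ = parse(0, 0)
    return tree
```

With these assumptions, `to_prefix(["n", "<=", 1])` gives `["<=", "n", 1]`, and the post's example, written as `[2, "+", "y", "*", ["-", "x"]]`, gives `["+", 2, ["*", "y", ["-", "x"]]]`.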

I call this combination “sweet-expressions”, because by adding syntactic sugar (which is essentially a set of abbreviations), I hope to create a sweeter result.
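
To give a feel for the indentation rule on its own, here is a minimal Python sketch under simplifying assumptions: input contains no parentheses, no GROUP markers, and no infix or name-ending forms, and comments start with `;`. The names `read_indented` and `to_sexpr` are hypothetical, and this is not the Scheme reader described in this post. An indented line becomes a parameter of its parent, later terms on a line become parameters of the first term, and a lone term with nothing indented beneath it stays an atom.

```python
def read_indented(text):
    """Translate indentation-structured text into a list of nested Python lists."""
    lines = []
    for raw in text.splitlines():
        code = raw.split(";", 1)[0].rstrip()  # drop ';' comments
        if code.strip():                      # skip blank lines
            indent = len(code) - len(code.lstrip(" "))
            lines.append((indent, code.split()))

    def parse(index):
        # Parse the expression that starts at lines[index].
        indent, terms = lines[index]
        expr = list(terms)
        index += 1
        # Every following line that is indented deeper is a parameter.
        while index < len(lines) and lines[index][0] > indent:
            child, index = parse(index)
            expr.append(child)
        if len(expr) == 1:
            return expr[0], index  # a lone term is an atom
        return expr, index

    results = []
    index = 0
    while index < len(lines):
        expr, index = parse(index)
        results.append(expr)
    return results


def to_sexpr(expr):
    """Render nested lists as a traditional s-expression string."""
    if isinstance(expr, list):
        return "(" + " ".join(to_sexpr(item) for item in expr) + ")"
    return str(expr)


EXAMPLE = """
if
  <= n 1
  1            ; a lone term stays an atom
  * n
    factorial
      - n 1
"""
```

Under these assumptions, `to_sexpr(read_indented(EXAMPLE)[0])` yields `(if (<= n 1) 1 (* n (factorial (- n 1))))`, the body of the factorial function shown earlier.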

For more information on sweet-expressions or on making s-expressions more readable in general, see my website page at http://www.dwheeler.com/readable. For example, I provide a sweet-expression reader in Scheme (under the MIT license), as well as an indenting pretty-printer in Common Lisp. In particular, you can see my lengthy paper about why sweet-expressions do what they do, and some plausible alternatives. You can also download some other implementation code.

I’ve set up a SourceForge project named “readable” to discuss options for making s-expressions more readable, and to distribute open source software that implements them (unimplemented ideas don’t go far!). I will probably need to work on other things for a while, but since I had this idea, I thought it’d be good to write it up along with a quick sample demo, so that others could build on it. There hasn’t been a single place for people to discuss how to make s-expressions more readable, so now there is one. There are a lot of smart people out there; giving like-minded parties a place to discuss these ideas is likely to produce something good. If you’re interested in this topic, please visit/join!

path: /misc | Current Weblog | permanent link to this entry

Learning from the Masters

If you want to learn something, study what the masters do. To me that seems obvious, and yet many don’t do it. Perhaps we simply forget. So let me inspire you with a few examples…

I just got an advance copy of David Shenk’s “The Immortal Game: A history of chess” - and I’m referenced in it! Which is an odd thing; I don’t normally think of myself as a chess commentator. But I do like the game of chess, and one of my key approaches to getting better is simple: Study the games of good players. I’ve even posted a few of the games with my comments on my web site, including The Game of the Century (PGN/Text), The Immortal Game (PGN/Text), The Evergreen Game (PGN/Text), and Deep Blue - Kasparov, 1996, Game 1 (PGN/Text). It’s my Byrne/Fischer writeup that was referenced in Shenk’s book. But I didn’t create that stuff for a book, originally. I can’t play like these great players can, but I get better by studying what they do. In short, I’ve found that I must study the work of the masters.

There are many children’s educational philosophies that include, at least in part, the notion of studying good examples. Ruth Beechick’s “natural method” for teaching writing emphasizes starting by copying and studying examples of great writing. She even notes that Jack London and Benjamin Franklin started by studying works they admired. Learning begins by studying the work of the masters.

I often write about free-libre/open source software (FLOSS). In part, I do so because it’s an amazingly interesting development. But there are other reasons, too. Some developers of FLOSS programs are the best in the business - you can learn a lot by seeing what they do. In short, one important advantage of FLOSS is that it is now possible for software developers to study the work of the masters.

I recently wrote the article High Assurance (for Security or Safety) and Free-Libre / Open Source Software (FLOSS)… with Lots on Formal Methods (aka high confidence or high integrity) (I gave it the long title to help people find it). In it, I note the many tools available for creating high assurance software - but there are precious few FLOSS examples of high assurance software. True, there are very few examples of high assurance software, period, but where are the high assurance software components that people can study and modify without legal encumbrances? (If you know of more, contact me.) That worries me; how are we supposed to teach people how to create high assurance software, if students never see it? People do not wake up one morning and discover that they are experts. They must learn, and books about a topic are not enough. They must study the work of the masters.
