Thoughts on Lisp #0: what is code?
Programming languages are tools not only for commanding computers but also for communicating and structuring our own thinking, and each of them shapes how we reason about certain problems.
In this post I reflect on a few characteristics of Lisp (and specifically Racket) that captivate me again and again.
Syntax vs data
When I write, say, Python (which is a great language!), I tend to have this mental image of writing a certain syntax that’s then parsed into the actual logical structure (the abstract syntax tree or AST) of the program by the interpreter / compiler, similar to how HTML is translated into the DOM in memory. Of course, I rarely consciously think about that distinction, but when switching to/from Racket I notice this nuance. The syntax is just an arbitrary set of rules to write out the program in this specific form, and alternative syntaxes for Python exist.
In contrast, when I write Racket, conceptually it feels like I’m typing out a data structure that represents the algorithm(s) of the program (very close to the AST). Obviously there are still many translation steps between the input I provide and the result (a running program) inside the runtime; but the relationship between the source code and the actual program is somewhat special.
The famous parentheses in a Lisp program map directly to its logical tree-like structure.
Let’s take a very primitive example:
Python:
1 + 2 + 3 ** 2 * 4
Racket:
(+ 1 2 (* (expt 3 2) 4))
A few things to note in the Racket example:
- The computation is directly dictated by the structure of the expression, delimited by pairs of `(...)`, and not by some other factor like operator precedence, i.e., the way it is written expresses how it is computed;
- Most expressions follow the pattern `(<OPERATION> [<ARGUMENT> ...])`;
- Arithmetic operators like `+`, `-`, `expt` are not special syntax but regular functions: the expression `(+ 1 2 3)` means apply the function `+` to arguments `1`, `2`, `3`.
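To make the first point concrete, here is a minimal Python sketch that asks the standard `ast` module for the tree it derives from the Python expression above and prints that tree in prefix form. The helper name `to_prefix` and the operator names in `OPS` are assumptions of this sketch, chosen to resemble the Racket version.

```python
import ast

# Map Python operator node types to prefix names (names chosen for this
# sketch so that the output resembles the Racket example).
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Pow: "expt"}


def to_prefix(node):
    """Render a small arithmetic AST as a parenthesized prefix string."""
    if isinstance(node, ast.Expression):
        return to_prefix(node.body)
    if isinstance(node, ast.BinOp):
        op = OPS[type(node.op)]
        return f"({op} {to_prefix(node.left)} {to_prefix(node.right)})"
    if isinstance(node, ast.Constant):
        return repr(node.value)
    raise ValueError(f"unsupported node: {type(node).__name__}")


tree = ast.parse("1 + 2 + 3 ** 2 * 4", mode="eval")
print(to_prefix(tree))  # (+ (+ 1 2) (* (expt 3 2) 4))
```

The nesting that Python's parser works out from its precedence rules is essentially the nesting that the Racket version spells out by hand; the only difference is that Racket's `+` takes all three arguments at once.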
Thus, there is a symmetry in Racket between writing `(+ 1 2 3)` and, say, `(build-path "/" "tmp" "somefolder" "somefile.json")` - both are expressions applying functions to values, and returning values.
Also, just like any function, `+` can be passed to another function as an argument. The following expression:
(map (curry + 5) '(5 10 15))
results in `'(10 15 20)`. The single quote before `(` is a shorthand for `(quote ...)` and means that what follows is not evaluated as a function call, but returned as a list instead (we will talk about this more below).
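For comparison, a rough Python counterpart of the `map` expression, assuming `operator.add` as the function form of `+` and `functools.partial` as a stand-in for `curry`. In Python the infix `+` itself cannot be passed around, so the standard library provides a function that does the same job.

```python
from functools import partial
from operator import add

# partial(add, 5) plays the role of (curry + 5): a function that is still
# waiting for its second argument.
add_five = partial(add, 5)

result = list(map(add_five, [5, 10, 15]))
print(result)  # [10, 15, 20]
```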
Code as a tree
Let’s look at another example, say, computing whether a given number is a Fibonacci number:
Python:
from math import isqrt

def is_fibonacci(num):
    if num >= 0 and float(num).is_integer():
        intermediate = 5 * num ** 2  # save some repetition
        if (is_square(intermediate + 4)
                or is_square(intermediate - 4)):
            return True
    return False

def is_square(num):
    return num == isqrt(num) ** 2
Racket:
(define (fibonacci? num)
  (cond ; conditional
    [(and (integer? num) (>= num 0)) ; clause
     (define intermediate (* 5 (expt num 2)))
     (or (square? (+ intermediate 4))
         (square? (- intermediate 4)))]
    [else #f])) ; clause

(define (square? num)
  (equal? num (expt (integer-sqrt num) 2)))
Again, in the Racket program nested parentheses define the “shape” of the computation. Pairs of `[...]` are syntactically equivalent to `(...)` and used only by convention.
Conditionals like `cond` (a generalized `if`), `and`, `or`, etc. follow the same pattern, `(<OPERATION> <INPUTS>)`, and also produce a value. However, each conditional expects a specific shape for its inputs; for instance, `cond` takes a sequence of clauses mapping tests to values.
Since conditionals are expressions that evaluate to values, there is no need for an explicit `return` - the evaluated expression is the value:
- `(and ...)` tests if none of its subexpressions are false (`#f`) and returns the value of the last one, or `#f` otherwise;
- `(or ...)` returns the value of the first non-false subexpression, or `#f` if there is none;
- and `cond` returns whatever the last expression of the matching clause returns.
Functions return the value of the last expression in their body.
Overall, the program looks more like a tree of expressions that produce values to give the final answer, rather than a chain of actions or commands. This mental attitude is something I admire in Lisp.
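The same expression-oriented shape can be approximated in Python. The sketch below is only an illustration: it restricts the input to `int` (an assumption made here, not part of the version above) and guards `is_square` against negative numbers. Python's `and` / `or` also return one of their operands, although Python treats more values as false than Racket does (`0`, `""`, empty containers), so the analogy is loose.

```python
from math import isqrt


def is_square(num):
    # isqrt only accepts non-negative integers, so reject negatives first.
    return num >= 0 and num == isqrt(num) ** 2


def is_fibonacci(num):
    # One expression, no intermediate statements: the value of the
    # and/or tree is the value of the function.
    return (
        isinstance(num, int)
        and num >= 0
        and (is_square(5 * num ** 2 + 4) or is_square(5 * num ** 2 - 4))
    )


print([n for n in range(15) if is_fibonacci(n)])  # [0, 1, 2, 3, 5, 8, 13]

# Like Racket's and/or, Python's versions hand back one of their operands.
print(3 and 4)      # 4
print(None or "x")  # x
```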
Code as data, data as code
One last bit I’d like to touch on here. The source code in Lisp is actually a data structure, not just a particularly shaped long string.
For instance, with this command (executed in a Unix shell):
$ echo '(+ 1 2 3)' | racket -e '(read)'
'(+ 1 2 3)
…I have just parsed a chunk of source code `"(+ 1 2 3)"` into data - a list consisting of the symbol `+` and three numbers `1`, `2` and `3` - but have not executed it, just printed it back as data again. This small data structure can also be executed:
$ echo '(+ 1 2 3)' | racket -e '(eval (read))' # this is what happens normally
6
Notice how, conceptually, it’s not the source code that is executed, but the data structure - a list of items - that is `read` from the source code.
Reading is a separate, distinct operation from executing.
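Python exposes a similar two-step split through its standard library, which may help to see the distinction outside of Lisp. In this minimal sketch `ast.parse` plays the role of `read` and the built-in `compile` plus `eval` play the role of `eval`; the file name `"<example>"` is an arbitrary label. Unlike in Racket, the tree is a dedicated AST object rather than an ordinary list.

```python
import ast

source = "1 + 2 + 3"

# Step 1: read. The text becomes a tree; nothing is computed yet.
tree = ast.parse(source, mode="eval")
print(type(tree).__name__)  # Expression

# Step 2: evaluate. The tree is compiled and run.
result = eval(compile(tree, "<example>", "eval"))
print(result)  # 6
```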
Nothing stops you from doing something with the data before executing it. For instance, running the following program would produce the familiar `"hello world"`:
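(define my-program
  ;; the single quote before the list means `read` that thing
  ;; as data but do not `eval` it yet.
  '(string-append "world" " " "hello"))

;; inside a module `eval` needs an explicit namespace;
;; `make-base-namespace` provides one that knows `string-append`.
(eval (cons (first my-program) (reverse (rest my-program)))
      (make-base-namespace))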
In the above example I reversed all but the first element of `my-program` before executing it.
Now, this is obviously not what most Lisp programmers would normally do. But the fact that your program is a data structure changes how you think about it.
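The same trick can be imitated in Python, with more ceremony, by editing the parsed tree before compiling it. This is a sketch under assumptions: `concat` is a hypothetical helper defined only for this example, standing in for `string-append`.

```python
import ast


def concat(*parts):
    """Hypothetical stand-in for Racket's string-append."""
    return "".join(parts)


# Read the source into a tree without running it.
tree = ast.parse('concat("world", " ", "hello")', mode="eval")

# The body of the expression is a call node; reverse its argument list
# in place, leaving the function name untouched.
call = tree.body
call.args.reverse()

# Only now is the (modified) tree compiled and executed.
result = eval(compile(tree, "<example>", "eval"), {"concat": concat})
print(result)  # hello world
```

In Racket the program is a plain list, so `first`, `rest` and `reverse` are enough; in Python the tree has to be manipulated through the node types of the `ast` module.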
You can also `read` things you don’t plan to execute at all. For instance, you can point the `read`ing machinery of the interpreter at a configuration file without ever executing what’s inside, just using it as passive data.
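The closest Python counterpart I know of is `ast.literal_eval`, which parses a string of literals into data and refuses anything that would require execution. The configuration text below is a made-up example.

```python
import ast

# A hypothetical configuration snippet, written with Python literal syntax.
config_text = "{'name': 'demo', 'retries': 3, 'tags': ['a', 'b']}"

config = ast.literal_eval(config_text)
print(config["retries"])  # 3

# Anything that is not a plain literal (here, a function call) is rejected
# rather than executed.
try:
    ast.literal_eval("print('side effect')")
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```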
Nevertheless, writing programs that transform themselves before being executed is very common in the Racket world, although this is done in a much cleaner and more declarative way using macros.
But why, you ask? Because programmers love abstractions and macros provide a way to abstract out patterns of code that emerge in this or that specific domain (or across domains!).
I’ll end this post with a final example: a quite popular instance of macro use in Racket, providing a nice way to “pipe” values sequentially through a chain of operations:
#lang racket ; selecting the main Racket dialect
(require threading)

(~> "world hello"
    (string-split " ")
    (reverse)
    (string-join " ")
    (string-append "💜"))
The `(~> ...)` fragment actually translates to:
(string-append
 (string-join
  (reverse
   (string-split "world hello" " "))
  " ")
 "💜")
The `~>` form is not a language feature or special built-in syntax - it is a macro coming from a third-party package. Macros are Lisp functions that transform Lisp code before it is executed.
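For contrast, here is how the same pipeline might be approximated in Python without macros. This is a loose analogue only: `thread` is a hypothetical helper written for this sketch, and it is an ordinary function that runs at execution time, applying each step to the previous result, rather than rewriting the code beforehand.

```python
from functools import reduce


def thread(value, *steps):
    """Pass value through each step; a step is (function, *extra_args)."""
    return reduce(lambda acc, step: step[0](acc, *step[1:]), steps, value)


result = thread(
    "world hello",
    (str.split, " "),                        # like (string-split " ")
    (lambda words: list(reversed(words)),),  # like (reverse)
    (lambda words: " ".join(words),),        # like (string-join " ")
    (lambda text: text + "💜",),             # like (string-append "💜")
)
print(result)  # hello world💜
```

In Racket, by contrast, the rewriting happens once, on the code itself, before the program runs, so the pipeline costs nothing beyond the nested calls it expands into.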
As you can see, the language allows great creativity and flexibility in applying itself (code) to itself (data), so to speak.
(Recursion in general is also practiced quite often by Lispers, but that is a topic for a dedicated blog post).