When I was taking B521 last October, there was a particular programming exercise I had to do for a homework assignment. After a long struggle, I made my code work, but it was bad. Let me tell you all about how bad it was!
Suppose you have to deliver a small, a medium, and a large pizza to each house in your neighborhood. All of the pizzas -- all the small, all the medium, and all the large -- will fit in your pizza-delivery vehicle. But you start out with only the small pizzas. You stop at each house and deliver a small pizza. Then you go and get the medium pizzas, and repeat. Finally, you deliver a large pizza to each house.
Of course, the more efficient way to go about it would be to deliver the small, medium, and large pizzas to each house at the same time. After all, it doesn't matter in which order they receive the pizzas, just that they receive them. But you don't think of that!
That was one of the problems with my code. But the other problem was even more embarrassing.
Suppose you have small, medium, and large pants, and you have small, medium, and large drawers. You need to be able to find particular pants when you want them; it's kind of important for getting dressed. To that end, you've decided that all pants of a kind should be in a certain drawer together. Which drawer? It doesn't matter, of course. You simply need a piece of paper taped to the wall that says, for instance:
SMALL PANTS IN LARGE DRAWER
MEDIUM PANTS IN SMALL DRAWER
LARGE PANTS IN MEDIUM DRAWER
Of course, it might also be the case that your paper on the wall says
SMALL PANTS IN SMALL DRAWER
MEDIUM PANTS IN MEDIUM DRAWER
LARGE PANTS IN LARGE DRAWER
In fact, it may even be that that's always what the paper on the wall says. But you still have to check the piece of paper every time you go get some pants, because you don't know that! And every time you buy a new pair of pants and put it away, you painstakingly write out a new sheet of paper. It always says exactly what the old one says, but never mind that; you still have to do it!
I hope you're beginning to get an idea of how bad my code was. It kept a bunch of information it didn't need, which it then had to spend a lot of time maintaining; it did a lot of other unnecessary computation, too.
The particular problem I was trying to solve, by the way, had to do with calculating something called the lexical address1 of a variable. I am now going to explain lexical address, and what's more, you're going to like it, whether you like it or not.
Say you have a variable -- let's call it
smallpants. Variables can be free or bound. We'll come back to that, but for now, unless stated otherwise, let's assume that they're free -- so, at the moment,
smallpants is a free variable. You also have these things called binders, and binders are, oddly enough, things that can bind variables.
Now, variables are one kind of expression that you have. In addition to those, you also have
lambda expressions. As it turns out,
lambda expressions are binders. A
lambda expression that binds the variable
smallpants, for instance, would look like this:
(lambda (smallpants) . It means that
smallpants is bound in the area indicated by
arbitrary-stuff can be any old expression, including another
lambda expression.) But if an occurrence of
smallpants ever actually shows up in
arbitrary-stuff, we say that that particular occurrence of
smallpants occurs bound.
Here's an example. If you've got the expression
(lambda (smallpants) smallpants)
smallpants on the second line is a bound occurrence of
smallpants, because it occurs inside a
lambda expression that binds
smallpants. You could also have, say,
(lambda (smallpants) (lambda (largepants) smallpants))
lambda expressions are allowed inside other
lambda expressions, this is perfectly legal. And once again, we've got a bound occurrence of
smallpants. It's at a different level of nesting than the earlier one, but that doesn't matter; it's still a bound occurrence.
Now suppose we have, say, the expression
(lambda (largepants) smallpants)
Now we don't have any bound occurrences at all. The
lambda expression binds
largepants doesn't actually occur in the scope of the
lambda expression that binds it. And
smallpants does occur, but there's no
lambda (smallpants) binding it at any level, so we say that it occurs free.
Easiest thing ever, right? Still with me? Okay. Lexical address works like this: it's a way to talk about the level at which a variable is bound. If you've got an occurrence of a variable (let's call it
orangepants), then the lexical address of that occurrence is the number of inner
lambda expressions that are in scope between the place where
orangepants is bound by a
lambda expression and the place where
orangepants occurs. Another way to think of it is the number of times you have to jump over the word "
lambda" to get to the occurrence of the variable.
Let's illustrate this with an example. Here's a bigger expression:
(lambda (orangepants) (lambda (redpants) (lambda (shinypants) (lambda (purplepants) (lambda (greenpants) orangepants)))))
orangepants at the very bottom is a bound occurrence. It has a lexical address of 4, because there are four
lambdas in scope between the binding
lambda (orangepants) at the top and the occurrence of
orangepants at the bottom.
So, that's lexical address! And the only other thing you need to know about lexical address is that it's based on where a variable is most recently bound. So, if you've got
(lambda (orangepants) (lambda (redpants) (lambda (shinypants) (lambda (orangepants) (lambda (greenpants) orangepants)))))
then the lexical address of
orangepants is 1, because the inner
lambda (orangepants) overrides the outer one. Programmers sometimes refer to this as "shadowing" or "capturing" the outer
orangepants. People often do it unintentionally, leading to their code behaving in unexpected ways. Indeed, many bugs are caused by accidentally getting one's pants captured. But that's not the problem we're discussing today. My code wasn't wrong -- it was merely bad.
My program was trying to determine the lexical address of each variable in an expression. It did this by walking through the expression, keeping a running list of every variable it saw being bound by a
lambda expression. After seeing
orangepants bound at the top of the expression above, my program would create a list that looked like this:
((orangepants . 0))
Every time my program jumped over another
lambda, it would add an entry to the list for the variable bound by that
lambda, as well as updating all the other entries. So, the next time this list got updated while processing the expression above, it would look like this:
((redpants . 0) (orangepants . 1))
If it ran across a variable that was already in the list -- a variable that was being shadowed -- it would remove that variable's new entry and stick a new entry with a lexical address of 0 onto the current front of the list, as well as updating all the rest of the entries. So, by the time it finally got to the bottom of the above expression, the list would be
((greenpants . 0) (orangepants . 1) (shinypants . 2) (redpants . 3))
Of course, I wasn't examining the list myself at each stage of my program. The program itself dealt with this list, and the end result was correct --
orangepants ended up with a lexical address of 1, just like it was supposed to have.
But, as should now be glaringly apparent from looking at the list, this is quite a lot like having a piece of paper taped to your wall that says "SMALL PANTS IN SMALL DRAWER MEDIUM PANTS IN MEDIUM DRAWER LARGE PANTS IN LARGE DRAWER". As a result of the way the list was built, the lexical address of each variable ends up being its position in the list. Which means I could have had a list of
(greenpants orangepants shinypants redpants)
and it would have been easier to deal with and shorter and would have contained exactly the same information.
That's kind of slick, really. But I didn't think of it. Ramana Kumar, who happened to be grading my homework that week, did. He also thought of another optimization: instead of first checking whether a variable appears in the list at all (to determine whether the variable is free or bound) and then calculating the lexical address in the event that it's bound, you can just do both in one step, which saves you having to run down the list twice. This is kind of like making one trip instead of three trips to deliver pizzas to each of your neighbors' houses.
When I look back at my code, it really makes me cringe. Not only is it doing all kinds of unnecessary stuff, it's also long and unwieldy and just cumbersome as all hell. It's just bad, you guys. But! As I said, all of this happened early last October. At the time, after Ramana showed me how he would have done it, I wrote an email to Alex oniugnip.
From: Lindsey Kuper
To: Alex Rudnick
Date: Thu, Oct 2, 2008 at 12:07 AM
Subject: Fwd: using a count for lex
Ramana's so badass. His solution to this problem is so much more elegant than mine -- less code *and* less computation.
It surprises me how strongly I'm reacting to this. I'm used to being around people who are better programmers than I am, but I'm not used to being around people who are better programmers than I am *in my native language*. If I can pick up some good habits from these people, it will be wonderful.
So here we are, six months later, right? It's April. A Friday. I'm working on my compiler. A compiler is something that processes expressions and turns them into other expressions by doing transformations on them in a series of passes.
I read the description of one of the compiler passes I'm supposed to write this week. A couple of sentences go like this:
Each reference to a free variable is replaced by a call to
procedure-ref. The index is determined by the ordering of free variables in the enclosing
Bells, whistles, fireworks, giant flashing lights!
I go to my email and dig up the little five-line procedure that Ramana showed me in October. It's immediately apparent how to tweak it to do what I need it to do for my compiler. In fact, it's so general that I barely have to tweak it at all. All the rest of the code, I write from scratch. I finish the pass and run it through the tests, and it passes all 329 of them. What's more, I think it's a nice piece of code -- certainly nicer than anything I was capable of writing back in October. I'm sitting here now with these two files open -- one with the lexical-address thing I did six months ago, which I struggled with for days and had to ask for help with just to get correct, let alone anything like good -- and one with this compiler pass I wrote yesterday in two hours -- and the contrast is astounding.
But the best part is that when I read that email I wrote in October and saw that whole "if I can pick up some good habits from these people, it will be wonderful" thing, I realized that I did exactly that. I picked up a specific good habit and used it. That means I'm learning. I'm learning! I'm better at this than I was six months ago! It's so exciting. I can't wait to see what happens in another six months. I'm just getting started.
;; introduce-procedure-primitives: create closures, allocate them on ;; the heap, and introduce new primitives for accessing them. ;; The two things that we have to generate for each closure are: a ;; binding with a label and a call to make-procedure; and a set ;; (possibly empty) of procedure-set! calls. So, each call to the ;; closure helper returns these two values. ;; The Uvar helper is inspired by a suggestion made by Ramana Kumar, ;; who pointed out (when I was working on a similar problem in B521 ;; last year) that it's inefficient to call memq on a list if we're ;; also going to use an index helper to run down the list anyway. So, ;; we just write a helper to do both at once. (define-who introduce-procedure-primitives (define Uvar (lambda (x cp env) (let loop ([n 0] [env env]) (cond [(null? env) x] [(eq? (car env) x) `(procedure-ref ,cp (quote ,n))] [else (loop (+ n 1) (cdr env))])))) (define Lambda (lambda (x) (match x [(lambda (,cp ,x* ...) (bind-free (,cp ,fx* ...) ,body)) `(lambda (,cp ,x* ...) ,((Expr cp fx*) body))] [,otherwise (error who "unexpected Lambda ~s" otherwise)]))) (define closure (lambda (cp free*) (lambda (x) (match x [[,uvar ,label ,fx* ...] (let* ([fx* (map (Expr cp free*) fx*)] [length (length fx*)] [index* (iota length)] [uvar* (make-list length uvar)] [proc-set* (if (zero? length) '() `((procedure-set! ,uvar* (quote ,index*) ,fx*) ...))]) (values `[,uvar (make-procedure ,label (quote ,length))] proc-set*))] [,otherwise (error who "unexpected closure ~s" otherwise)])))) (define Expr (lambda (cp free*) (lambda (x) (match x [,uvar (guard (uvar? uvar)) (Uvar uvar cp free*)] [(quote ,imm) `(quote ,imm)] [(if ,[test] ,[conseq] ,[alt]) `(if ,test ,conseq ,alt)] [(begin ,[expr*] ... ,[expr]) (make-begin `(,expr* ... ,expr))] [(let ([,uvar* ,[expr*]] ...) ,[expr]) `(let ([,uvar* ,expr*] ...) ,expr)] [(letrec ([,label* ,[Lambda -> lambda*]] ...) (closures (,[(closure cp free*) -> binding* proc-set**] ...) ,[expr])) `(letrec ([,label* ,lambda*] ...) (let (,binding* ...) (begin ,proc-set** ... ... ,expr)))] [(,prim ,[expr*] ...) (guard (prim? prim)) `(,prim ,expr* ...)] [(,[expr] ,[expr*] ...) `((procedure-code ,expr) ,expr* ...)] [,otherwise (error who "unexpected Expr ~s" otherwise)])))) (lambda (x) ((Expr 'dummy-cp '()) x)))
- Actually, the de Bruijn index. But we don't talk about that because it's scarier-sounding than "lexical address".