Log in

No account? Create an account
Lindsey Kuper [entries|archive|friends|userinfo]
Lindsey Kuper

[ website | composition.al ]
[ userinfo | livejournal userinfo ]
[ archive | journal archive ]

Pizzas, pants, my compiler, and me [Apr. 11th, 2009|07:15 pm]
Lindsey Kuper
[Tags|, ]

When I was taking B521 last October, there was a particular programming exercise I had to do for a homework assignment. After a long struggle, I made my code work, but it was bad. Let me tell you all about how bad it was!

Suppose you have to deliver a small, a medium, and a large pizza to each house in your neighborhood. All of the pizzas -- all the small, all the medium, and all the large -- will fit in your pizza-delivery vehicle. But you start out with only the small pizzas. You stop at each house and deliver a small pizza. Then you go and get the medium pizzas, and repeat. Finally, you deliver a large pizza to each house.

Of course, the more efficient way to go about it would be to deliver the small, medium, and large pizzas to each house at the same time. After all, it doesn't matter in which order they receive the pizzas, just that they receive them. But you don't think of that!

That was one of the problems with my code. But the other problem was even more embarrassing.

Suppose you have small, medium, and large pants, and you have small, medium, and large drawers. You need to be able to find particular pants when you want them; it's kind of important for getting dressed. To that end, you've decided that all pants of a kind should be in a certain drawer together. Which drawer? It doesn't matter, of course. You simply need a piece of paper taped to the wall that says, for instance:


Of course, it might also be the case that your paper on the wall says


In fact, it may even be that that's always what the paper on the wall says. But you still have to check the piece of paper every time you go get some pants, because you don't know that! And every time you buy a new pair of pants and put it away, you painstakingly write out a new sheet of paper. It always says exactly what the old one says, but never mind that; you still have to do it!

I hope you're beginning to get an idea of how bad my code was. It kept a bunch of information it didn't need, which it then had to spend a lot of time maintaining; it did a lot of other unnecessary computation, too.

The particular problem I was trying to solve, by the way, had to do with calculating something called the lexical address1 of a variable. I am now going to explain lexical address, and what's more, you're going to like it, whether you like it or not.

Say you have a variable -- let's call it smallpants. Variables can be free or bound. We'll come back to that, but for now, unless stated otherwise, let's assume that they're free -- so, at the moment, smallpants is a free variable. You also have these things called binders, and binders are, oddly enough, things that can bind variables.

Now, variables are one kind of expression that you have. In addition to those, you also have lambda expressions. As it turns out, lambda expressions are binders. A lambda expression that binds the variable smallpants, for instance, would look like this: (lambda (smallpants) arbitrary-stuff). It means that smallpants is bound in the area indicated by arbitrary-stuff. (arbitrary-stuff can be any old expression, including another lambda expression.) But if an occurrence of smallpants ever actually shows up in arbitrary-stuff, we say that that particular occurrence of smallpants occurs bound.

Here's an example. If you've got the expression

(lambda (smallpants)

then the smallpants on the second line is a bound occurrence of smallpants, because it occurs inside a lambda expression that binds smallpants. You could also have, say,

(lambda (smallpants)
  (lambda (largepants)

Since lambda expressions are allowed inside other lambda expressions, this is perfectly legal. And once again, we've got a bound occurrence of smallpants. It's at a different level of nesting than the earlier one, but that doesn't matter; it's still a bound occurrence.

Now suppose we have, say, the expression

(lambda (largepants)

Now we don't have any bound occurrences at all. The lambda expression binds largepants, but largepants doesn't actually occur in the scope of the lambda expression that binds it. And smallpants does occur, but there's no lambda (smallpants) binding it at any level, so we say that it occurs free.

Easiest thing ever, right? Still with me? Okay. Lexical address works like this: it's a way to talk about the level at which a variable is bound. If you've got an occurrence of a variable (let's call it orangepants), then the lexical address of that occurrence is the number of inner lambda expressions that are in scope between the place where orangepants is bound by a lambda expression and the place where orangepants occurs. Another way to think of it is the number of times you have to jump over the word "lambda" to get to the occurrence of the variable.

Let's illustrate this with an example. Here's a bigger expression:

(lambda (orangepants)
  (lambda (redpants)
    (lambda (shinypants)
      (lambda (purplepants)
        (lambda (greenpants)

The orangepants at the very bottom is a bound occurrence. It has a lexical address of 4, because there are four lambdas in scope between the binding lambda (orangepants) at the top and the occurrence of orangepants at the bottom.

So, that's lexical address! And the only other thing you need to know about lexical address is that it's based on where a variable is most recently bound. So, if you've got

(lambda (orangepants)
  (lambda (redpants)
    (lambda (shinypants)
      (lambda (orangepants)
        (lambda (greenpants)

then the lexical address of orangepants is 1, because the inner lambda (orangepants) overrides the outer one. Programmers sometimes refer to this as "shadowing" or "capturing" the outer orangepants. People often do it unintentionally, leading to their code behaving in unexpected ways. Indeed, many bugs are caused by accidentally getting one's pants captured. But that's not the problem we're discussing today. My code wasn't wrong -- it was merely bad.

My program was trying to determine the lexical address of each variable in an expression. It did this by walking through the expression, keeping a running list of every variable it saw being bound by a lambda expression. After seeing orangepants bound at the top of the expression above, my program would create a list that looked like this:

((orangepants . 0))

Every time my program jumped over another lambda, it would add an entry to the list for the variable bound by that lambda, as well as updating all the other entries. So, the next time this list got updated while processing the expression above, it would look like this:

((redpants . 0) (orangepants . 1))

If it ran across a variable that was already in the list -- a variable that was being shadowed -- it would remove that variable's new entry and stick a new entry with a lexical address of 0 onto the current front of the list, as well as updating all the rest of the entries. So, by the time it finally got to the bottom of the above expression, the list would be

((greenpants . 0) (orangepants . 1) (shinypants . 2) (redpants . 3))

Of course, I wasn't examining the list myself at each stage of my program. The program itself dealt with this list, and the end result was correct -- orangepants ended up with a lexical address of 1, just like it was supposed to have.

But, as should now be glaringly apparent from looking at the list, this is quite a lot like having a piece of paper taped to your wall that says "SMALL PANTS IN SMALL DRAWER MEDIUM PANTS IN MEDIUM DRAWER LARGE PANTS IN LARGE DRAWER". As a result of the way the list was built, the lexical address of each variable ends up being its position in the list. Which means I could have had a list of

(greenpants orangepants shinypants redpants)

and it would have been easier to deal with and shorter and would have contained exactly the same information.

That's kind of slick, really. But I didn't think of it. Ramana Kumar, who happened to be grading my homework that week, did. He also thought of another optimization: instead of first checking whether a variable appears in the list at all (to determine whether the variable is free or bound) and then calculating the lexical address in the event that it's bound, you can just do both in one step, which saves you having to run down the list twice. This is kind of like making one trip instead of three trips to deliver pizzas to each of your neighbors' houses.

When I look back at my code, it really makes me cringe. Not only is it doing all kinds of unnecessary stuff, it's also long and unwieldy and just cumbersome as all hell. It's just bad, you guys. But! As I said, all of this happened early last October. At the time, after Ramana showed me how he would have done it, I wrote an email to Alex oniugnip.

From: Lindsey Kuper
To: Alex Rudnick
Date: Thu, Oct 2, 2008 at 12:07 AM
Subject: Fwd: using a count for lex

Ramana's so badass. His solution to this problem is so much more elegant than mine -- less code *and* less computation.

It surprises me how strongly I'm reacting to this. I'm used to being around people who are better programmers than I am, but I'm not used to being around people who are better programmers than I am *in my native language*. If I can pick up some good habits from these people, it will be wonderful.

So here we are, six months later, right? It's April. A Friday. I'm working on my compiler. A compiler is something that processes expressions and turns them into other expressions by doing transformations on them in a series of passes.

I read the description of one of the compiler passes I'm supposed to write this week. A couple of sentences go like this:

Each reference to a free variable is replaced by a call to procedure-ref. The index is determined by the ordering of free variables in the enclosing bind-free form.

Bells, whistles, fireworks, giant flashing lights!

I go to my email and dig up the little five-line procedure that Ramana showed me in October. It's immediately apparent how to tweak it to do what I need it to do for my compiler. In fact, it's so general that I barely have to tweak it at all. All the rest of the code, I write from scratch. I finish the pass and run it through the tests, and it passes all 329 of them. What's more, I think it's a nice piece of code -- certainly nicer than anything I was capable of writing back in October. I'm sitting here now with these two files open -- one with the lexical-address thing I did six months ago, which I struggled with for days and had to ask for help with just to get correct, let alone anything like good -- and one with this compiler pass I wrote yesterday in two hours -- and the contrast is astounding.

But the best part is that when I read that email I wrote in October and saw that whole "if I can pick up some good habits from these people, it will be wonderful" thing, I realized that I did exactly that. I picked up a specific good habit and used it. That means I'm learning. I'm learning! I'm better at this than I was six months ago! It's so exciting. I can't wait to see what happens in another six months. I'm just getting started.

;; introduce-procedure-primitives: create closures, allocate them on
;; the heap, and introduce new primitives for accessing them.

;; The two things that we have to generate for each closure are: a
;; binding with a label and a call to make-procedure; and a set
;; (possibly empty) of procedure-set! calls.  So, each call to the
;; closure helper returns these two values.

;; The Uvar helper is inspired by a suggestion made by Ramana Kumar,
;; who pointed out (when I was working on a similar problem in B521
;; last year) that it's inefficient to call memq on a list if we're
;; also going to use an index helper to run down the list anyway.  So,
;; we just write a helper to do both at once.

(define-who introduce-procedure-primitives
 (define Uvar
  (lambda (x cp env)
    (let loop ([n 0]
               [env env])
      (cond [(null? env) x]
            [(eq? (car env) x) `(procedure-ref ,cp (quote ,n))]
            [else (loop (+ n 1) (cdr env))]))))
 (define Lambda
  (lambda (x)
    (match x
      [(lambda (,cp ,x* ...)
         (bind-free (,cp ,fx* ...)
       `(lambda (,cp ,x* ...)
          ,((Expr cp fx*) body))]
      [,otherwise (error who "unexpected Lambda ~s" otherwise)])))
 (define closure
  (lambda (cp free*)
    (lambda (x)
      (match x
        [[,uvar ,label ,fx* ...]
         (let* ([fx* (map (Expr cp free*) fx*)]
                [length (length fx*)]
                [index* (iota length)]
                [uvar* (make-list length uvar)]
                 (if (zero? length)
                     `((procedure-set! ,uvar* (quote ,index*) ,fx*) ...))])
            `[,uvar (make-procedure ,label (quote ,length))]
        [,otherwise (error who "unexpected closure ~s" otherwise)]))))
 (define Expr
  (lambda (cp free*)
    (lambda (x)
      (match x
        [,uvar (guard (uvar? uvar))
               (Uvar uvar cp free*)]
        [(quote ,imm) `(quote ,imm)]
        [(if ,[test] ,[conseq] ,[alt])
         `(if ,test ,conseq ,alt)]
        [(begin ,[expr*] ... ,[expr])
         (make-begin `(,expr* ... ,expr))]
        [(let ([,uvar* ,[expr*]] ...) ,[expr])
         `(let ([,uvar* ,expr*] ...) ,expr)]
        [(letrec ([,label* ,[Lambda -> lambda*]] ...)
           (closures (,[(closure cp free*) -> binding* proc-set**] ...)
         `(letrec ([,label* ,lambda*] ...)
            (let (,binding* ...)
                ,proc-set** ... ...
        [(,prim ,[expr*] ...)
         (guard (prim? prim))
         `(,prim ,expr* ...)]
        [(,[expr] ,[expr*] ...)
         `((procedure-code ,expr) ,expr* ...)]
        [,otherwise (error who "unexpected Expr ~s" otherwise)]))))
 (lambda (x) ((Expr 'dummy-cp '()) x)))

  1. Actually, the de Bruijn index. But we don't talk about that because it's scarier-sounding than "lexical address".

[User Picture]From: stereotype441
2009-04-12 06:18 am (UTC)

pants pants pants

I am now going to explain lexical address, and what's more, you're going to like it, whether you like it or not.

I do not like lexical addresses, Sam I Am. But I do like pants.

As a result of the way the list was built, the lexical address of each variable ends up being its position in the list.

I don't know if it matters, but aren't there exceptions to this when you remove entries from the list because variables are shadowed? For example, if I'm understanding correctly, this expression
(lambda (hotpants)
  (lambda (nopants)
    (lambda (lisztofpants)
      (lambda (nopants)
Should produce the resulting list
((nopants . 0) (lisztofpants . 1) (hotpants . 3))

Anyway, this was really fascinating, Lindsey. And I'm really excited to hear how much cool stuff you are learning. Thanks for sharing it :)
(Reply) (Thread)
[User Picture]From: lindseykuper
2009-04-12 01:41 pm (UTC)

Re: pants pants pants

True! But in that case, the new style of list would be

(nopants lisztofpants nopants hotpants)

because in the new algorithm, I no longer bother to search the list for old entries and remove them if they match the one I'm sticking on the front. So there are repeats -- but if I want to look up someone's lexical address, I just look for its first occurrence in the list.

By the way, this only works if you're binding exactly one variable with each binding form. But I had a guarantee of that.
(Reply) (Parent) (Thread)
From: boojum
2009-04-14 03:07 pm (UTC)

Re: pants pants pants

For multiple variable binding, could you do something like:
(lambda (a) (lambda (b c) (stuff)))
turns into (a (b c)), with a more complicated search function?

(It seems like every post of yours makes me wist after more school. Theoretical learning and problem solving is so nifty.)
(Reply) (Parent) (Thread)
[User Picture]From: lindseykuper
2009-04-24 12:07 am (UTC)

I didn't know 'wist' was a word.

Yeah, you could probably do that. Or you could just autocurry everything first! Whee!
(Reply) (Parent) (Thread)
[User Picture]From: pixelherder
2009-04-13 04:37 am (UTC)
If I can pick up some good habits from these people, it will be wonderful.

I've been re-reading Jon Bentley's Programming Pearls and More Programming Pearls. They're old books -- the material in them is almost 30 years old now. But for exactly that reason, they provide some neat examples of these sorts of tricks. They're from when people really did have to be clever about their algorithms in order to squeeze more out of limited memory and computational resources. (Column 8 of the first book still amazes me.)

Anyway, it might be worth checking your university library for a copy of them. They're both slim books and easy to dip into.

(Reply) (Thread)
[User Picture]From: lindseykuper
2009-04-13 07:00 am (UTC)
Ooh, yeah -- Programming Pearls has been on my list for a long time.

It's funny how going to school changes one's values. I used to be reluctant to dig into that book because it sounded like work. Now, it's sounding like a nice break.
(Reply) (Parent) (Thread)
[User Picture]From: jes5199
2009-04-14 03:06 am (UTC)
I'd like to report that these aren't total Scheme-isms. I've seen code that calls Array#includes? and immediately follows it with Array#find. I call that style superstitious coding, it's as if the programmer doesn't trust the second call to fail correctly in the negative case. (Seriously, though, "memq" ? Were scheme's designers competing in the Obfuscated C Contest?)

Every time you talk about "free variables" i get a little crosseyed though. When is a free variable not an undefined variable?

but speaking of inefficiencies, the the GNU Linear Programming Kit has a section of code that makes a deep copy of an immutable data structure for every iteration. When you do a simple map, it squares the allocated storage for the process. That means, for my data set, I can't even run the damn thing on a machine that has less than 6 GIGS of RAM.
(Reply) (Thread)
[User Picture]From: lindseykuper
2009-04-14 04:40 pm (UTC)
Well, when we say that something is free or bound, we mean that it's free or bound in a particular scope. If the scope is obvious from, uh, context, then we don't mention it, but it's still there. An expression with no free variables is called "closed". And a closed lambda-calculus expression is more commonly known by its other name, "computer program".

"memq" is short for "member eq?" There's also memv and member for when you want to use eqv? and equal? as the equivalence test, respectively. The really subtle part is understanding when to use eq?, eqv?, equal?, or =.
(Reply) (Parent) (Thread)