Invertible parsing and pretty-printing [Feb. 1st, 2011|09:38 pm]
Lindsey Kuper

At my job last summer, I'd occasionally hear people say that pretty-printing is the opposite of parsing. They were probably referring specifically to the opposite of parsing-assembly-instructions-to-ASTs, but they could have meant parsing in the more general sense of going from concrete syntax to abstract syntax.

I'd forgotten about the opposite-of-parsing thing until today, when I noticed this paper that just turned up on Lambda the Ultimate. The paper points out that parsing and pretty-printing aren't quite inverses, because several concrete representations can map onto the same abstract representation, but you typically don't want your pretty-printer to spit out all the possible concrete representations, so you have to pick one (the "nicest", whatever that means).[1] But the paper's clever idea is that even though they aren't inverses, we don't have to throw up our hands in defeat and implement a language's parser and pretty-printer entirely separately; instead, we can just implement one thing -- which the paper calls a "syntax description" -- and then put a different interface on it to get a parser or pretty-printer as desired.
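The one-description-two-interfaces idea can be sketched in a few lines of Python. This is only a toy illustration, not the paper's partial-isomorphism combinators; the names `DESC`, `parse`, and `pretty` are made up for the sketch:

```python
import re

# A "syntax description" here is just an ordered list of pieces: either a
# literal string, or a named field with a regex plus decode/encode functions.
# The single DESC value drives both directions.
DESC = [
    ("lit", "("),
    ("field", "x", r"-?\d+", int, str),
    ("lit", ", "),
    ("field", "y", r"-?\d+", int, str),
    ("lit", ")"),
]

def parse(desc, text):
    """Concrete syntax -> abstract value, by walking the description."""
    out, pos = {}, 0
    for piece in desc:
        if piece[0] == "lit":
            lit = piece[1]
            if not text.startswith(lit, pos):
                raise ValueError(f"expected {lit!r} at position {pos}")
            pos += len(lit)
        else:
            _, name, pat, decode, _ = piece
            m = re.compile(pat).match(text, pos)
            if not m:
                raise ValueError(f"bad {name} at position {pos}")
            out[name] = decode(m.group())
            pos = m.end()
    return out

def pretty(desc, value):
    """Abstract value -> one chosen concrete syntax, same description."""
    parts = []
    for piece in desc:
        if piece[0] == "lit":
            parts.append(piece[1])
        else:
            _, name, _, _, encode = piece
            parts.append(encode(value[name]))
    return "".join(parts)
```

Note how this toy also exhibits the not-quite-inverse point: `pretty` always emits exactly one spacing convention, even though a more forgiving `parse` could accept several.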

For the world's five logic programmers, who would read that and say, "Oh, yeah, sure, I do that sometimes; I call it programming," the authors get in a special dig; in section 4.3, they write, "in contrast to logic programming[, our model of] invertibility can actually be made to work in practice." Burn!

  [1] I guess this would mean that the parser and pretty-printer for a language together comprise a Galois connection. But, as so often, I'm not sure what knowing that something is a Galois connection is good for.

From: simrob
2011-02-02 02:48 am (UTC)
Ha ha ha ha (*sob*)
From: lindseykuper
2011-02-02 03:04 am (UTC)
There, there. They're ignoring the fact that not every logic programming system uses plain ol' DFS, and not every logic program is impure. I bet there are fully half a dozen that aren't!

From: gwillen
2011-02-02 04:13 am (UTC)
I am kind of curious now how miniKanren works for that hack. The way it looks to me is like you are defining something that looks a lot like a predicate, and then applying some magic which allows you to solve for one argument of the predicate given the other. But I assume your definition is actually doing something clever so that the magic doesn't have to do any searching, but can sort of unwind the one structure and pattern-match it into the other structure as it descends the recursion....
From: gwillen
2011-02-02 04:14 am (UTC)
(To be clear, I have never seen miniKanren before; when I say "for that hack" all I mean is "I am not asking you to explain the whole language, just to give me some idea what the subset of the language you are using is doing.")
From: lindseykuper
2011-02-02 04:49 pm (UTC)
So, I'm not exactly sure what your question is; it almost sounds like you're asking me to explain unification in general, or maybe pattern-matching in general. But there are much better sources than me for understanding those things, so instead I'll just try giving a high-level overview of what happens with my code when we run the goal (zipo '((1 3 5) (2 4 6)) q), where q is a fresh logic variable.

In the first clause of the matche expression, we try to unify the list (() ()) with ((1 3 5) (2 4 6)), which fails, so we fall through to the next clause. There we try to unify ((X . Xs) (Y . Ys)) with ((1 3 5) (2 4 6)), which works fine since X, Xs, Y, and Ys are all fresh logic variables. In particular, we unify 1 with X, (3 5) with Xs, 2 with Y, and (4 6) with Ys.

We also unify q with ((X Y) . XYs), which, since we've already got associations for X and Y, means that we're actually unifying q with ((1 2) . XYs) where the variable XYs is still fresh. Finally, we make a recursive call to zipo with the arguments ((3 5) (4 6)) and XYs. Lather, rinse, repeat. Eventually we reach the base case and zipo's second argument gets unified with ().
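The unification steps in that walkthrough can be reproduced with a tiny substitution-based unifier. This is a Python sketch with names of my own choosing (`Var`, `walk`, `unify`), not miniKanren itself; Scheme lists are encoded as nested `(head, tail)` tuples, with `()` as the empty list:

```python
class Var:
    """A logic variable, compared by identity, like a fresh miniKanren variable."""
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name

def walk(t, s):
    # Follow a variable's bindings in substitution s until we reach
    # a non-variable or an unbound (fresh) variable.
    while isinstance(t, Var) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    # Return an extended substitution if a and b unify under s, else None.
    a, b = walk(a, s), walk(b, s)
    if isinstance(a, Var):
        return s if a is b else {**s, a: b}
    if isinstance(b, Var):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b) == 2:
        s = unify(a[0], b[0], s)
        return None if s is None else unify(a[1], b[1], s)
    return s if a == b else None

def walk_star(t, s):
    # Deep-resolve a term: walk every subterm.
    t = walk(t, s)
    if isinstance(t, tuple) and len(t) == 2:
        return (walk_star(t[0], s), walk_star(t[1], s))
    return t

def from_list(xs):
    # Scheme-style list as nested cons pairs: [1, 3, 5] -> (1, (3, (5, ()))).
    out = ()
    for x in reversed(xs):
        out = (x, out)
    return out

X, Xs, Y, Ys, XYs, q = (Var(n) for n in ["X", "Xs", "Y", "Ys", "XYs", "q"])

# Second clause: unify ((X . Xs) (Y . Ys)) with ((1 3 5) (2 4 6)).
s = unify(((X, Xs), ((Y, Ys), ())),
          (from_list([1, 3, 5]), (from_list([2, 4, 6]), ())),
          {})

# Then unify q with ((X Y) . XYs); X and Y are already bound to 1 and 2,
# so q ends up associated with ((1 2) . XYs), XYs still fresh.
s = unify(q, ((X, (Y, ())), XYs), s)
```

Deep-resolving `q` against the final substitution shows exactly the `((1 2) . XYs)` structure described above, with `XYs` left fresh until the recursive call fills it in.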

Again, I'm not sure if this was really the explanation you were looking for. If you're familiar with, say, Prolog, none of this should be very surprising. (For comparison, Chris did it in Prolog here.)
From: gwillen
2011-02-02 09:48 pm (UTC)
I am slightly but not extensively familiar with Prolog. Really, knowing that it is doing unification under the covers is enough to satisfy my curiosity. It just looks, syntactically, a lot like a predicate function written in ordinary Scheme, so I was wondering how much magic was being applied and where.
(Deleted comment)
From: (Anonymous)
2011-02-02 08:12 pm (UTC)


Hahaha. But that's not nice -- logic programming is an easy target.
From: lindseykuper
2011-02-04 06:52 am (UTC)

Re: Carlo

Oh gawd, I just felt strongly enough about this to leave my first-ever comment on LtU, and the comment is along the lines of "I am optimistic about logic programming!"

This could be the beginning of a long and ultimately disappointing journey!