No account? Create an account
 Equivalent unzippers - Lindsey Kuper [entries|archive|friends|userinfo]
Lindsey Kuper

 [ website | composition.al ] [ userinfo | livejournal userinfo ] [ archive | journal archive ]

Equivalent unzippers [Oct. 21st, 2010|11:50 pm]
Lindsey Kuper
 [ Tags | programming ]

In an effort to become more familiar with Coq, I've been working my way through the Software Foundations book. So far, it's been more about proving things with Coq than it's been about programming with it (although the distinction is blurry). Yesterday, though, I got to this problem. As the text explains, `combine`, better known as `zip`, is a function that takes two lists and combines them into a list of pairs. (For instance, given `[0, 1, 3, 18]` and `[true, false, false, true]`, it takes those two lists and "zips" them together, producing `[(0, true), (1, false), (3, false), (18, true)]`.)

The problem asks you to do the inverse: given the list of pairs, unzip it to produce the pair of lists. At first, the only way I could think of to do it was with `map`:

```Fixpoint map {X Y: Type} (fn : X -> Y) (lx : list X) : list Y :=
match lx with
| [] => []
| x::tx => fn x :: map fn tx
end.

Fixpoint split {X Y : Type} (lxy : list (X*Y))
: (list X * list Y) :=
(map fst lxy, map snd lxy).```

Later, I realized that I could do it in one pass provided I have a couple of accumulator arguments:

```Fixpoint split' {X Y : Type} (lxy : list (X*Y))
: (list X * list Y) :=
let fix f (res1 : list X) (res2 : list Y) (lxy : list (X*Y)) :=
match lxy with
| [] => (res1, res2)
| xy::txy => f (res1 ++ [(fst xy)]) (res2 ++ [(snd xy)]) txy
end
in f [] [] lxy.```

So, that was fun. But the larger point is that since I'm doing it in Coq, pretty soon I'll be able to do a machine-assisted proof that `split` and `split'` are equivalent, not to mention a proof that `split` (or `split'`) actually is the inverse of `combine`. That will be even better.

(Deleted comment)
 From: 2010-10-22 01:58 pm (UTC) (Link)

Because Chris showed-and-telled, I wrote it in Agda as well. Most of it is boilerplate because I 1) perversely insisted on using my own library and 2) I lazily haven't added lists to that library yet.

```module Kuper where

{- Bolierplate -}

open import Prelude

data List (A : Set) : Set where
[] : List A
_::_ : A → List A → List A

[_] : ∀{A} → A → List A
[ x ] = x :: []

_++_ : ∀{A} → List A -> List A -> List A
[] ++ ys = ys
(x :: xs) ++ ys = x :: (xs ++ ys)

append-cong2 : ∀{A} {a : A} {as bs : List A}
→ as ≡ bs
→ (a :: as) ≡ (a :: bs)
append-cong2 Refl = Refl

append-nil : ∀{A} {as : List A} → as ++ [] ≡ as
append-nil {as = []} = Refl
append-nil {as = a :: as} = append-cong2 (append-nil {as = as})

assoc-append : ∀{A} {as bs cs : List A}
→ (as ++ (bs ++ cs)) ≡ ((as ++ bs) ++ cs)
assoc-append {as = []} = Refl
assoc-append {as = a :: as} = append-cong2 (assoc-append {as = as})

map : ∀{A B} → (A → B) → List A → List B
map f [] = []
map f (x :: xs) = f x :: map f xs

{- Functions -}

split : ∀{A B} → List (A × B) → List A × List B
split lxy = (map fst lxy , map snd lxy)

split' : ∀{A B} → List (A × B) → List A → List B → List A × List B
split' [] xs ys = (xs , ys)
split' ((x , y) :: lxy) xs ys = split' lxy (xs ++ [ x ]) (ys ++ [ y ])

{- Equivalence -}

split-eq-lem : ∀{A B} (lx : List A) (ly : List B) (lxy : List (A × B))
→ (lx ++ map fst lxy , ly ++ map snd lxy) ≡ split' lxy lx ly
split-eq-lem lx ly [] = Product.pair-cong append-nil append-nil
split-eq-lem lx ly ((x , y) :: lxy) =
Product.pair-cong (assoc-append {as = lx}) (assoc-append {as = ly})
≡≡ split-eq-lem (lx ++ [ x ]) (ly ++ [ y ]) lxy

split-eq : ∀{A B} (lxy : List (A × B)) → split lxy ≡ split' lxy [] []
split-eq lxy = split-eq-lem [] [] lxy
```

Question: the most natural way to express this function, in Agda, was neither of your options. Is the following definition naturally expressible in Coq? This comparison example has taught me at least as much about Coq as I think I knew before, which is to say I know very little.

```  splitA : ∀{A B} → List (A × B) → List A × List B
splitA [] = ([] , [])
splitA ((x , y) :: lxy) = (x :: fst lxly) , (y :: snd lxly)
where lxly = splitA lxy
```
 From: 2010-10-22 02:24 pm (UTC) (Link)
Yikes. That program is a pretty strong argument for tactical theorem proving! (As an aside: your natural definition works, but makes you rewrite with the extensionality of pairs to deal with let-bindings showing up in mildly annoying fashion.)
```Set Implicit Arguments.
Unset Strict Implicit.
Import Prenex Implicits.
Require Import List.

Fixpoint split (X Y : Type) (xys : list (X * Y)) :=
match xys with
| nil => (nil, nil)
| cons (x,y) xys' =>
let (xs', ys') := split xys' in (cons x xs', cons y ys')
end.

Fixpoint zip (X Y : Type) (xs : list X) (ys : list Y) {struct xs} :=
match xs, ys with
| nil, _ => nil
| _, nil => nil
| (cons x xs'), (cons y ys') => cons (x,y) (zip xs' ys')
end.

Lemma pair_ext : forall (A B : Type) (a : A * B), a = (fst a, snd a).
Proof.
intros A B [u v]; simpl; auto. (* Why isn't this in the stdlib??? *)
Qed.

Lemma zipsplit : forall (X Y : Type) (xys : list (X * Y)),
let (xs, ys) := split xys in zip xs ys = xys.
Proof.
intros X Y; simpl; induction xys; [simpl; auto | unfold split; fold split].
rewrite (pair_ext a); rewrite (pair_ext (split xys)) in IHxys |- *.
simpl; rewrite IHxys; reflexivity.
Qed.
```
 From: 2010-10-22 02:37 pm (UTC) (Link)
Oh, I proved the wrong lemma. Here's the equivalence of split and split', using the definition of split above:
```Definition split' (X Y : Type) (xys : list (X * Y)) := (map (@fst _ _) xys, map (@snd _ _) xys).

Lemma split_split' : forall (X Y : Type) (xys : list (X * Y)), split xys = split' xys.
Proof.
intros X Y; induction xys; auto; simpl.
rewrite IHxys; rewrite (pair_ext a); rewrite (pair_ext (split' xys)); reflexivity.
Qed.
```
 From: 2010-10-22 05:29 pm (UTC) (Link)
The boilerplate or the proof-after-boilerplate is a strong argument for tactical theorem proving? While I realize Product.pair-cong (assoc-append {as = lx}) (assoc-append {as = ly}) ≡≡ split-eq-lem (lx ++ [ x ]) (ly ++ [ y ]) lxy and intros X Y; induction xys; auto; simpl. rewrite IHxys; rewrite (pair_ext a); rewrite (pair_ext (split' xys)); reflexivity. are probably equally meaningless to someone unfamiliar with the relevant tools, I didn't find it at all inconvenient to write the non-boilerplate part, and now the boilerplate part is in my standard library where it belongs.
 From: 2010-10-24 10:34 am (UTC) (Link)
It's not the way the proof looks, since all proof assistants require crazy moon language (except for Isar, apparently). It's the raw size of the proof! That's too big.

I don't think I really believe he stdlib argument. In a dependently-typed language, if you have two libraries with N and M methods, then in general you'll need O(M x N) lemmas describing their interaction. (This is why even my Coq proof feels a bit too complicated to me -- these proofs need to be simple enough that machines can reliably cook them up for us.)
 From: 2010-10-24 04:00 pm (UTC) (Link)
Well, I actually don't see that being true, at least not in this case. In Agda, I'm annoyed to have to write proofs about the properties of algebraic structures, but not their interactions.

Basically every nitpicky lemma in my nascent standard library, in Dan's standard library that it's based upon, and literally everything that I wrote in boilerplate that's not map expresses one of the two things
• A constructor respects equivalence (append-cong2) and is injective.
• Something is a monoid (assoc-append, append-unit).
Mature-ish libraries (such as the Agda Stdlib) both have something like a ring solver that lets you automate most of these operations.
(Deleted comment)
 From: 2010-10-22 03:24 pm (UTC) (Link)
Question: the most natural way to express this function, in Agda, was neither of your options. Is the following definition naturally expressible in Coq?

I would hesitate to say that either of my options were the most natural way to express it in Coq, either. Well, the first one isn't bad. But the "most natural way" probably involves, first of all, letting the type inferencer do more work, and second, relying on libraries more. So, more like Neel did it, probably.
 From: 2010-10-22 09:25 pm (UTC) (Link)
Dan pointed out that if I want to prove equivalence of split and splitA the proof is actually very, very nice. I needed an annoying lemma whose proof is Refl = refl in this setup, but the proof is dead simple if that is factored out.

```  cong-lem : ∀{A B} {x : A} {y : B} {xs1 xs2 : List A} {ys1 ys2 : List B}
→ (xs1 , ys1) ≡ (xs2 , ys2)
→ IdT {A = List A × List B} (x :: xs1 , y :: ys1) (x :: xs2 , y :: ys2)
cong-lem Refl = refl

split-eqA : ∀{A B} → (lxy : List (A × B)) → split lxy ≡ splitA lxy
split-eqA [] = refl
split-eqA ((x , y) :: lxy) = cong-lem (split-eqA lxy)
```
 From: 2010-10-22 09:58 pm (UTC) (Link)
It's cheating when you control the standard library, but I can get rid of cong-lem with a standard subst-equals-for-equals function resp:
```  split-eqA : ∀{A B} → (lxy : List (A × B)) → split lxy ≡ splitA lxy
split-eqA [] = refl
split-eqA ((x , y) :: lxy) =
resp (λ p → x :: fst p , y :: snd p)  (split-eqA lxy)```

This whole thread increasingly feels wonderfully like the opposite of a flamewar. Water gun fight?
 From: 2010-10-22 10:53 pm (UTC) (Link)
I think it is even more like politely offering a glass of water to a visiting guest.

Oh, but we also have milk! Or would you like a beer? Soda? Diet? Tea? Coffee? Tactical theorem-prover juice?
(Deleted comment)
 From: 2010-10-22 02:56 pm (UTC) (Link)

Ooh, me next, me next!

Here it is in miniKanren:

```(import (minikanren vanilla))

(define zipo
(lambda (pairoflists listofpairs)
(matche `(,pairoflists ,listofpairs)
[( (() ()) () )]
[( ((,X . ,Xs) (,Y . ,Ys)) ((,X ,Y) . ,XYs) )
(zipo `(,Xs ,Ys) XYs)])))

> (run* (q) (zipo '((1 3 5) (2 4 6)) q))
(((1 2) (3 4) (5 6)))
> (run* (q) (zipo q '((1 2) (3 4) (5 6))))
(((1 3 5) (2 4 6)))
```

 From: 2010-10-22 06:48 pm (UTC) (Link)

Meanwhile I've taken a try at it in Isabelle/HOL using Isar. I make no claim on how "native" this is, given my relative lack of experience in the system.

```theory Kuper
imports Main

begin

fun ksplit :: "('a * 'b) list ⇒ 'a list * 'b list"
where "ksplit ABs = (map fst ABs, map snd ABs)"

fun ksplit' :: "('a * 'b) list ⇒ 'a list * 'b list"
where
"ksplit' [] = ([], [])" |
"ksplit' ((a, b) # ABs) =
(case ksplit' ABs of
(As, Bs) ⇒ (a # As, b # Bs))"

theorem ksplit_ksplit' : "ksplit l = ksplit' l"
proof (induct l rule: ksplit'.induct)
case 1 thus ?case by simp
next
case (2 a b ABs)
assume H : "ksplit ABs = ksplit' ABs"
then have "ksplit' ABs = (map fst ABs, map snd ABs)" by simp
then have "ksplit' ((a, b) # ABs) = (a # map fst ABs, b # map snd ABs)" by simp
thus ?case by simp
qed

end```
 From: 2010-10-22 06:57 pm (UTC) (Link)

Experimenting with the proof script, it turns out I can leave out the second "then have" clause; just leading it to the equivalence of ksplit' and the expanded ksplit is enough for simplification.

Edit: here's the simplest version I've found for that so far:

```theorem ksplit_ksplit' : "ksplit l = ksplit' l"
proof (induct rule: ksplit'.induct)
case (2 a b ABs)
assume "ksplit ABs = ksplit' ABs"
hence "... = (map fst ABs, map snd ABs)" by simp
thus ?case by simp
qed (simp)```

The simp on the qed gets rid of the trivial case, and Sam helped clean up the other case a bit. (Can you tell we're bored waiting for our flight this evening out of Reno?)

Edited at 2010-10-22 07:05 pm (UTC)
 From: 2010-10-22 07:45 pm (UTC) (Link)
Ooh, I like the looks of the Isabelle/HOL proof. Do you have to actually write `next` when you move on to the next goal, or is that just there for readability?
 From: 2010-10-22 07:50 pm (UTC) (Link)
Readability; the Isar syntax is geared towards writing human-readable forward proofs. If you drop the `next`, nothing changes (just tested to make sure).
 From: 2010-10-22 10:42 pm (UTC) (Link)
Sam said earlier that `next` has some semantic meaning, but he didn't clarify to me and didn't post a correction here as I requested, so keep that in mind.
 From: 2010-10-22 07:20 pm (UTC) (Link)

Once more, in ACL2!

```(defun mymapcar (l)
(if (endp l) nil
(cons (car (car l)) (mymapcar (cdr l)))))

(defun mymapcdr (l)
(if (endp l) nil
(cons (cdr (car l)) (mymapcdr (cdr l)))))

(defun split1 (l)
(cons (mymapcar l) (mymapcdr l)))

(defun split2 (l)
(if (endp l) (cons nil nil)
(let ((ls (split2 (cdr l))))
(cons (cons (car (car l)) (car ls))
(cons (cdr (car l)) (cdr ls))))))

(defthm split1-split2
(equal (split1 l) (split2 l)))
```

Notes:

• ACL2 is first-order, so no ability to use map.
• Though mapcar/mapcdr exist in Lisp, they're not defined in ACL2, which means I can't even define them by the usual names because it conflicts with the underlying Lisp. (They really should be defined in ACL2 though, boo.)

And with that, I'll stop spamming this post.

Edit: Except for what I got after talking to my local ACL2 expert:

```(defun split1 (l)
(cons (strip-cars l) (strip-cdrs l)))

(defun split2 (l)
(if (endp l) (cons nil nil)
(let ((ls (split2 (cdr l))))
(cons (cons (car (car l)) (car ls))
(cons (cdr (car l)) (cdr ls))))))

(defthm split1-split2
(equal (split1 l) (split2 l)))```

So "mapcar" is "strip-cars" and "mapcdr" "strip-cdrs"? I am so sending a look of disapproval in the direction of the ACL2 people.

Edited at 2010-10-22 07:24 pm (UTC)
 From: 2010-10-22 07:43 pm (UTC) (Link)
In `split1-split2`, is the `l` globally quantified? As someone who knows nothing about ACL2, how would you go about proving that theorem in ACL2? Would it be more difficult than using Coq, etc.?
 From: 2010-10-22 07:47 pm (UTC) (Link)
You don't prove it; you let ACL2 prove it, or fail trying. In the latter case, you try and find the lemma that'll lead ACL2 to the right place and have it attempt that, or realize there's a mistake in your theorem. ACL2 is an automated theorem prover, not a proof assistant.

As for the `l` in the theorem, ACL2 assumes that unbound variables in theorems are universally quantified (i.e. that's equivalent to "for all l, ...").
 From: 2010-10-22 07:57 pm (UTC) (Link)
Oh, okay! So presumably it was able to prove it? Being from Coq-land I'm sort of proof-term-or-it-didn't-happen. (On the other hand, just because I write `Qed.` at the end of some Coq code and post it on the Internet doesn't mean that Coq actually accepted the proof.)
 From: 2010-10-22 08:02 pm (UTC) (Link)
Yeah, I wouldn't have posted it here if it'd failed :)

It didn't need any extra lemmas to push the proof through, as this was a pretty simple proof for it. I already included the proof transcript in the previous comment I left.
(Deleted comment)
 From: 2010-10-23 09:16 pm (UTC) (Link)
Maybe I'll design a shirt.
 From: 2010-10-22 07:55 pm (UTC) (Link)

Here's the interactions I had with ACL2 at the REPL, which will give you an idea of what that looks like (well, just the theorem proving part), since the rest puts it over the comment limit):

```ACL2 !>
(defthm split1-split2
(equal (split1 l) (split2 l)))

ACL2 Warning [Non-rec] in ( DEFTHM SPLIT1-SPLIT2 ...):  A :REWRITE
rule generated from SPLIT1-SPLIT2 will be triggered only by terms containing
the non-recursive function symbol SPLIT1.  Unless this function is
disabled, this rule is unlikely ever to be used.

This simplifies, using the :definition SPLIT1, to

Goal'
(EQUAL (CONS (STRIP-CARS L) (STRIP-CDRS L))
(SPLIT2 L)).

Name the formula above *1.

Perhaps we can prove *1 by induction.  Three induction schemes are
suggested by this conjecture.  Subsumption reduces that number to one.

We will induct according to a scheme suggested by (SPLIT2 L).  This
suggestion was produced using the :induction rules SPLIT2, STRIP-CARS
and STRIP-CDRS.  If we let (:P L) denote *1 above then the induction
scheme we'll use is
(AND (IMPLIES (AND (NOT (ENDP L)) (:P (CDR L)))
(:P L))
(IMPLIES (ENDP L) (:P L))).
This induction is justified by the same argument used to admit SPLIT2.
When applied to the goal at hand the above induction scheme produces
two nontautological subgoals.

Subgoal *1/2
(IMPLIES (AND (NOT (ENDP L))
(EQUAL (CONS (STRIP-CARS (CDR L))
(STRIP-CDRS (CDR L)))
(SPLIT2 (CDR L))))
(EQUAL (CONS (STRIP-CARS L) (STRIP-CDRS L))
(SPLIT2 L))).

By the simple :definition ENDP we reduce the conjecture to

Subgoal *1/2'
(IMPLIES (AND (CONSP L)
(EQUAL (CONS (STRIP-CARS (CDR L))
(STRIP-CDRS (CDR L)))
(SPLIT2 (CDR L))))
(EQUAL (CONS (STRIP-CARS L) (STRIP-CDRS L))
(SPLIT2 L))).

This simplifies, using the :definitions SPLIT2, STRIP-CARS and STRIP-CDRS,
primitive type reasoning and the :rewrite rule CONS-EQUAL, to the following
two conjectures.

Subgoal *1/2.2
(IMPLIES (AND (CONSP L)
(EQUAL (CONS (STRIP-CARS (CDR L))
(STRIP-CDRS (CDR L)))
(SPLIT2 (CDR L))))
(EQUAL (STRIP-CARS (CDR L))
(CAR (SPLIT2 (CDR L))))).

But simplification reduces this to T, using primitive type reasoning,
the :rewrite rule CAR-CONS and the :type-prescription rules SPLIT2
and STRIP-CDRS.

Subgoal *1/2.1
(IMPLIES (AND (CONSP L)
(EQUAL (CONS (STRIP-CARS (CDR L))
(STRIP-CDRS (CDR L)))
(SPLIT2 (CDR L))))
(EQUAL (STRIP-CDRS (CDR L))
(CDR (SPLIT2 (CDR L))))).

But simplification reduces this to T, using primitive type reasoning
and the :rewrite rule CDR-CONS.

Subgoal *1/1
(IMPLIES (ENDP L)
(EQUAL (CONS (STRIP-CARS L) (STRIP-CDRS L))
(SPLIT2 L))).

By the simple :definition ENDP we reduce the conjecture to

Subgoal *1/1'
(IMPLIES (NOT (CONSP L))
(EQUAL (CONS (STRIP-CARS L) (STRIP-CDRS L))
(SPLIT2 L))).

But simplification reduces this to T, using the :definitions SPLIT2,
STRIP-CARS and STRIP-CDRS and the :executable-counterparts of CONS
and EQUAL.

That completes the proof of *1.

Q.E.D.

Summary
Form:  ( DEFTHM SPLIT1-SPLIT2 ...)
Rules: ((:DEFINITION ENDP)
(:DEFINITION NOT)
(:DEFINITION SPLIT1)
(:DEFINITION SPLIT2)
(:DEFINITION STRIP-CARS)
(:DEFINITION STRIP-CDRS)
(:EXECUTABLE-COUNTERPART CONS)
(:EXECUTABLE-COUNTERPART EQUAL)
(:FAKE-RUNE-FOR-TYPE-SET NIL)
(:INDUCTION SPLIT2)
(:INDUCTION STRIP-CARS)
(:INDUCTION STRIP-CDRS)
(:REWRITE CAR-CONS)
(:REWRITE CDR-CONS)
(:REWRITE CONS-EQUAL)
(:TYPE-PRESCRIPTION SPLIT2)
(:TYPE-PRESCRIPTION STRIP-CDRS))
Warnings:  Non-rec
Time:  0.00 seconds (prove: 0.00, print: 0.00, other: 0.00)
SPLIT1-SPLIT2
ACL2 !>
```
 From: 2010-10-22 06:02 am (UTC) (Link)
Do you actually get to prove that two coq functions are equivalent? How do you take apart the construction of split and split' to reason about them? I didn't think you could deal with meta level functions like that, but I've never really used Coq.
 From: 2010-10-22 03:12 pm (UTC) (Link)
Yeah, you do. How you "take apart" the definitions depends, but it's a matter of showing that given the same arguments they produce the same results. Look at how Neel did it up above. His `split'` (the one that uses `map`) is basically my `split`, while his `split` is sort of like my `split'` but prettier. Mine might be harder to prove stuff about -- I'm not really sure yet.
 From: 2010-10-22 04:55 pm (UTC) (Link)
Oh that is crazy, though neel's proof is obviously in crazy moon language.
 From: 2010-10-22 06:51 pm (UTC) (Link)
Oh, yeah, well, basically this whole post and all the comments and arguably my entire academic field is a my-approach-is-more-crazy-than-yours arms race.
 From: 2010-10-23 04:55 am (UTC) (Link)
the key word is "extensionality"
(Deleted comment)
 From: 2010-10-22 06:04 pm (UTC) (Link)
The real problem is that zip() is of arity 2, and not arity n. An n-arity zip() is its own inverse! :)
(Deleted comment)
 From: 2010-10-23 12:45 am (UTC) (Link)
True! I was thinking in Python where I was extremely pleased to figure out that
zip(*zip(stuff1, stuff2, stuff3)) == stuff1, stuff2, stuff3
(Deleted comment)
 From: 2010-10-22 11:35 pm (UTC) (Link)
Eh, I posted the ACL2 version, and given the lack of higher-order functions, it's even less interesting (especially if you had to define your own `mapcar` and `mapcdr`).
From:
2010-10-26 06:52 pm (UTC)

### Proof language for non-PL-researchers?

Hey Lindsey et al!
If I wanted to learn one of these wild proof languages, say for doing geometric/computer-graphics proofs, what language would you recommend? I started working through a tutorial for that Omega language, which I understand is rather different from what the folks in this thread are using... but the documentation made it seem to be a better choice for a beginner.

If I'm concerned mostly with ease of use (and libraries for dealing with real numbers, I guess) what should I use?
From:
2010-10-26 08:04 pm (UTC)

### Re: Proof language for non-PL-researchers?

Hm. Have you looked at GeoProof? Apparently it talks to Coq.

In any case, I'd recommend whichever one has the best library for geometric proofs, since if it doesn't already exist you'll be spending a lot of time building up scaffolding before you can even get started.

To my knowledge, Omega's less a proof assistant than it is a general-purpose language with dependent types. Having said that, the line between the two is blurry, and I think a lot of people like Agda precisely because it straddles that line pretty well.