Last week, I wrote down some stuff sketching out what I'm working on and what I want to accomplish with my research project this semester. Herewith:

# Program equivalence in multi-language systems

## Contextual equivalence

We often care about proving that two programs are contextually
equivalent -- meaning that those programs, when dropped into the
same context, have the same observable behavior. (A context can be
thought of as simply a program with a "hole" in it; we get a
runnable program by filling the hole. For instance, "if [.] then 5
else 500" is a context.) Contextual equivalence is important for,
for instance, proving the correctness of compiler transformations
(we want to show that the program being operated on has the same
behavior as it would have had before a particular compiler pass
runs), or for proving that we can refactor a component of a software
system and drop in the refactored component as a safe replacement
for the old one.

If programs e1 and e2 are contextually equivalent, then dropping e1
and e2 into any context C yields two expressions that evaluate to
the same result. For the context "if ([.] > 0) then 5 else 500", e1
and e2 can be quite different: as long as both evaluate to numbers
greater than 0 (or both to numbers that aren't), they'll produce the
same result. For our "if [.] then 5 else 500" example, though, e1
and e2 must both evaluate to "true" or both to "false". Contextual
equivalence is extremely difficult to prove directly, because it
requires us to quantify over all program contexts: "For all contexts
C, programs C[e1] and C[e2] are equivalent." (We write C[e] to mean
the context C plugged with the expression e.)

## Logical equivalence

Because of the difficulty of proving contextual equivalence, we
choose to instead define a different notion of equivalence of
programs, which we call logical equivalence.

Logical equivalence of expressions is defined inductively on the
structure of their types. Here are a few examples:

- As a base case, we say that two expressions e and e' of type Nat
are logically equivalent if they're equal. We write e ~ e' :
Nat, which is read "e is logically equivalent to e' at the type
Nat". Since logical equivalence is a relation, sometimes we
just say "e is related to e' at the type Nat".
- The pairs (e1, e2) and (e1', e2'), each of type Nat * Nat, are
related at the type Nat * Nat if their first and second
components, respectively, are related at the type Nat.
- For the functions (lambda x : T1.e) and (lambda x : T1.e'), each
of type T1 -> T2, we say that they are related at the type T1 ->
T2 if, when applied to the same argument of type T1, they produce
related results at type T2. (Notice that this is an operational
definition of relatedness; it relies on knowledge of the
evaluation semantics of the language.)
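The shape of this definition — recursion on type structure — can be sketched in Python. One loud caveat: the real function case quantifies over *all* arguments of type T1, which no finite check can decide, so the sketch below only probes a small sample of arguments (the `SAMPLE_NATS` list and the `sample` helper are inventions of this illustration, not part of any real logical relation):

```python
# A rough sketch of logical relatedness by induction on type structure.
# Types: "Nat", ("Pair", T1, T2), ("Arrow", T1, T2).

SAMPLE_NATS = [0, 1, 2, 7, 100]

def sample(ty):
    """Produce a few sample values of a type. An assumption of this
    sketch; a real logical relation never enumerates values."""
    if ty == "Nat":
        return SAMPLE_NATS
    if ty[0] == "Pair":
        return [(a, b) for a in sample(ty[1]) for b in sample(ty[2])]
    raise NotImplementedError("sampling higher types is out of scope")

def related(e1, e2, ty):
    if ty == "Nat":                      # base case: equality at Nat
        return e1 == e2
    if ty[0] == "Pair":                  # componentwise relatedness
        return (related(e1[0], e2[0], ty[1])
                and related(e1[1], e2[1], ty[2]))
    if ty[0] == "Arrow":                 # probe the functions on samples
        return all(related(e1(x), e2(x), ty[2]) for x in sample(ty[1]))
    raise ValueError(f"unknown type: {ty}")

# Two different implementations that are related at Nat -> Nat:
double_a = lambda n: n + n
double_b = lambda n: 2 * n
print(related(double_a, double_b, ("Arrow", "Nat", "Nat")))  # True
print(related((1, 2), (1, 3), ("Pair", "Nat", "Nat")))       # False
```

The point of the sketch is the structure: each type constructor gets its own clause, and the function clause appeals to relatedness at the smaller types T1 and T2.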

## Soundness and completeness

Once we've defined the logical equivalence relation, we can prove
that it is sound with respect to contextual equivalence. That is,
if programs e1 and e2 are logically equivalent, they're contextually
equivalent, too.

(Soundness wrt contextual equivalence) If e1 and e2 are logically
equivalent, they're also contextually equivalent.

Often, we will also want to prove completeness wrt contextual
equivalence:

(Completeness wrt contextual equivalence) If e1 and e2 are
contextually equivalent, they're logically equivalent.

Finally, to prove that two particular programs are contextually
equivalent, we establish that they are logically equivalent and
conclude by soundness. (You might wonder why we
actually need completeness in those circumstances -- don't we only
need soundness? For that particular situation, sure. But there are
situations where completeness really comes in handy. For instance,
suppose we have two programs that we know are not logically
equivalent. By taking the contrapositive of the completeness
implication, we can say with certainty that they are not
contextually equivalent, either.)

## The Fundamental Property and other important properties

In order to prove soundness of logical equivalence wrt contextual
equivalence, we have to prove a property of the logical equivalence
relation that is usually known as the Fundamental Property: we must
show that every expression is logically equivalent to *itself*.
This property is exactly the reflexivity property that must be true
of all equivalence relations.

Aside from reflexivity, we also need to prove that the logical
equivalence relation is symmetric (if e1 ~ e2 : T, then e2 ~ e1 : T)
and transitive (if e1 ~ e2 : T and e2 ~ e3 : T, then e1 ~ e3 : T).
Symmetry follows easily from the definition of logical equivalence,
but proving transitivity of a logical relation can be quite hard.
Typically we proceed by proving both soundness and completeness with
respect to contextual equivalence, and then, since contextual
equivalence is easily shown to be transitive, we know that the
logical relation is transitive.

## Back to proving program equivalence

We've reduced the problem of proving program equivalence to defining
a logical equivalence relation that has the Fundamental Property and
is sound and complete. Once that's done, proving that a pair of
programs is contextually equivalent is simply a matter of showing
that the pair is, in fact, a member of the relation.

## Polymorphism and other fancy stuff

The logical equivalence relation becomes much harder to define when
dealing with types that are fancier than the ones we've discussed
here so far. Adding polymorphism to our type system, for instance,
requires us to introduce more mathematical machinery and some
auxiliary relations to the definition of logical equivalence. For
one thing, polymorphism makes it harder to know the type at which
two expressions should be related, so we have to introduce an
environment that "remembers" the types with which a program is
actually instantiated as it is being evaluated.

## The need for stratification

A deeper problem with polymorphic and recursive types
is that they introduce a kind of circularity. Our logical
equivalence relation is defined inductively on types, but in a
system with the kind of polymorphism known as *impredicative*
polymorphism, when we have a type like "forall A. T", occurrences of
A in the "T" portion of the type can be instantiated with any type,
such as "forall A. T" itself, or even a larger type like "forall
A. T * T". Therefore, the induction is no longer well-founded:
there's no guarantee that we'll reach a base case. We therefore
have to find some way of *stratifying* membership in the relation
that will allow it to be defined inductively in a well-founded way.

Of the proposed solutions to this problem, one of the most promising
is the recently developed *step-indexing* technique,
which allows the relation to be defined inductively not only on the
types of programs, but also on the number of steps remaining for
future execution of the programs. The idea is that we can
approximate logical equivalence by running a program for only a
given number of steps. (It's worth noting that at step index 0, all
programs are related; the relation only gains the power to
distinguish programs as the step budget grows.)
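To make the "run for only k steps" idea concrete, here is a toy Python illustration. Everything in it — the `("suc", e)` expression format, charging one step per `suc`, the `eval_steps` and `related_at_index` helpers — is an invented stand-in for the real step-indexed machinery, which is far more subtle:

```python
# A toy illustration of step-indexed approximation. Two closed Nat
# expressions are "related at index k" if evaluation within a budget
# of k steps cannot tell them apart.

def eval_steps(e, budget):
    """Expressions: ints, or ("suc", e). Each "suc" costs one step.
    Returns (value, steps_left), or None if the budget runs out."""
    if isinstance(e, int):
        return e, budget
    if e[0] == "suc":
        if budget == 0:
            return None              # out of fuel: evaluation cut off
        result = eval_steps(e[1], budget - 1)
        if result is None:
            return None
        value, left = result
        return value + 1, left
    raise ValueError(f"unknown expression: {e}")

def related_at_index(e1, e2, k):
    """Approximate relatedness at Nat: indistinguishable within k steps."""
    r1, r2 = eval_steps(e1, k), eval_steps(e2, k)
    if r1 is None or r2 is None:
        return True                  # ran out of steps: can't distinguish yet
    return r1[0] == r2[0]

two   = ("suc", ("suc", 0))            # evaluates in 2 steps
three = ("suc", ("suc", ("suc", 0)))   # evaluates in 3 steps

print(related_at_index(two, three, 0))   # True: 0 steps can't tell them apart
print(related_at_index(two, three, 10))  # False: enough fuel to distinguish
```

The well-foundedness payoff is that every recursive appeal to the relation happens at a strictly smaller step index, so the definition bottoms out even when the types alone don't shrink.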

## Multi-language programs

The particular challenge of the setting in which I'm working is that
we have something called "lump" types, which allow us to embed
programs written in a dynamically-typed language inside programs
written in a statically-typed host language. (I'll call the
combined system the "lump language".) Proving equivalence of
programs in this setting presents some additional challenges that
not even the step-indexed model can handle without the addition of
yet more proof machinery. We think that the even more recently
invented *indirection theory* approach might simplify the job,
though.

## Goals

- Given a logical equivalence relation for a simple language like
System F (simply typed lambda calculus plus polymorphism), be
able to prove the Fundamental Property and
soundness/completeness properties. Understand the extra
machinery necessary to handle polymorphic types. Gain enough
fluency in Coq to formalize these proofs, possibly using
libraries intended for proving properties of programming
languages.
- Understand the step-indexed model for defining logical
relations, and why we need step-indexing for dealing with
recursive and quantified types.
- Understand the lump embedding (in particular, understand the
relationship between the lump type system and
*contracts*).
Understand why step-indexing is insufficient for defining a
well-founded logical relation on terms in the lump language
(it's probably related to the way in which embedding dynamically
typed code in a statically typed language allows us to "smuggle
values past the type system").
- Define a well-founded logical equivalence relation for the lump
language and prove the Fundamental Property, the soundness
property, and hopefully the completeness property. First, try
to do it without the addition of indirection theory; then use
indirection theory. (Maybe try to show that the two models are
equivalent.) Formalize these proofs in Coq using the Mechanized
Semantic Library, which implements the axioms of indirection
theory.
- Fame, fortune, publications, etc.

I'm not necessarily expecting to achieve all of this stuff this
semester, but I'm going to get as far as I can.

Regarding both the website and the research plan, there isn't a lot there yet. But it's a start.