Existential types seem like a natural mechanism for ensuring
representation independence of programs. With existential types, we get the
property that *if two programs have the same existential type, no
program context can tell them apart*. This is a powerful idea. It
corresponds nicely to what we might intuitively expect representation independence to
mean: clients of a module shouldn't depend on the module's internal
representation. If we make it so that clients have no way of finding
out or affecting details of the module's implementation, then we get
representation independence for free: we can be sure that clients don't depend on module
internals, because they *can't*.

But universal types are another story. They don't seem like such a
natural mechanism for representation independence. With
existential types, we have the property that no program context can distinguish two expressions with the same existential type; that is, the existentially typed expressions are contextually equivalent. But with universal types, we have the property that no *two program contexts* can cause the *same* universally typed expression to behave differently. Representation independence seems to be all about flexibility: Implement your module
any way you want! Pick any internal representation type you want! Do it again! As
long as all your implementations present the same interface,
you're okay! Let a thousand flowers bloom! But a universally typed expression is one that stubbornly does the same thing, that *must* do the same thing, regardless of what type it's instantiated with. Expressions of universal type can't know or care what type they're instantiated with; they can't decide to behave this way or that way at run-time depending on how they were instantiated. This is the *parametricity* property.

Yet, we're told in paper after paper that the representation independence we get from existential types is a consequence of the parametricity we get from universal types. How can that be? They seem so different in spirit; how could the former possibly follow from the latter?

Let me show you it!

To solve this mystery, first we have to observe that existential types can
be encoded using universal types. We can encode the existential type `exists alpha. tau`

as the following universal type (thanks, TaPL 24.3):

`forall beta. (forall alpha. tau -> beta) -> beta`

That thing's pretty hairy, but let's think about what it means. Suppose we have some
expression `g`

that has ```
forall beta. (forall alpha. tau -> beta) ->
beta
```

as its type. Let's start with the `forall beta`

on the front. It means that `g`

is waiting to be applied to (or *instantiated with*) a type -- any type -- and we can't use it until we
do. Suppose we instantiate `g`

with `Nat`

, the type of natural numbers, for the time being. We do the
type application, replacing the occurrence of `beta`

with `Nat`

, and now
we've got an expression we'll call `g'`

, which has type

`(forall alpha. tau -> Nat) -> Nat`

What's `g'`

? We can see that it has an arrow type, so it must be a
function. In particular, it's a function waiting to be passed an
argument that's itself a function of type `forall alpha. tau -> Nat`

.
As soon as we give `g'`

one of those, it will be obliged to give us a
`Nat`

sooner or later.

Note that it doesn't matter *what* function of type ```
forall
alpha. tau -> Nat
```

we give to `g'`

-- as long as we give it
*some* function of that type, we'll get a `Nat`

back.
But furthermore -- and this is important! -- when we give `g'`

that
function of type `forall alpha. tau -> Nat`

, it *won't* be
allowed to simply throw its argument away and give us our `Nat`

. For
example, `g'`

*cannot* just be

`lambda f: (forall alpha. tau -> Nat). 5`

even though that expression seems to have the right type. Why not? Because, remember, we
picked `Nat`

arbitrarily. We could have just as easily picked `Bool`

, and
then we'd need `g'`

to return something of type `Bool`

. So having `5`

as
the body of `g'`

isn't going to cut it: we need the body of `g'`

to be of
whatever type we instantiated `beta`

with in the first place. What's
the only certain way to get something of that type? By applying `f`

,
of course!

Oh, wait, but `f`

's type is `forall alpha. tau -> Nat`

. Since it starts with `forall`

, we need to apply it to a type before we can use it. Any type will do. Let's choose the type `Squishy`

. Now we almost know how to write down `g'`

:

`lambda f: (forall alpha. tau -> int). f Squishy ...`

How do we fill in the rest? Well, after instantiating `f`

with `Squishy`

,
we get a function that will take a value of the type
`tau`

-but-with-occurrences-of-`alpha`

-replaced-with-`Squishy`

(or ```
tau[alpha
:= Squishy]
```

, for short), and will give us a `Nat`

in return. We really
need that `Nat`

, because we need to be able to return something of type `Nat`

. So there had better damn well be a value available of type
`tau[alpha := Squishy]`

. Suppose that we have one handy, and let's call it `glork`

.

That leaves us with the following expression `g'`

:

`lambda f: (forall alpha. tau -> Nat). f Squishy glork`

Note that the expression `f Squishy glork`

evaluates to an `int`

-- just what we want.

Our whole expression `g`

of type ```
forall beta. (forall alpha. tau ->
beta) -> beta
```

, then, is

`Lambda beta. lambda f: (forall alpha. tau -> beta). f Squishy glork`

where `Squishy`

is any type whatsoever and `glork`

is a value of type
`tau[alpha := Squishy]`

. In other words, *we can manufacture a value of type
forall beta. (forall alpha. tau -> beta) -> beta*, just as long as we
have

`Squishy`

and `glork`

. How do those requirements stack up to the
requirements needed for manufacturing a value of type `exists alpha. tau`

?*They're exactly the same.* A value of type `exists alpha. tau`

is a package comprising some type `S`

, together with a term of type
`tau[alpha := S]`

. (These components correspond, respectively, to the
hidden, internal representation type of a module and the
implementation of the operations that its interface exposes. What's
shown to the world? Just the *signature*: `alpha`

, the name of
the abstract type, and `tau`

, the types of its operations.)

You need both of those components to get a value of type `exists alpha. tau`

. So it suddenly makes sense that `forall beta. (forall alpha. tau -> beta) -> beta`

is our encoding for `exists alpha. tau`

, because you need
the same two things to create values of the universally encoded type as you need to create values of the existential type.

Okay, so now that we've convinced ourselves that the encoding is
right, let's consider what would happen in the universally encoded
type *if parametricity didn't hold* -- if we had a way of finding out, at run-time, what types we were instantiating our universally typed expressions with. First of all, let's consider what would happen if we could find out what `beta`

was at run-time. If we could do that, we might be able to manufacture a value of type `beta`

, without having to invoke `f`

. And if we didn't have to invoke `f`

, then we wouldn't need `Squishy`

and `glork`

. So, the fact that we don't and can't know what `beta`

is -- parametricity! -- ensures that `Squishy`

and `glork`

have to exist.

That's half of it. Now consider what would happen if we could find out what `alpha`

was at run-time. If we could, then we wouldn't need `Squishy`

, because `alpha`

would *be* `Squishy`

, and knowing that, we might be able to manufacture `glork`

. But, again, thanks to parametricity, we don't and can't know what `alpha`

is, so we're once again assured that `Squishy`

and `glork`

must exist.

Parametricity makes existential-style data abstraction possible! Hooray for parametricity!