Lindsey Kuper (lindseykuper) wrote,
Lindsey Kuper

Existential freedom

Existential types seem like a natural mechanism for ensuring representation independence of programs. With existential types, we get the property that if two programs have the same existential type, no program context can tell them apart. This is a powerful idea. It corresponds nicely to what we might intuitively expect representation independence to mean: clients of a module shouldn't depend on the module's internal representation. If we make it so that clients have no way of finding out or affecting details of the module's implementation, then we get representation independence for free: we can be sure that clients don't depend on module internals, because they can't.

But universal types are another story. They don't seem like such a natural mechanism for representation independence. With existential types, we have the property that no program context can distinguish two expressions with the same existential type; that is, the existentially typed expressions are contextually equivalent. But with universal types, we have the property that no two program contexts can cause the same universally typed expression to behave differently. Representation independence seems to be all about flexibility: Implement your module any way you want! Pick any internal representation type you want! Do it again! As long as all your implementations present the same interface, you're okay! Let a thousand flowers bloom! But a universally typed expression is one that stubbornly does the same thing, that must do the same thing, regardless of what type it's instantiated with. Expressions of universal type can't know or care what type they're instantiated with; they can't decide to behave this way or that way at run-time depending on how they were instantiated. This is the parametricity property.

Yet, we're told in paper after paper that the representation independence we get from existential types is a consequence of the parametricity we get from universal types. How can that be? They seem so different in spirit; how could the former possibly follow from the latter?

Let me show you it!

To solve this mystery, first we have to observe that existential types can be encoded using universal types. We can encode the existential type exists alpha. tau as the following universal type (thanks, TaPL 24.3):

forall beta. (forall alpha. tau -> beta) -> beta

That thing's pretty hairy, but let's think about what it means. Suppose we have some expression g that has forall beta. (forall alpha. tau -> beta) -> beta as its type. Let's start with the forall beta on the front. It means that g is waiting to be applied to (or instantiated with) a type -- any type -- and we can't use it until we do. Suppose we instantiate g with Nat, the type of natural numbers, for the time being. We do the type application, replacing the occurrence of beta with Nat, and now we've got an expression we'll call g', which has type

(forall alpha. tau -> Nat) -> Nat

What's g'? We can see that it has an arrow type, so it must be a function. In particular, it's a function waiting to be passed an argument that's itself a function of type forall alpha. tau -> Nat. As soon as we give g' one of those, it will be obliged to give us a Nat sooner or later.

Note that it doesn't matter what function of type forall alpha. tau -> Nat we give to g' -- as long as we give it some function of that type, we'll get a Nat back. But furthermore -- and this is important! -- when we give g' that function of type forall alpha. tau -> Nat, it won't be allowed to simply throw its argument away and give us our Nat. For example, g' cannot just be

lambda f: (forall alpha. tau -> Nat). 5

even though that expression seems to have the right type. Why not? Because, remember, we picked Nat arbitrarily. We could have just as easily picked Bool, and then we'd need g' to return something of type Bool. So having 5 as the body of g' isn't going to cut it: we need the body of g' to be of whatever type we instantiated beta with in the first place. What's the only certain way to get something of that type? By applying f, of course!

Oh, wait, but f's type is forall alpha. tau -> Nat. Since it starts with forall, we need to apply it to a type before we can use it. Any type will do. Let's choose the type Squishy. Now we almost know how to write down g':

lambda f: (forall alpha. tau -> int). f Squishy ...

How do we fill in the rest? Well, after instantiating f with Squishy, we get a function that will take a value of the type tau-but-with-occurrences-of-alpha-replaced-with-Squishy (or tau[alpha := Squishy], for short), and will give us a Nat in return. We really need that Nat, because we need to be able to return something of type Nat. So there had better damn well be a value available of type tau[alpha := Squishy]. Suppose that we have one handy, and let's call it glork.

That leaves us with the following expression g':

lambda f: (forall alpha. tau -> Nat). f Squishy glork

Note that the expression f Squishy glork evaluates to an int -- just what we want.

Our whole expression g of type forall beta. (forall alpha. tau -> beta) -> beta, then, is

Lambda beta. lambda f: (forall alpha. tau -> beta). f Squishy glork

where Squishy is any type whatsoever and glork is a value of type tau[alpha := Squishy]. In other words, we can manufacture a value of type forall beta. (forall alpha. tau -> beta) -> beta, just as long as we have Squishy and glork. How do those requirements stack up to the requirements needed for manufacturing a value of type exists alpha. tau?

They're exactly the same. A value of type exists alpha. tau is a package comprising some type S, together with a term of type tau[alpha := S]. (These components correspond, respectively, to the hidden, internal representation type of a module and the implementation of the operations that its interface exposes. What's shown to the world? Just the signature: alpha, the name of the abstract type, and tau, the types of its operations.)

You need both of those components to get a value of type exists alpha. tau. So it suddenly makes sense that forall beta. (forall alpha. tau -> beta) -> beta is our encoding for exists alpha. tau, because you need the same two things to create values of the universally encoded type as you need to create values of the existential type.

Okay, so now that we've convinced ourselves that the encoding is right, let's consider what would happen in the universally encoded type if parametricity didn't hold -- if we had a way of finding out, at run-time, what types we were instantiating our universally typed expressions with. First of all, let's consider what would happen if we could find out what beta was at run-time. If we could do that, we might be able to manufacture a value of type beta, without having to invoke f. And if we didn't have to invoke f, then we wouldn't need Squishy and glork. So, the fact that we don't and can't know what beta is -- parametricity! -- ensures that Squishy and glork have to exist.

That's half of it. Now consider what would happen if we could find out what alpha was at run-time. If we could, then we wouldn't need Squishy, because alpha would be Squishy, and knowing that, we might be able to manufacture glork. But, again, thanks to parametricity, we don't and can't know what alpha is, so we're once again assured that Squishy and glork must exist.

Parametricity makes existential-style data abstraction possible! Hooray for parametricity!

Tags: research

  • Post a new comment


    Anonymous comments are disabled in this journal

    default userpic

    Your reply will be screened

    Your IP address will be recorded