(My posts with the tag mozilla,
such as this one, are now being syndicated onto Planet Mozilla Research along with those of my colleagues. Yay!)
This week is my fourteenth (!) on the Rust team at Mozilla. Fourteen
weeks is about the length of a normal internship, so now seems like a
good time to take stock of what I've accomplished so far. When I left off writing about Rust in April,
I'd just spent a couple of weeks learning my way around PLT Redex and getting started
on a Redex model of Rust. Working in Redex was a lot of fun, but I felt rather disconnected from the rest of the team, who were
all hacking on, you know, the actual implementation of Rust.
So it was good to get back to working on Rust proper at the end of
those two weeks. Of course, Rust is changing so fast that it's easy
to lose track of what's going on if you stop paying attention for even
a moment, so when I came back to Rust it was another week or so of catching up before
I was really ready to start writing code again.
If you're new here, Rust is a new systems programming language
pursuing the trifecta of safe, concurrent, and
fast. It has an expressive type system, pattern matching,
concurrency primitives, and a lightweight, structural object system.
My project for the summer is focused on the object system and type system: I'm
designing and implementing support for "self" expressions in Rust,
which involves having a notion of self-types, or "the type of the
current object". Figuring out an appropriate type for the current
object is surprisingly subtle, especially in the presence of object
extension. For instance, suppose that you add some more methods or
fields to an instance of an object, then call a method on the extended
object. If the method you call happens to return self,
what's the type of that return value? Is it the type of the extended
object, or the type of the original one? Ideally, it would be the
type of the extended object, which is a more precise type, but some
languages, like Java, aren't able to statically determine that the
returned value has the more precise type. (Apparently, it's possible
to fake
that behavior using generics, but it ain't pretty.) It would be
great if Rust could do better.
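To make the question concrete, here is a minimal sketch of the behavior one would hope for. It is written in present-day Rust using traits and `Self`, which is not the structural object system described in this post, so treat it only as an illustration of the typing goal. The names `Chain`, `Base`, `Extended`, and `tick` are hypothetical.

```rust
// Hypothetical illustration in modern Rust; not the 2011 object system.
trait Chain: Sized {
    fn count(&mut self) -> &mut u32;

    // Returns `Self`, so the caller gets back the same concrete type
    // it called the method on, not some less precise base type.
    fn tick(mut self) -> Self {
        *self.count() += 1;
        self
    }
}

struct Base {
    n: u32,
}

// Stands in for "Base extended with an extra field and method".
struct Extended {
    n: u32,
    label: &'static str,
}

impl Chain for Base {
    fn count(&mut self) -> &mut u32 {
        &mut self.n
    }
}

impl Chain for Extended {
    fn count(&mut self) -> &mut u32 {
        &mut self.n
    }
}

impl Extended {
    // Only the extended type has this method.
    fn describe(&self) -> String {
        format!("{}:{}", self.label, self.n)
    }
}

fn demo_base() -> u32 {
    Base { n: 1 }.tick().n
}

fn demo_self_type() -> String {
    // `tick()` still yields an `Extended`, so `describe()` type-checks.
    Extended { n: 0, label: "ext" }.tick().tick().describe()
}
```

The point is the last function: because `tick` is typed as returning `Self`, the extended type's own method is still available after the call.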
Since object extension is one of the things that makes determining
the type of self hard, I decided to work on object
extension first. This alone has taken a pretty long time. Rust
compiles to the LLVM intermediate language, and I was forced to
confront my ignorance of how our translation to LLVM works.[1] As it turns out, it's pretty hairy -- I spent at least a couple of the last eight weeks
doing nothing but staring at the translation pass of the compiler and
adding hundreds of lines of comments to it. But I can finally report that as of
yesterday, Rust supports extending an object with new methods! Here's
a complete, if rather uninteresting, Rust program that uses the new
feature:
    use std;
    fn main() {
        obj normal() {
            fn foo() -> int { ret 2; }
        }
        auto my_normal_obj = normal();
        // Extending an object with a new method
        auto my_anon_obj = obj {
            fn bar() -> int {
                ret 3;
            }
            with my_normal_obj
        };
        assert (my_normal_obj.foo() == 2);
        assert (my_anon_obj.bar() == 3);
        auto another_anon_obj = obj {
            fn baz() -> int {
                ret 4;
            }
            with my_anon_obj
        };
        assert (another_anon_obj.baz() == 4);
    }
Here, the expression we're assigning to my_anon_obj is
what we're calling an "anonymous object expression" -- it takes
my_normal_obj, which is an instance of the object
normal, adds a new method to it, and evaluates to a
brand-new object. (Note the with my_normal_obj part.)
It's not necessary to call any kind of constructor -- the new object
simply springs into being. This seems obvious in retrospect, but it
caused me a lot of confusion when I was trying to figure out how to
translate anonymous object expressions. When normal objects such as
normal are translated, the translation is side-effecting:
it causes an object constructor function to end up in the generated
executable. Translating anonymous object expressions, on the other
hand, doesn't do that -- it produces an object "inline", and no
constructor is generated. The resulting extended object is still a
bona fide object, though, like one you'd get from a constructor, so it
can be extended again with more methods, producing
another_anon_obj.
There's still a lot of work to do here -- as you can see, I'm not
doing anything with self yet. Also conspicuously absent
are calls like, say, my_anon_obj.foo(), which
should work fine but don't yet. To make them work, I need
to put so-called "forwarding slots" in the anonymous object's vtable
that forward method calls to the appropriate methods in the object it
extended. And we also want to support field extension -- so those
things are all on my plate for the rest of the summer. I have nine weeks to go -- we'll see how far I get!
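As a rough picture of what forwarding slots are meant to do, here is a toy model in present-day Rust. It is only a sketch under loud assumptions: the vtable is modeled as a `HashMap` from method names to closures, which is not how the compiler actually lays out vtables, and `Object`, `extending`, `with_method`, and `call` are all hypothetical names.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// A "method" here is just a closure returning an integer (an assumption).
type Method = Rc<dyn Fn() -> i64>;

pub struct Object {
    vtable: HashMap<&'static str, Method>,
}

impl Object {
    pub fn new() -> Self {
        Object { vtable: HashMap::new() }
    }

    // Add (or override) a method slot.
    pub fn with_method(mut self, name: &'static str, f: impl Fn() -> i64 + 'static) -> Self {
        self.vtable.insert(name, Rc::new(f));
        self
    }

    // Start an extended object: one forwarding slot per method of `inner`.
    // Each slot simply re-dispatches the call to the inner object.
    pub fn extending(inner: &Rc<Object>) -> Self {
        let mut vtable: HashMap<&'static str, Method> = HashMap::new();
        for name in inner.vtable.keys().copied() {
            let target = Rc::clone(inner);
            vtable.insert(
                name,
                Rc::new(move || target.call(name).expect("inner method exists")),
            );
        }
        Object { vtable }
    }

    // Look up a slot by name and invoke it, if present.
    pub fn call(&self, name: &str) -> Option<i64> {
        self.vtable.get(name).map(|m| m())
    }
}

pub fn demo_forwarding() -> (Option<i64>, Option<i64>, Option<i64>) {
    let normal = Rc::new(Object::new().with_method("foo", || 2));
    let anon = Object::extending(&normal).with_method("bar", || 3);
    // `foo` goes through a forwarding slot; `bar` is the anon object's own.
    (anon.call("foo"), anon.call("bar"), normal.call("bar"))
}
```

In this model, calling `foo` on the extended object succeeds because its vtable holds a slot that forwards to the original object, while the original object still knows nothing about `bar`.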
I'm also excited about continuing to work on Rust when the summer
is over. By August, I think I'll be in an unusual position as someone
who's familiar with the Rust compiler internals, but who has a
theoretical background and wants to do theoretical/foundational work. One thing I'm wondering about, for instance, is how well Rust's
translation to LLVM preserves the semantics of types. My guess is
that since many Rust types compile to one LLVM type, the translation is not particularly semantics-preserving -- so does a
semantics-preserving encoding exist in the LLVM type system, and if not, how much more
sophisticated would LLVM's type system have to be to make it possible?
(And a semantics for LLVM doesn't exist yet, which by itself is a huge
open problem that I think a lot of
people would like to have a solution for.)
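A small illustration of the "many Rust types, one LLVM type" concern, again in present-day Rust rather than the language as it stands in this post. The types `Meters` and `Seconds` are hypothetical; the assumption is that two such single-field wrappers share the machine representation of the integer they wrap, so the distinction between them exists only in the source-level type system.

```rust
use std::mem::{align_of, size_of};

// Two distinct source-level types wrapping the same integer type.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Meters(pub i64);

#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Seconds(pub i64);

// Both wrappers have the size and alignment of a plain i64.
pub fn same_layout() -> bool {
    size_of::<Meters>() == size_of::<Seconds>()
        && size_of::<Meters>() == size_of::<i64>()
        && align_of::<Meters>() == align_of::<Seconds>()
        && align_of::<Meters>() == align_of::<i64>()
}

// The type checker keeps the two apart: passing a `Seconds` here
// is a compile-time error, even though the layouts agree.
pub fn add_meters(a: Meters, b: Meters) -> Meters {
    Meters(a.0 + b.0)
}
```

Once both wrappers are lowered to the same underlying integer, nothing at that level records that meters and seconds should not be mixed, which is the sense in which the translation may fail to preserve the semantics of types.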
In the more immediate term, though, I'm taking tomorrow off work to compete in the ICFP Programming Contest, which starts today at 5 p.m. local time and goes until 5 p.m. on Sunday. This will be the third year that Alex (oniugnip) and I have competed
as Team K&R2. As soon as our long weekend of hacking is over, Alex is headed off to a conference in Portland. He'll have something like three hours between the time the contest ends and the time his flight leaves, which meets our usual standards for ridiculousness. Who else is playing?