Lindsey Kuper (lindseykuper) wrote,

Recent Rust hacking

(My posts with the tag mozilla, such as this one, are now being syndicated onto Planet Mozilla Research along with those of my colleagues. Yay!)

This week will have been my fourteenth week (!) on the Rust team at Mozilla. Fourteen weeks is about the length of a normal internship, so now seems like a good time to take stock of what I've accomplished so far. When I left off writing about Rust in April, I'd just spent a couple of weeks learning my way around PLT Redex and getting started on a Redex model of Rust. Working in Redex was a lot of fun, but I felt rather disconnected from the rest of the team, who were all hacking on, you know, the actual implementation of Rust. So it was good to get back to working on Rust proper at the end of those two weeks. Of course, Rust is changing so fast that it's easy to lose track of what's going on if you stop paying attention for even a moment, so when I came back to Rust it was another week or so of catching up before I was really ready to start writing code again.

If you're new here, Rust is a new systems programming language pursuing the trifecta of safe, concurrent, and fast. It has an expressive type system, pattern matching, concurrency primitives, and a lightweight, structural object system. My project for the summer is focused on the object system and type system: I'm designing and implementing support for "self" expressions in Rust, which involves having a notion of self-types, or "the type of the current object". Figuring out an appropriate type for the current object is surprisingly subtle, especially in the presence of object extension. For instance, suppose that you add some more methods or fields to an instance of an object, then call a method on the extended object. If the method you call happens to return self, what's the type of that return value? Is it the type of the extended object, or the type of the original one? Ideally, it would be the type of the extended object, which is a more precise type, but some languages, like Java, aren't able to statically determine that the returned value has the more precise type. (Apparently, it's possible to fake that behavior using generics, but it ain't pretty.) It would be great if Rust could do better.
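To make the question concrete, here is a minimal sketch of the outcome we'd like, where a method that returns self keeps the caller's more precise type. It is written in present-day Rust, not the dialect this post describes, and it uses a trait with a `Self` return type instead of the object system discussed here. The names `Chainable`, `Counter`, `LabeledCounter`, and `bump` are all made up for illustration.

```rust
// Modern Rust, NOT the 2011 object system from this post.
// All names here are hypothetical.

/// Any type that can hand itself back to the caller.
trait Chainable: Sized {
    /// `Self` is the type of whatever value the method is called on,
    /// so the result keeps the caller's most precise type.
    fn bump(self) -> Self;
}

struct Counter {
    hits: u32,
}

/// Stands in for "a Counter extended with one more field".
struct LabeledCounter {
    base: Counter,
    label: &'static str,
}

impl Chainable for Counter {
    fn bump(mut self) -> Self {
        self.hits += 1;
        self
    }
}

impl Chainable for LabeledCounter {
    fn bump(mut self) -> Self {
        self.base.hits += 1;
        self
    }
}

fn main() {
    let start = LabeledCounter { base: Counter { hits: 0 }, label: "requests" };
    let c = start.bump().bump();
    // `label` is reachable only because `bump` returned a LabeledCounter,
    // not a plain Counter.
    assert_eq!(c.label, "requests");
    assert_eq!(c.base.hits, 2);
}
```

The sketch sidesteps the hard part, which is getting the same precision when the extension happens on an object instance and not on a declared type.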

Since object extension is one of the things that makes determining the type of self hard, I decided to work on object extension first. This alone has taken a pretty long time. Rust compiles to the LLVM intermediate language, and I was forced to confront my ignorance of how our translation to LLVM works.[1] As it turns out, it's pretty hairy -- I spent at least a couple of the last eight weeks doing nothing but staring at the translation pass of the compiler and adding hundreds of lines of comments to it. But I can finally report that as of yesterday, Rust supports extending an object with new methods! Here's a complete, if rather uninteresting, Rust program that uses the new feature:

use std;

fn main() {

    // An ordinary object with a single method
    obj normal() {
        fn foo() -> int { ret 2; }
    }
    auto my_normal_obj = normal();

    // Extending an object with a new method
    auto my_anon_obj = obj {
        fn bar() -> int {
            ret 3;
        }
        with my_normal_obj
    };

    assert (my_normal_obj.foo() == 2);
    assert (my_anon_obj.bar() == 3);

    // Extending the already-extended object again
    auto another_anon_obj = obj {
        fn baz() -> int {
            ret 4;
        }
        with my_anon_obj
    };

    assert (another_anon_obj.baz() == 4);
}

Here, the expression we're assigning to my_anon_obj is what we're calling an "anonymous object expression" -- it takes my_normal_obj, which is an instance of the object normal, adds a new method to it, and evaluates to a brand-new object. (Note the with my_normal_obj part.) It's not necessary to call any kind of constructor -- the new object simply springs into being. This seems obvious in retrospect, but it caused me a lot of confusion when I was trying to figure out how to translate anonymous object expressions. When normal objects such as normal are translated, the translation is side-effecting: it causes an object constructor function to end up in the generated executable. Translating anonymous object expressions, on the other hand, doesn't do that -- it produces an object "inline", and no constructor is generated. The resulting extended object is still a bona fide object, though, like one you'd get from a constructor, so it can be extended again with more methods, producing another_anon_obj.

There's still a lot of work to do here -- as you can see, I'm not doing anything with self yet. Also conspicuously absent are calls like, say, my_anon_obj.foo(), which should work fine but which don't yet. To do that, I need to put so-called "forwarding slots" in the anonymous object's vtable that forward method calls to the appropriate methods in the object it extended. And we also want to support field extension -- so those things are all on my plate for the rest of the summer. I have nine weeks to go -- we'll see how far I get!
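Roughly, a forwarding slot is a vtable entry that does no work of its own and just re-dispatches to the object that was extended. The sketch below models that by hand in present-day Rust with a struct of function pointers. It is only an illustration of the idea, not how the compiler lays things out, and every name in it (`Normal`, `Anon`, `AnonVtable`, `anon_foo_forward`, and so on) is invented for the example.

```rust
// Modern Rust, hand-rolled vtable. A sketch of the forwarding idea only;
// the real compiler-generated layout is not shown in this post.

/// The original object, with its one method `foo`.
struct Normal;

fn normal_foo(_this: &Normal) -> i32 {
    2
}

/// The extended object keeps the object it extended as an inner object.
struct Anon {
    inner: Normal,
}

/// One slot per method callable on the extended object.
struct AnonVtable {
    foo: fn(&Anon) -> i32, // forwarding slot
    bar: fn(&Anon) -> i32, // method added by the extension
}

/// The newly added method.
fn anon_bar(_this: &Anon) -> i32 {
    3
}

/// The forwarding function: it just calls the inner object's method.
fn anon_foo_forward(this: &Anon) -> i32 {
    normal_foo(&this.inner)
}

static ANON_VTABLE: AnonVtable = AnonVtable {
    foo: anon_foo_forward,
    bar: anon_bar,
};

fn main() {
    let my_anon_obj = Anon { inner: Normal };
    // Calling `foo` on the extended object goes through the forwarding slot.
    assert_eq!((ANON_VTABLE.foo)(&my_anon_obj), 2);
    assert_eq!((ANON_VTABLE.bar)(&my_anon_obj), 3);
}
```

Extending the extended object again would add another layer of the same shape, with forwarding slots for both foo and bar.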

I'm also excited about continuing to work on Rust when the summer is over. By August, I think I'll be in an unusual position as someone who's familiar with the Rust compiler internals, but who has a theoretical background and wants to do theoretical/foundational work. One thing I'm wondering about, for instance, is how well Rust's translation to LLVM preserves the semantics of types. My guess is that since many Rust types compile to one LLVM type, the translation is not particularly semantics-preserving -- so does a semantics-preserving encoding exist in the LLVM type system, and if not, how much more sophisticated would LLVM's type system have to be to make it possible? (And a semantics for LLVM doesn't exist yet, which by itself is a huge open problem that I think a lot of people would like to have a solution for.)
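As a small illustration of "many source types, one low-level type", here is a sketch in present-day Rust (again, not the dialect of this post). It can't inspect LLVM types from inside the program, so it only checks size and alignment, on the assumption that two wrappers with identical layout are indistinguishable once the source-level types are gone. `Meters`, `Seconds`, and `same_layout` are hypothetical names.

```rust
use std::mem::{align_of, size_of};

// Two distinct source-level types wrapping the same integer type.
// `repr(transparent)` guarantees each has the layout of its single field.
#[repr(transparent)]
struct Meters(i64);

#[repr(transparent)]
struct Seconds(i64);

/// True when two types have the same size and alignment.
fn same_layout<A, B>() -> bool {
    size_of::<A>() == size_of::<B>() && align_of::<A>() == align_of::<B>()
}

fn main() {
    let d = Meters(5);
    let t = Seconds(5);

    // The type checker keeps `d` and `t` apart, but their representations
    // match each other and the underlying integer.
    assert!(same_layout::<Meters, Seconds>());
    assert!(same_layout::<Meters, i64>());
    assert_eq!(d.0, t.0);
}
```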

In the more immediate term, though, I'm taking tomorrow off work to compete in the ICFP Programming Contest, which starts today at 5 p.m. local time and goes until 5 p.m. on Sunday. This will be the third year that Alex (oniugnip) and I have competed as Team K&R.[2] As soon as our long weekend of hacking is over, Alex is headed off to a conference in Portland. He'll have something like three hours between the time the contest ends and the time his flight leaves, which meets our usual standards for ridiculousness. Who else is playing?

  1. In fact, I didn't know anything about the LLVM compiler infrastructure, so I started by reading the chapter about LLVM from the recently published Architecture of Open Source Applications book. I highly recommend this book; everything I've read in it has been good, and it's available free online. They're also accepting contributions of more chapters.
  2. "Innovating Ideas. Inspiring Results."
Tags: icfp 2011, mozilla
