Lindsey Kuper (lindseykuper) wrote,
Lindsey Kuper

The Rust object system struggles to its feet (plus, a little brainteaser)

When I last wrote about my work on Rust a month ago, I had just finished implementing some very basic support for extending Rust objects with new methods. In the grand tradition of computer science publishing, the example program I gave was one that very carefully avoided doing any of the things that didn't work yet. The most egregious issue was that if you extended an object with new methods, you could, um, no longer call its original methods. (Really. Well, I had to start somewhere.) Even once that was fixed, a lot of things were wrong. Outer methods couldn't refer to inner ones, since object extension didn't play nicely with self. And method overriding didn't work at all.

In the last month, I've made all of those things work to some extent. For instance, programs like these two from the Rust test suite that exercise the features I just mentioned can now compile, run, and exit normally, having behaved as expected. (And do, on three platforms, whenever someone lands a change to the compiler.) Hooray!

A lot of work remains to be done on the object system, of course, including a pretty huge issue that I've been putting off fixing. Everyone loves a brainteaser, right? Consider the following Rust program:

use std;

fn main() {

    obj cat() {
        fn ack() -> str {
            ret "ack";
        fn meow() -> str {
            ret "meow";
        fn zzz() -> str {
            ret self.meow();

    auto shortcat = cat();

    auto longcat = obj() {
        fn lol() -> str {
            ret "lol";
        fn nyan() -> str {
            ret "nyan";
        with shortcat

    assert (shortcat.zzz() == "meow");
    assert (longcat.zzz() == "meow");

Here we've got an object with three methods, shortcat, that's created by calling the cat() object constructor. When the call to shortcat.zzz() is made, we look up zzz in shortcat's vtable and run the code associated with it, which directs us to look up meow in shortcat's vtable and run the code associated with it. Finally, that code returns the string "meow", and the first assertion succeeds.

Now consider the call to longcat.zzz(). The longcat object was created by extending shortcat with two new methods, lol and nyan. Neither one of the new methods overrides any of shortcat's methods, and neither one should interfere with zzz or meow. So we would expect the call to longcat.zzz() to return "meow", too. Instead, in the current buggy implementation, it returns something else, and the assertion fails.

So, the brainteaser: can you guess what longcat.zzz() returns instead of "meow", and why? (Note that the program compiles with no errors or warnings, doesn't segfault when run, and doesn't raise any errors under Valgrind.)

Scroll down for the answer and an explanation...

This is entirely Paul Stansifer's fault.

First, note that longcat has five entries in its vtable, appearing in alphabetical order: ack, lol, meow, nyan, zzz. Of those, lol and nyan are "normal" entries, and the others are "forwarding functions" that, roughly speaking, forward us along to the appropriate vtable entries from shortcat. (Implementing this forwarding behavior has taken me much of the last month.) So, when we call longcat.zzz(), we get forwarded along to shortcat's vtable entry for zzz, which contains the compiled code for ret self.meow().

This is where the problem arises: the code we're looking at was compiled under the assumption that self is shortcat, since of course that's what it was when shortcat's vtable was created. shortcat's vtable only has three methods, ack, meow, and zzz, in that order. The call to self.meow() therefore compiled to, roughly, "do whatever the second slot in my vtable says to do". But since self is now longcat, the second slot in its vtable is not meow but lol. So longcat.zzz() returns "lol", and the assertion fails. Interestingly, as I mentioned, this code runs cleanly under Valgrind -- no memory corruption or anything like that. Furthermore, the assertion might have accidentally succeeded if I'd chosen names for the methods that alphabetize differently. So this is a truly nasty bug.

To fix the mismatch, we need the self in shortcat.zzz() to have the same type as shortcat does. By "same type", I mean that its vtable must have the same number of methods, with the same names, in the same order, and having the same types as the original. But self is supposed to be longcat at that point, so how are we going to manage to give it the same type as shortcat? The answer, unsurprisingly, is another level of indirection. We give self a vtable that has three backwarding functions, one for each of the original three methods ack, meow, and zzz, which point to the corresponding entries in longcat's vtable. From there, of course, they will forward right back to shortcat's vtable -- unless longcat has overridden one of them, in which case we'd use the overriding entry from longcat. (There's no overriding happening in this example, but we need a solution that would work in case there were.) With luck, I'll land this backwarding behavior in Rust sometime in the next week. Stay tuned for the next exciting episode of Lindsey Implements An Object System!

Tags: mozilla, programming

  • Post a new comment


    Anonymous comments are disabled in this journal

    default userpic

    Your reply will be screened

    Your IP address will be recorded