Reference Lifetimes in Whiley

The concept of lifetimes was pioneered in the Rust programming language, and builds on earlier notions of regions and ownership types. Lifetimes are considered one of Rust’s “most unique and compelling features”.

Recently, the concept of reference lifetimes has been added to Whiley by Sebastian Schweizer (@SebastianS90). In this post, I’m going to try and summarise the basic idea and how it looks in Whiley. To start with, let’s consider the following (broken) program written in C:

int *getPointer() {
  int local = 0;
  return &local;
}

This function returns the address of a local variable, thereby creating a dangling pointer. We say that the returned pointer outlives the lifetime of the data it refers to. The purpose of reference lifetimes is to ensure dangling pointers cannot be created. Or, to put it another way, to ensure memory deallocation can be handled safely without a garbage collector.
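For comparison, here is a rough sketch of how the same mistake plays out in Rust, where lifetimes originated. The function names get_pointer and get_boxed are just for illustration. The broken version is left in a comment because the compiler rejects it; the safe alternative moves the value to the heap and hands ownership to the caller.

```rust
// Hypothetical Rust counterpart of the C function above. This version is
// rejected at compile time, because the returned reference would outlive
// the local variable it points to:
//
//     fn get_pointer<'a>() -> &'a i32 {
//         let local = 0;
//         &local // error: cannot return reference to local variable `local`
//     }

/// Safe alternative: move the value to the heap and give ownership of the
/// allocation to the caller, so the data lives as long as the returned Box.
fn get_boxed() -> Box<i32> {
    let local = 0;
    Box::new(local)
}
```

Calling get_boxed() yields a heap-allocated integer that is freed automatically when the caller drops the Box, with no garbage collector involved.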

Lifetime Syntax

In Whiley, we can’t take the address of a local variable so we cannot exactly recreate the above example. A similar example using the new lifetime syntax would be:

method getReference() -> &int:
   &this:int local = this:new 0
   return local

This defines a simple method which allocates a new integer on the heap and returns a reference to it. As expected, this example now produces a compile-time error. This is because the reference local is declared to have lifetime this. This means the data to which it refers has the same lifetime as the enclosing method and, hence, we cannot return a reference to it.

Previously, Whiley supported references without lifetimes. For example, we would have declared local above to be &int rather than &this:int. Since there is no deallocation primitive in Whiley, this meant that all heap-allocated data had to be garbage collected. With lifetimes we can now avoid garbage collection when we want to (e.g. on an embedded system). However, Whiley still supports references of the form &int. These are now syntactic sugar for &*:int, where * is the “global” lifetime.

Lifetime Parameters

Like Rust, Whiley allows methods to declare lifetime parameters. The following illustrates:

method <l1,l2> swap(&l1:int r1, &l2:int r2):
    int tmp = *r1
    *r1 = *r2
    *r2 = tmp

The above method accepts two references of lifetimes l1 and l2. These lifetimes must be provided as arguments when calling the method, so the body can just assume they exist. For example, we could call the method as follows:

method caller():
    &this:int i1 = this:new 1
    &this:int i2 = this:new 2
    //
    swap<this,this>(i1,i2)

Here, we’ve created two references with lifetime this and passed them into swap(), providing this as the lifetime argument.
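For readers more familiar with Rust, the following is a rough equivalent of swap() and caller(), written as a sketch rather than an exact translation. The lifetime names 'l1 and 'l2 simply mirror the Whiley parameters, and Box stands in for Whiley's heap allocation. One difference worth noting: Rust has no name for the lifetime of the enclosing function body, so there is nothing like swap<this,this> and the compiler works out the lifetime arguments at the call site.

```rust
/// Swap the integers behind two references whose lifetimes may differ.
/// `'l1` and `'l2` mirror the Whiley lifetime parameters `l1` and `l2`.
fn swap<'l1, 'l2>(r1: &'l1 mut i32, r2: &'l2 mut i32) {
    let tmp = *r1;
    *r1 = *r2;
    *r2 = tmp;
}

/// Mirrors the Whiley `caller()`: both allocations live until this
/// function returns, so both borrows share the same (unnamed) lifetime.
fn caller() -> (i32, i32) {
    let mut i1 = Box::new(1);
    let mut i2 = Box::new(2);
    // The lifetime arguments are inferred from the two borrows.
    swap(&mut *i1, &mut *i2);
    (*i1, *i2)
}
```

After the call, the two values have traded places, so caller() returns (2, 1).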

Whiley also now supports the notion of named blocks, which can be used for identifying a subscope within a method.  For example:

method caller():
    &this:int i1 = this:new 1
    inner:
       &inner:int i2 = inner:new 2
       //
       swap<this,inner>(i1,i2)

Here, the inner scope identifies a smaller lifetime than that of the enclosing method. We say that inner is outlived by the lifetime of the method body (this).
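The same nesting can be sketched in Rust with an ordinary block. This is only an approximation: Rust cannot name the lifetime of a block the way Whiley's inner: does, so the scope below is anonymous, and the function name caller_with_inner_scope is made up for the example.

```rust
/// Swap the integers behind two references with independent lifetimes.
fn swap<'l1, 'l2>(r1: &'l1 mut i32, r2: &'l2 mut i32) {
    let tmp = *r1;
    *r1 = *r2;
    *r2 = tmp;
}

/// `i1` lives for the whole function body, `i2` only for the inner block.
fn caller_with_inner_scope() -> (i32, i32) {
    let mut i1 = 1;
    let seen_inner;
    {
        let mut i2 = 2; // shorter lifetime: dropped at the closing brace
        swap(&mut i1, &mut i2);
        // Copy the value out. A reference to `i2` could not escape this
        // block, because `i2` is outlived by the enclosing scope.
        seen_inner = i2;
    }
    (i1, seen_inner)
}
```

The two borrows passed to swap() have different lifetimes, which is why the function takes two lifetime parameters rather than one.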

Lifetime Inference

The Whiley compiler will try to infer lifetime arguments where possible. For example, the above method caller() can be written as:

method caller():
    &this:int i1 = this:new 1
    inner:
       &inner:int i2 = inner:new 2
       //
       swap(i1,i2)

Here, the lifetimes necessary for swap() have been omitted and, instead, are inferred by the compiler from the arguments i1 and i2. In general, lifetime inference works pretty well, although there are cases where ambiguity arises and the compiler cannot infer correct lifetimes. In such cases, it reports an error indicating the ambiguity.
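Rust has a related, though not identical, mechanism. Lifetime arguments at call sites are always inferred, and lifetimes in simple signatures can be left out under its elision rules. When those rules cannot decide where a result borrows from, the compiler reports an error and asks for an explicit annotation, which is similar in spirit to the ambiguity case described above. The function names below are invented for this sketch.

```rust
/// Lifetime elided: with a single reference parameter, the compiler
/// assumes the result borrows from that parameter.
fn first(values: &[i32]) -> &i32 {
    &values[0]
}

/// The same function with the lifetime written out explicitly.
fn first_explicit<'a>(values: &'a [i32]) -> &'a i32 {
    &values[0]
}

/// With two reference parameters the elision rules cannot pick a source
/// for the result, so `fn larger(a: &i32, b: &i32) -> &i32` is rejected
/// and the lifetime has to be named.
fn larger<'a>(a: &'a i32, b: &'a i32) -> &'a i32 {
    if a >= b { a } else { b }
}
```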

Ownership (or not)

In Rust, the concepts of lifetimes, ownership and borrowing are closely tied together and, in fact, can be hard to distinguish. Roughly speaking, ownership ensures there is at most one mutable reference to any data allocated on the heap, whilst borrowing is the mechanism by which temporary mutable and non-mutable references are obtained.
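A small Rust sketch may help separate the ideas. It is illustrative only, and the function name ownership_and_borrowing is made up: the first assignment moves ownership of the vector, and the two blocks take a temporary shared borrow and a temporary mutable borrow in turn.

```rust
/// Returns the vector's length and its first element after mutation.
fn ownership_and_borrowing() -> (usize, i32) {
    let v = vec![1, 2, 3];
    // Ownership moves from `v` to `v2`; using `v` after this line would
    // be a compile-time error.
    let mut v2 = v;

    let len = {
        let shared = &v2; // temporary shared (read-only) borrow
        shared.len()
    };

    {
        let exclusive = &mut v2; // temporary mutable borrow
        exclusive[0] = 10;
    }

    // Both borrows have ended, so the owner can be used directly again.
    (len, v2[0])
}
```

When v2 goes out of scope at the end of the function, the vector's heap storage is freed exactly once.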

The lifetime extension to Whiley does not include the concept of ownership. This is because lifetimes are being used primarily to ensure safe memory deallocation. In the future, we may still introduce ownership into Whiley as, for example, it is important for preventing data races. However, it is likely that ownership in Whiley will be quite different from ownership in Rust. In particular, rather than using a simplistic flow analysis (i.e. as in Rust’s borrow checker), we can use Whiley’s more sophisticated verifier to help enforce ownership.

Conclusion

The introduction of lifetimes in Whiley is a big step in the evolution of the language which, eventually, will pave the way for running Whiley on platforms which don’t support garbage collection (i.e. as native code). And, over the next few months, I’ll be talking more about how this is going to work.

8 comments to Reference Lifetimes in Whiley

  • Great post! This is really interesting. One small nit about Rust:

    > ownership ensures there is at most one mutable reference to any data allocated on the heap,

    This is not _quite_ true. Ownership answers the question “who is responsible for deallocating this resource?” Borrowing makes sure that there is only at most one mutable reference, heap or not. Lifetimes make sure that references are always valid.

  • Hey Steve,

    Hmmm, interesting. Distinguishing these things is what I find a little tricky. For example, my understanding is that “ownership” is the reason that this generates a compile-time error:

    let v = vec![1, 2, 3];
    let v2 = v;
    let x = v[0];
    

    This basically comes from the Rust book here under “Move Semantics” in the chapter on “Ownership”. Specifically, it says this:

    > Rust ensures that there is exactly one binding to any given resource

    I’m interested in what you think here, since I want to get my terminology right. I’m not an expert on Rust, but have been paying some attention to this aspect at least … 🙂

  • Hmmm, so thinking about this more, I think it’s that I’ve overloaded the term “mutable reference”. What I really meant was something more like this:

    Ownership ensures that, under-the-hood, there is only one pointer to the data in question through which you can modify it.

    It’s still not really right, but it’s better I think.

    > Hmmm, so thinking about this more, I think it’s that I’ve overloaded the term “mutable reference”. What I really meant was something more like this:

    Yes, this is a kind of confusing part of Rust. Technically x: &mut T and mut x: T mean different things. If the scope has ownership over a binding of x: T, and there are no previous borrows to x, it should be perfectly safe to mutate x, without ‘unfreezing’ it with a let mut x = x. Immutability is just a programmer aid when it comes to variables owned by the current scope.

    Sorry if I explained that in a confusing way. You might find it helpful to read Niko’s post about the topic, which gave rise to the heated debate we now jokingly refer to as the ‘mutpocalypse’. 🙂

  • Thanks Brendan — I’ll check those links out.

  • Dave,

    So, the way you’ve phrased it and the way I’ve phrased it are similar, but there’s also a difference that’s important. “The right to destroy” _is_ usually about there only being one binding to a resource, because you want to pair exactly one allocate with one free. However, by using unsafe code under the hood, more complex ownership types can be constructed. Consider Rc and Arc: from the “only one binding per resource” front, they make no sense, but from the “right to destroy” perspective, these types ensure that after all bindings that refer to the data are dropped, only one final destruction happens. Does that make sense?

  • kito

    i don’t use rust because of the borrow checker insanity. takes the fun out of programming.

    while lifetimes look interesting, i don’t think i like the proposed syntax. would it be possible to avoid sigils?

  • Hey Steve,

    Back after some zzzzz…

    > Does that make sense?

    Yes, sort of. I guess when I’m thinking about this, I don’t usually think about the edge cases like Cell for example. Anyway, I think it’s been a helpful discussion for me. I’ve been reading the links that Brendan suggested, which have provided me with some more useful background.

    I guess the upshot of this discussion is that with the lifetimes extension Whiley does have ownership in a sense then. That is, the method in which an object is allocated is the owner since it is responsible for deallocation…
