More on Flow-Sensitive Typing

The idea behind flow-sensitive typing in Whiley is to give a statically typed language the look-and-feel of a [[Dynamic programming language|dynamically typed language]] (as much as possible).  The following illustrates:

``` int average([int] items):
v = 0
for i in items:
v = v + items[i]
return v / |items|
```

Here, we see that there are only two type declarations: one for the parameter, and one for the return. Otherwise, it looks much like a dynamically typed language. Variable `v` is declared by assignment to have type `int`. In other words, assigning `0` declares it and gives it the type `int` at that point. We could assign variable `v` other values later on, and it’s type would be automatically updated to reflect this. The following gives an artificial example to illustrate:

``` int f(int x):
y = 1123
// here, y has type int
x = x + y
y = "Hello"
// now, y has type string
return x + |v|
```

A number of interesting issues arise with Whiley’s flow-sensitive typing system, and we’ll look at them in more detail now.

Conditionals

An on obvious question is: what happens when a variable is declared on one side of a conditional? Here’s an example:

```int f(int y):
if y > 0:
x = 123
// following line gives a syntax error
return x + y
```

This code does not compile, since variable `x` may be undefined at the `return` statement.  It gets more interesting if we do this:

```real f(int y):
if y > 0:
x = y
else:
x = 1.23
return x
```

Now, the question is: what type does x have at the return statement? The answer, as expected, is `real` (i.e. because every `int` is also a `real`).

But, wait! What happens if the types assigned on either side are unreleated? Well, in this case, Whiley uses a union type (and see my earlier post for more on this).

Lists

List assignment has some interesting and surprising properties in Whiley.  For example:

```real f([int] ls):
if |ls| > 0:
ls[0] = 1.23
// type of ls is now [real]
...
```

Here, we have a potential assignment from a `real` value to an element of `ls`. Whiley does not complain that this is an error, despite the type of ls (on entry) being `[int]`. Instead, Whiley updates the type of `ls` to reflect this. Thus, since we now have at least one element of type `real`, the whole list is now considered to have type `[real]`.

Records

Things really start to get interesting when we consider records.  For example:

```define Point as {int x, int y}

real f(Point p):
p.x = 1.234
... // type of p now {real x, int y}
```

What we see here is that the type of variable `p` is updated by the assignment to field `x`. The reason for allowing this is to reflect dynamic languages as much as possible. In a dynamic language, such an assignment would be always be permitted and, hence, so it is in Whiley.

Most dynamic languages also have field-creation-by-assignment semantics as well. However, in Whiley, I don’t permit this and, hence, the following is a compile-time error:

```int f(Point p):
p.z = 1
...
```

The reason I don’t allow this, is that I believe it will cause too many subtle errors.  That is, if the programmer meant to assign to variable `x`, but made a typo and assigned variable `z`, then the system would not complain. For me, this would be going too far.

Record Subtyping

In a language like Java, you create data-types using classes.  Each class has a specific name, and classes with different names are distinct.  Inheritance and interfaces can be used to allow some degree of interchangeability, but this remains very static and rigid in nature.  Whiley adopts a much more flexible approach, sometimes known as structural subtyping.  First and foremost, records in Whiley are not associated with names — they are really just macros which expand.  So, for example, this piece of code

```define Point as {int x, int y}

int getX(Point p):
return p.x
```

is essentially identical to this:

```int getX({int x, int y} p):
return p.x
```

In other words, the name `Point` has no semantic meaning in the language. The beauty of this is that we can easily interchange Points with types defined elsewhere, potentially by others. For example, suppose in some other file we have this definition:

```define Point2D as {int x, int y}
```

Well, any method that accepts a `Point2D` will also accept an `Point` instance (and vice versa) — despite the fact that these types may come from files which were developed in isolation. There’s no need to write code which converts between `Point`s and `Point2D`s, etc. Similarly, subtyping of records works as expected:

```define RealPoint2D as {real x, real y}
```

Here, we can pass `Point` instances into functions which accept `RealPoint2D`s (since every `Point` is a `RealPoint2D`). However, we cannot pass `RealPoint2D` instancess into functions expecting Points as, for example, `{x:2.2,y:1.0}` is not an instance of `Point`.

Finally, Whiley also also supports subtyping between records with different numbers of fields:

```define Point3D as {int x, int y, int z}
```

A `Point3D` instance can be passed into a function expecting a `Point` or a `Point2D`, etc. Again, no awkward conversions between different types are necessary, making life just that bit simpler…

Conclusion

I’m a big fan of both dynamic and static typing (for different reasons, obviously).  I see no reason why we can’t get the best of both worlds, and this is what I’m trying to do with Whiley’s type system.  As usual, only time will tell if it really makes sense or not.  For now, I’ll let you decide…

7 comments to More on Flow-Sensitive Typing

• Colin

I’m concerned and curious about your list example, where you say you allow a strong update to the type of a list from [int] to [real]. If the list were only referenced in the current function, you’re fine. But if not, you then have to deal with propagating the update to the caller in your example, and to any number of other aliases that might (now incorrectly) assume the list still has type [int]. You could communicate the argument type change in your example to the caller easily enough by giving the function an appropriate effect type, but you’re still stuck with the aliasing issue. Do you have a solution to that yet?

• Hi Colin … Great question! In a language where lists are objects and can be aliased, you are quite right — this would be a problem. However, in Whiley, this is not the case. Lists are values (like int or real) and cannot be aliased. Assigning an item to a list effectively creates a completely new list with all the same items (except the newly assigned one). I say effectively, because under the hood various optimisations can be used to reduce the overhead of doing this. This is similar to the way lists are implemented in e.g. Clojure.

• Colin

Ah yes, immutable lists would certainly do the trick. Thanks!

• […] Problem on Typing vs Verification By Dave, on November 14th, 2010 Whiley uses a flow-sensitive type system, and a verification engine to enable Extended static checking. My desire in designing these two […]

• My first impression is that this seems to have things in common with the Hindley-Milner type inference used in Scala, for instance. Could you elaborate on the difference between your approach and theirs?
Is it just that this approach strictly infers types in the direction of the control flow?

• Hey Johan,

Yes, it does have some similarity with type inference, as used in e.g. Scala. The main difference is that variables can have different types at different points, and that type tests retype variables. E.g. on the true branch of an “x ~= int” type test, variable x is automatically retyped to be an int. I’ve recently been working on a tech report discussing this all in detail here:

http://www.ecs.vuw.ac.nz/~djp/files/ECSTR10-23.pdf

Cheers!

Dave

• Hi Dave,
That clears it up a bit and gives me something to read, thanks!

Johan