Proposed Syntax Changes for Whiley
Now that I’ve had the chance to write a reasonable amount of code in Whiley, it is time to reflect on some things I don’t like. In particular, there are a number of issues with the syntax which I’d like to improve which I’ll document here.
Function & Method Declarations
Currently, the syntax for a function or method in Whiley looks something like this:
int abs(int x)
// Return value cannot be negative
ensures $ >= 0:
//
if x >= 0:
return x
else:
return -x
Although I quite like this syntax, I very much dislike the use of $
for indicating the return value in a post-condition (i.e. ensures
clause). It occurred to me that explicit naming of the return value would be better (in fact, Dafny does this already). So, my proposed new syntax for function declarations is thus:
function abs(int x) => (int y)
// Return value cannot be negative
ensures y >= 0:
//
if x >= 0:
return x
else:
return -x
Strictly speaking, the function
keyword is not necessary, however I think it adds some clarity. In particular, functions (which are pure) now contrast explicitly with methods (which may have side-effects):
method read(String filename) => ([byte] y)
//
f = File.Reader(filename)
...
Note, the syntax for method
declarations is still not finalised, and I’ve been discussing it at length over in the issue tracker (see #204, #309 and #310).
The final aspect of the proposed syntax change is that you can omit the return value when it’s void
. For example, this:
void ::f(Reader r):
...
becomes this:
method f(Reader r):
...
Although this is hardly a significant issue, is still a nice abbreviation.
Type and Constant Declarations
One of things I really like about the current syntax for Whiley is the use of the define
statement, which is simple and clean. However, it turns out there were some issues I hadn’t considered. The issue revolves around the fact that the define
keyword is overloaded. For example:
define PI as 3.1415926536 // defines a constant
define nat as int where $ >= 0 // defines a type!
nat f():
return Math.round(PI)
Here, we see define
being used for defining a new constant, as well as a new type. This is problematic, since the compiler must figure out which it is. Of course, it can do that but, unfortunately, this leads to difficult to understand error messages. The compiler initially assumes a define
statement defines a type and, if this fails, backtracks and assumes it defines a constant. For example:
define Point as {int x, int y,}
This produces the following cryptic error message:
./test.whiley:1: unrecognised term (int)
define Point as {int x, int y,}
^^^
The programmer was trying to define a new Point
type, but accidentally left a comma at the end. The error message is confusing, because it appears to say the int
type doesn’t exist! Ultimately, the problem is that the compiler initially tried to parse it as a type, but failed. So, it backtracked and tried for a constant which also failed, this time leaving an error message about why {int x, int y,}
is not a well-formed expression.
The proposed syntax for constants and type declarations tries to be consistent with that for functions and methods:
constant PI is 3.1415926536 // defines a constant
type nat is (int x) where x >= 0 // defines a type!
With this syntax, there is no ambiguity between what is supposed to be a constant and what is supposed to be a type. Furthermore, I’ve removed the $
syntax in favour of explicitly named variables.
Local Variable Declarations
One of the unusual aspects of the current syntax is that you cannot declare local variables. Whilst this is nice, it does causes problems. One of these arises in verification, and wasn’t something I initially considered. For example:
define nat as int where $ >= 0
nat sum([nat] xs):
r = 0 // result
i = 0 // index
while i < |xs| where i >= 0 && r >= 0:
r = r + xs[i]
i = i + 1
//
return r
This is a simple example illustrating that summing over a list of natural numbers yields a natural number. The loop invariants are, alas, necessary for the verifier to accept this program. In principle, the compiler could attempt to infer these loop invariants since they are simple.
Another option (which I prefer) is to allow variable declarations:
nat sum([nat] xs):
nat r = 0 // result
nat i = 0 // index
while i < |xs|:
...
Since r
and i
are both declared to be of type nat
, the loop invariants are no longer required since the verifier will enforce the nat
constraints at all program points.
There are still some unresolved questions here. Firstly, I will probably still allow variables to be declared as var
, which means they can take any type at any point. Furthermore, flow typing will still be used to restrict the type of a variable in various ways. For example:
int get(int|null x):
if x is int:
return x // automatically retyped
else:
return 0
And, similarly:
int f():
int|null x = null
... // here, x has type null
x = 1
... // now, x has type int
return x
Finally, an interesting question is whether or not there is any difference between the declaration any x
and var x
. I’ll have to think about this!
References
Currently, reference types are denoted with the ref
, like so:
define Point as {int x, int y}
void ::shift(ref Point p, int amount):
p->x = p->x + amount
Here, method m
accepts a reference to a Point
object. Therefore, by assigning through it, we can produce a side-effect. My proposed syntax change is inspired by the syntax for borrowed pointers in Rust:
type Point is {int x, int y}
method shift(&Point p, int amount):
p->x = p->x + amount
For some reason, I prefer &
over e.g. *
(as in C). Also, it remains an open question as to whether or not to stick with the C-like notation of ->
for dereferencing fields and, likewise, *
for general dereferencing.
Thoughts and comments welcome!!