I was having an interesting discussion with a colleague today about various aspects of Whiley, and we came up with a bit of example code which is something of a puzzler. Consider these two different versions of a function f(any):
Version 1:
int f(any x):
    if x is string:
        return g(x)
    else:
        return g(x)
Version 2:
int f(any x):
    return g(x)
The question is: are the above two examples equivalent or not? (answer below)
Well, the answer is: no, they’re not equivalent. You probably guessed that, but the question is: why not? Well, consider these two definitions for the function g() used in the examples:
int g(any x):
    return 1

int g(string x):
    return 2
In the first example, after the type test x is string, Whiley’s flow-type system automatically retypes variable x to have type string. Therefore, the call g(x) on the true branch is statically dispatched to g(string), whilst on the false branch it dispatches to g(any).
Anyhow, this puzzler is an interesting (and perhaps surprising) artifact of the choice to employ flow typing in Whiley. I wonder what other artifacts we’ll find…
If f didn’t return g(x) but simply modified x, what would x be after the if statement?
Hi Orbitz,
Well, suppose it was like this:
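For instance, a sketch of such a function (the exact code is assumed here) where x is assigned a list on one branch and an int on the other:

```
int f(any x):
    if x is string:
        x = [1]    // x is retyped to [int]
    else:
        x = 1      // x is retyped to int
    // after the if, x has type [int] | int
    return 0
```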
Then, the type of x would be [int] | int — that is, it has the type which is the union of an int and an int list.

Can I do anything with an [int] | int that is different than if the type was still ‘any’?
Well, there are a few things. Firstly, the following is a syntax error:
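One plausible reconstruction (an assumption on my part): a type test which the flow-type system can prove will never succeed, e.g.:

```
int f([int]|int x):
    if x is string:    // error: x can never be a string here
        return 0
    return 1
```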
But, if x had type any then it would be OK. Also, we can write this:
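A sketch of the kind of code meant here (assuming Whiley’s |x| length operator on lists):

```
int f([int]|int x):
    if x is [int]:
        return |x|    // x is retyped to [int]; |x| gives its length
    return x          // x must be an int at this point
```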
If x had type any, then the final return would give a syntax error because it wouldn’t be guaranteed to be an int.

Using ‘if’ seems so weird. Why didn’t you go with a matching syntax instead? match type(x) with String(x) -> blah | _ -> blargh
Hmmm, well it seems natural to me. Like instanceof in Java. But I do want to support pattern matching with the “is” type test operator.
How does ‘is’ give you pattern matching? A match operation gives you a much more complete solution and you don’t even need flow typing, you test the type of something and get the value back. You also get benefits like exhaustiveness and value matching. Not to mention boolean blindness.
Well, I could write e.g.
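A hypothetical sketch of the intended syntax (binding with “is” was not yet implemented, so the exact form is assumed):

```
int f(any x):
    if x is (int y, int z):
        // y and z are bound to the tuple's elements
        return y + z
    return 0
```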
which matches x as a tuple and loads the elements into y and z. I can do a similar thing with e.g. lists to get the head and tail etc. It’s perhaps not as elegant as pattern matching in e.g. Haskell … but it fits with the imperative style I’m going for. I’m not quite sure what boolean blindness is, but I’m guessing it’s that the match doesn’t return a boolean? That doesn’t really make much difference though, as in e.g. Haskell you often have to add an additional if condition anyway…
http://existentialtype.wordpress.com/2011/03/15/boolean-blindness/
I’m now wondering if flow typing makes overloading methods an even worse idea than usual.
For example:
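A sketch of the scenario being described (function names assumed):

```
int g(any x):
    return 1

int g(string x):
    return 2

int h(any x):
    // a refactor later inserts this check upstream:
    if !(x is string):
        return 0
    // x is now retyped to string, so this call silently
    // switches from dispatching to g(any) to g(string)
    return g(x)
```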
If at any point in the upstream evaluation x was proven to be a string, the return type changes. If, during a refactor, a new method is inserted which coincidentally proves x is a string as part of its process, the return type of a method downstream may change. That could be especially problematic with externally maintained code (libraries).
Are return types in Whiley used, in part, for reambiguation of types? This would better enforce contracts and eliminate the library problem I mentioned before.
Also, how are refinement types expected to interact with overloading? Are they disallowed?
One can envisage:
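Something along these lines, perhaps (a sketch using Whiley’s constrained-type syntax):

```
define nat as int where $ >= 0
define pos as int where $ > 0
define even as int where $ % 2 == 0

int f(int x):
    return 0

int f(nat x):
    return 1

int f(pos x):
    return 2

int f(even x):
    return 3
```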
Even before arriving at the potential problems of differing return types, how is the method call determined, given that an input might be int & nat & posint & even? (Compiler intervention might be the only sane response.)
Hey James,
No, it currently only considers parameters — just like Java.
It is plausible that flow typing could make things more brittle to changes, but I’m not sure how much of an issue it really is. Certainly, the following is interesting:
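A sketch of such a chain (function names assumed), where the declared return type of one function determines which overload another call dispatches to:

```
int g(any x):
    return 1

int g(string x):
    return 2

string f(any x):
    return "hello"

int h(any x):
    return g(f(x))    // dispatches to g(string), because f returns string
```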
Then, any change to the return type of f(..) could change the method that g(x) dispatched to. But, on the understanding that the different versions of g(x) are designed to handle the different types, then this makes sense.

Yes, in principle, constrained types do interact in the manner you describe. At the moment, this doesn’t work. But, it’s not that difficult to do as, effectively, it’s all done with syntactic sugar. For example:
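For instance (a sketch, reconstructing the kind of example meant):

```
define pos as int where $ > 0
define neg as int where $ < 0

int f(pos x):
    return 1

int f(neg x):
    return 2
```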
Then, this expands to the following:
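Roughly, the expansion would look like this (the exact handling of the x == 0 case is assumed):

```
int f(int x):
    if x > 0:
        return 1    // body of f(pos)
    else if x < 0:
        return 2    // body of f(neg)
    else:
        fail        // no case applies (hypothetical behaviour)
```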
In fact, I’ll probably split the 1st and 2nd bodies into separate methods (e.g. with a mangle) so that I can dispatch to them directly in cases where we know the parameter is e.g. a pos.

But expanding f to that loses information. Consider if you have your pos/neg f’s and void f(int x) as well. How do you disambiguate that?
Hi Orbitz,
So, one approach is to prioritise overloaded functions by the order in which they are seen. Consider:
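For instance (a sketch, with pos defined as before):

```
int f(int x):
    return 0

int f(pos x):
    return 1
```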
Using the above rule, this expands to:
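Since f(int) is declared first and matches every int, it wins on every input (a sketch of the expansion):

```
int f(int x):
    return 0    // f(int) comes first, so f(pos) is never reached
```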
However, if we’d declared the two functions the other way around then we’d get:
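With f(pos) declared first, the expansion instead becomes (a sketch):

```
int f(int x):
    if x > 0:
        return 1    // f(pos) is tried first
    else:
        return 0    // otherwise fall back to f(int)
```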
Now, I agree that this approach is a little iffy — my goal is only to demonstrate that it can be resolved.
Perhaps the mechanism that I want is to make a requirement that the domains of overloaded functions are mutually exclusive. Using the verifier we could prove that f(int) and f(pos) overlap … hence, that would be a syntax error.
I think that solution still misses the point. If pos, neg, and int are distinct types then it should be an error to call f with an int if it only has pos/neg overloads. Otherwise what is the point in making them types? Also, are (-) and (+) guaranteed to give me the same integer type back? If not, I think that is highly dubious.
edit: should be a compile-time error.
Ah, right. Yes, the verifier will report a syntax error as it will spot that the supplied argument could be zero. E.g. in this case:
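A sketch of code that would trigger the error (assuming pos/neg as before):

```
define pos as int where $ > 0
define neg as int where $ < 0

int f(pos x):
    return 1

int f(neg x):
    return 2

int test(int x):
    return f(x)    // rejected: x could be zero, so neither case applies
```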
I just punched this into my compiler, and this is what it said:
FYI, the compiler is using an SMT based automated theorem prover to check these constraints.
But I can’t guarantee that if f has overloads for types that overlap each other the correct one gets called, correct?
Curious, what does the following code do?
pos x = 10
f(x - 20)
Well, we need a rule to disambiguate. The best one is to require that the verifier proves that types do not overlap; otherwise it reports a syntax error.
So, let’s write it out in full:
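Writing the pos/neg example out in full (a sketch):

```
define pos as int where $ > 0
define neg as int where $ < 0

int f(pos x):
    return 1

int f(neg x):
    return 2

pos x = 10
f(x - 20)    // x - 20 == -10, so this dispatches to the f(neg) case
```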
So, this all looks fine and f(x - 20) will dispatch to the f(neg) case. In essence, this is automatic because of the sugaring … that is, we dispatch to f(int) and then a conditional chooses at runtime which case is called.

My (vaporware) language has a simple requirement: if multiple definitions for a function are given, then they must agree in all places they overlap. So, the example you gave would be a compile error, because string is a subtype of any but g gives different results.
f is then rather pointless, but I suppose you could do other things than call g.
One nice thing this does is make unit testing built-in to the language; just give the test cases as alternative definitions: (I use Haskell syntax because it’s less verbose)
f (a :: Int) = ... -- general function
f 1 = 5 -- test case
Written in Whiley it would look something like this:
int f(int x):
    return ...

int f('1 x):
    return 5
(where ‘1 is the singleton type containing 1)
I suppose you could also introduce an explicit “test” keyword, but these are syntactical issues.
I envisioned it as also allowing multiple general implementations, e.g. a fast (but possibly incorrect) algorithm and a slow, correct algorithm, but proving program equivalence is undecidable so it would be “difficult” to implement (thus why it’s vaporware; I’m still learning a lot about dependent types). A rule system that can break ties would be interesting; but designing that sort of thing is hard, and putting user-defined rules into play doesn’t help. But I’m certain something could be figured out.
Hey Mathnerd,
That sounds … quite tricky to implement? Right, using some kind of verifier as I’m using in Whiley would help … but it would still be a struggle, I think.
I heard of some work which does something similar to that:
http://www.doc.ic.ac.uk/~pcc03/eurosys11klee.pdf
But, I think there has been quite a bit of work on program equivalence.
It’s certainly hard to implement statically. A dynamic implementation that ran all possibilities and then compared them would be easy but slow.