Articles

Whiley Puzzler #1

I was having an interesting discussion with a colleague today about various aspects of Whiley, and we came up with an interesting bit of example code which is something of a puzzler. Consider these two different versions of a function f(any):

Version 1:

int f(any x):
    if x is string:
        return g(x)
    else:
        return g(x)

Version 2:

int f(any x):
    return g(x)

The question is: are the above two examples equivalent or not? (answer below)

Well, the answer is: no, they’re not equivalent. You probably guessed that, but the question is: why not? Well, consider these two definitions for the function g() used in the examples:

int g(any x):
    return 1

int g(string x):
    return 2

In the first example, after the type test x is string Whiley’s flow-type system automatically retypes variable x to have type string. Therefore, the call g(x) on the true branch is statically dispatched to g(string), whilst on the false branch it dispatches to g(any).

Anyhow, this puzzler is an interesting (and perhaps surprising) artifact of the choice to employ flow typing in Whiley. I wonder what other artifacts we’ll find…

23 comments to Whiley Puzzler #1

  • If f didn’t return g(x) but simply modified x, what would x be after the if statement?

  • Hi Orbitz,

    Well, suppose it was like this:

    if x is String:
        x = [1,2,3]
    else:
        x = 1
    // what is x here?
    

    Then, the type of x would be [int] | int — that is, it has the type which is the union of an int and an int list.

  • Can I do anything with an [int] | int that is different than if the type was still ‘any’?

  • Well, there are a few things. Firstly, the following is a syntax error:

    void f( [int]|int x):
        if x is string:
            ...
    

    But, if x had type any then it would be OK. Also, we can write this:

    int f( [int]|int x):
        if x is [int]:
            return |x|
        else:
            return x
    

    If x had type any, then the final return would give a syntax error because it woudn’t guaranteed to be an int.

  • Using ‘if’ seems so weird. Why didn’t you go with a matching syntax instead? match type(x) with String(x) -> blah | _ -> blargh

  • Hmmm, well it seems natural to me. Like instanceof in Java. But I do want to support pattern matching with the “is” type test operator.

  • How does ‘is’ give you pattern matching? A match operation gives you a much more complete solution and you don’t even need flow typing, you test the type of something and get the value back. You also get benefits like exhaustiveness and value matching. Not to mention boolean blindness.

  • Well, I could write e.g.

    if x is (int y, int z):
    

    which matches x as a tuple and loads the elements into y and z. I can do a similar thing with e.g. Lists to get the head and tail etc. it’s perhaps not as elegant as pattern matching e.g. In Haskell … but it fits with the imperative style I’m going for. I’m not quite sure what boolean blindness is, but I’m guessing its that the match doesn’t return a boolean? That doesn’t really make much difference though as in e.g. Haskell you often have to add an additional if condition anyway…

  • JamesG

    I’m now wondering if flow typing makes overloading methods an even worse idea than usual.

    For example:

    int g(any x): return 1
    string g(string x): return “2″

    If at any point in the upstream evaluation x was proven to be a string, the return type changes. If, during a refactor, a new method is inserted which coincidentally proves x is a string as part of its process, the return type of a method downstream may change. That could be especially problematic with externally maintained code (libraries).

    Are return types in Whiley used, in part, for reambiguation of types? This would better enforce contracts and eliminate the library problem I mentioned before.

    Also, how are refinement types expected to interact with overloading? Are they disallowed?

    One can envisage:

    int g(int x): return -1
    int g(nat x): return 0
    int g(posint x): return 1
    int g(even x): return 2
    int g(odd x): return 3

    Even before arriving at the potential problems of differing return types, how is the method call determined, given that an input might be int&nat&posint&even? (Compiler intervention might be the only sane response.)

  • Hey James,

    Are return types in Whiley used, in part, for reambiguation of types? This would better enforce contracts and eliminate the library problem I mentioned before.

    No, it currently only considers parameters — just like Java.

    If, during a refactor, a new method is inserted which coincidentally proves x is a string as part of its process, the return type of a method downstream may change. That could be especially problematic with externally maintained code (libraries).

    It is plausible that flow typing could make things more brittle to changes, but I’m not sure how much of an issue it really is. Certainly, the following is interesting:

    x = f(..)
    g(x)
    

    Then, any change to the return type of f(..) could change the method that g(x) dispatched to. But, on the understanding that the different versions of g(x) are designed to handle the different types, then this makes sense.

    Yes, in principle, constrained types do interact in the manner you describe. At the moment, this doesn’t work. But, it’s not that difficult to do as, effectively, it’s all done with syntactic sugar. For example:

    define pos as int where $ > 0
    define neg as int where $ < 0
    
    void f(neg x):
        ... // 1st body
    
    void f(pos x):
        ... // 2nd body
    

    Then, this expands to the following:

    void f(int x) requires x >0 || x < 0:
        if x > 0:
            ... // 1st Body
        else:
            ... // 2nd Body
    

    In fact, I'll probably split the 1st and 2nd bodies into separate methods (e.g. with a mangle) so that I can dispatch to them directly in cases where we know the parameter is e.g. a pos.

  • But expanding f to that looses information. Consider if you have your pos/neg f’s and void f(int x) as well. How do you disambiguate that?

  • Hi Orbitz,

    Consider if you have your pos/neg f’s and void f(int x) as well. How do you disambiguate that?

    So, one approach is to prioritise overloaded functions by the order in which they are seen. Consider:

    void f(pos x):
        ... // 1st Body
    
    void f(int x):
        ... // 2nd Body
    

    Using the above rule, this expands to:

    void f(int x):
        if x > 0:
            .. // 1st body
        else:
            .. // 2nd body
    

    However, if we’d declared the two functions the other way around then we’d get:

    void f(int x):
         .. // 2nd body
    

    Now, I agree that this approach is a little iffy — my goal is only to demonstrate that it can be resolved.

    Perhaps the mechanism that I want is to make a requirement that the domains of overloaded functions are mutually exclusive. Using the verifier we could prove that f(int) and f(pos) overlap … hence, that would be a syntax error.

  • I think that solution still misses the point. If pos, neg, and int are distinct types then it should be an error to call f if it only as pos/neg overloads with an int. Otherwise what is the point in making them types? Also, are (-) and (+) guaranteed to give me the same integer type back? If not, I think that is highly dubious.

  • edit: should be a compile-time error.

  • be an error to call f if it only has pos/neg overloads with an int

    Ah, right. Yes, the verifier will report a syntax error as it will spot that the supplied argument could be zero. E.g. in this case:

    void f(int x) requires 0 < x || x > 0:
       skip
    
    void g(int x):
        f(x)
    

    I just punched this into my compiler, and this is what it said:

    ./test.whiley:5: precondition not satisfied
        f(x)
        ^^^^
    

    FYI, the compiler is using an SMT based automated theorem prover to check these constraints.

  • But I can’t guarantee that if f has overloads for types that overlap each other the correct one gets called, correct?

  • Curious, what does the following code do?


    pos x = 10
    f(x - 20)

  • But I can’t guarantee that if f has overloads for types that overlap each other the correct one gets called, correct?

    Well, we need a rule to disambiguate. The best one is to require that the verifier to proves that types do not overlap, otherwise it reports a syntax error.

  • Curious, what does the following code do?

    So, let’s write it out in full:

    void g(pos x):
        f(x - 20)
    
    void h():
        g(10)
    

    So, this all looks fine and f(x - 20) will dispatch to the f(neg) case. In essence, this is automatic because of the sugaring … that is, we dispatch to f(int) and then a conditional chooses at runtime which case is called.

  • Mathnerd314

    My (vaporware) language has a simple requirement: if multiple definitions for a function are given, then they must agree in all places they overlap. So, the example you gave would be a compile error, because string is a subtype of any but g gives different results.

    f is then rather pointless, but I suppose you could do other things than call g.

    One nice thing this does is make unit testing built-in to the language; just give the test cases as alternative definitions: (I use Haskell syntax because it’s less verbose)

    f (a :: Int) = ... -- general function
    f 1 = 5 -- test case

    Written in Whiley it would look something like this:

    int f(int x):
    return ...

    int f('1 x):
    return 5

    (where ’1 is the singleton type containing 1)

    I suppose you could also introduce an explicit “test” keyword, but these are syntactical issues.

    I envisioned it as also allowing multiple general implementations, e.g. a fast (but possibly incorrect) algorithm and a slow, correct algorithm, but proving program equivalence is undecidable so it would be “difficult” to implement (thus why it’s vaporware; I’m still learning a lot about dependent types). A rule system that can break ties would be interesting; but designing that sort of thing is hard, and putting user-defined rules into play doesn’t help. But I’m certain something could be figured out.

  • Hey Mathnerd,

    then they must agree in all places they overlap

    That sounds …. quite tricky to implement? Right, using some kind of verifier as I’m using in Whiley would help … but it would still be a struggle I think?

    I heard of some work which something similar to that:

    http://www.doc.ic.ac.uk/~pcc03/eurosys11klee.pdf

    But, I think there has been quite a bit of work on program equivalence.

  • Mathnerd314

    It’s certainly hard to implement statically. A dynamic implementation that ran all possibilities and then compared them would be easy but slow.

Leave a Reply

 

 

 

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>