The State of Whiley
The aim of this post is simply to list the main outstanding issues with the design and implementation of Whiley. This is primarily for my own purposes, in order to help me focus my efforts and to ensure I don’t forget something important.
Syntax
Deciding upon the language syntax is obviously the highest priority for me right now. Over the last six months, a lot of the syntax has been locked down. However, there remains a number of issues (some of which are thorny).
Message Send Syntax. Currently, the syntax for sending messages to actors is not finalised. This post provides all of the interesting details.
Implicit Coercions. These are causing me something of a headache as, in the context of structural subtyping, they’re quite tricky. This post provides some insight into the problem, as does this post.
Default Imports. Currently, like Java, every Whiley source file implicitly imports
whiley.lang.*which (I think) is a bad idea. The reason for this is that it doesn’t make sense to unnecessarily clutter the namespace of every source file. On the other hand, it is annoying to have to include that import explicitly, as most files will probably want it.Imports. Currently, when you import a module, all names from that module are imported directly. For example, suppose a module
Stringcontains a functionindexOf(). The statementimport Stringwill import the nameindexOfdirectly into our namespace. This is convenient, but opens up potential for name clashes. It might be better forimport Stringto import the nameString.indexOfrather than justindexOf(like Python).Encapsulation. Currently, there is no real support for encapsulation modifiers (e.g.
public,private, etc). I plan to supportpublicandprivatein the obvious ways. However, I also want a third option. Consider the definitionpublic define Rec as {int x}. This makes the name and its contents visible to clients. But, I also want to be able to export the name without its contents. The two options I’m considering are: usingprotectedfor this to be used in place ofpublic; or, providing a specialexportdeclaration. The idea of the latter is that, for aprivatetype, we can declare something likeexport Rec— this just makes the nameRecvisible to clients, and nothing else. I’m leaning towardsprotectedas it’s simpler, but it would cause some confusion with the meaning ofprotectedin other languages.Interface Syntax. Currently, there is no way to declare an
interface, such as e.g.Readerin Java. This is a serious deficiency which I will rectify as soon as I figure out the best way of doing it.Bindings. Currently, every function or method compiled to JVM Bytecode has its name mangled … which makes it impossible to call such functions/methods from native Java. I am considering providing a “binding” modifier for methods/functions. This would ensure a version of the method/function existed which was unmangled.
Entry Point. Currently, the entry point for a Whiley program is
void System::main([string] args). This is problematic as it means that, until the main method exits, no actor can send a message to theSystemprocess. One option is to support “headless” methods in Whiley — then, the signature would become e.g.void ::main(``System sys, ``[string] args). In fact, the command-line arguments could be held insysto further simplify this.Currying. Currently, you can take the address of a function (e.g.
&f). However, you cannot curry functions. So, I’d like to support syntax such as e.g.&f(2,_), where the underscore indicates the actual parameter.Actor Fields. Currently, you do not need to provide
thiswhen reading a field fromthisactor (unless a local variable of the same name already exists). However, when assigning fields you do need to providethis(this post discusses why) Likewise, you currently need to providethiswhen making an internal message send (however, I could infer it in most cases). I’m wondering whether or not requiringthisin all cases would be more a uniform approach, and give better readability.String Conversions. Currently, the system does not implicitly convert data types into strings. Instead, you must explicitly call
str(). For example, to print an integeri, you would writeout.println(str(i)). This seems a little cumbersome, and somewhat unnecessary.
Implementation
The current implementation targets the Java Virtual Machine and, for the most part, is in good shape. However, there are a number of outstanding issues.
Type System. My work on formalising Whiley’s type system has thrown up a number of interesting issues. In particular: types are not currently represented in a canonical fashion; they do not support distributivity across records/function types; and, ideally, they would include intersection and negation as first-class components.
Dictionary and String Iteration. You cannot at the moment iterate over dictionaries or strings using the for loop. This is a minor issue to fix.
Break/Continue Statements. Break and continue statements are only semi-working right now. Again, only a minor issue.
Exceptions. At the moment you can
throwexceptions, but you cannotcatchthem! This should be easy to fix. I need to consider whether or not afinallyblock is required.Actors. At the moment, Actors are currently implemented using a single
Threadper actor. This is highly inefficient, and eventually we will support green threads.Native function/method declarations. I will provide a
nativemodifier for functions/methods to indicate they are provided by the system.Whiley Bytecode File Format. There is no binary file format for wyil files. Thus, all signature information is currently stored directly in the Java classfile. Eventually, however, such information will be stored in a binary wyil file (similar, in some ways, to a header file in C). Then, the WHILEYPATH will only be searched for wyil files, not Java classfiles. The primary reason for this is to support compilation to non-JVM targets, such as machine code.****
Record Access Bytecodes. Currently a record access requires a specific type (i.e.
Type.Record) which is generated fromeffectiveRecordType(). This is a problem as the back-end loses knowledge of the actual representation of types. For example,{int x}|{[int] x}is converted byeffectiveRecordType()into{int|[int] x}to be stored in aFieldAccessbytecode. This presents a problem if {int x} is represented differently from{int|[int] x}(which it might be in the case ofintbeing implemented using a nativeint).
And, of course, there are quite a few outstanding bugs yet to be fixed!
Performance
Performance is a significant issue in Whiley at the moment. It’s unlikely that this will change in the near future, however eventually I will address it. There are currently three main techniques that I will use to address this:
Reference Counting. Here, the idea is that every compound structure (e.g. a list) maintains a reference count. Then, when an update is made to a compound structure, the count is checked. If its
> 1the structure is cloned; otherwise, the update occurs in place. See this paper for more on how this might work.Integer Range Analysis. A significant challenge in Whiley (from a performance perspective) is that the
inttype represents unbounded integers which are currently implemented as aBigInteger. However, when possible, we want to use native JVM ints. In order to do this, I propose using an integer range analysis to determine the lower and upper bounds of a given type. This would be performed intra-procedurally and, in conjunction with pre/post-conditions, it should be very effective. See this link for relevant papers.Native Lists/Sets. Currently, lists of e.g. bytes are implemented as e.g.
ArrayList<Byte>. This is not efficient and I want to provide specific list implementations which (in this case) would usebyte[]internally.Representation of Records. Currently records are represented by
HashMap. However, it would be more efficient to represent them asObject[], giving guaranteed constant-time lookup.Tagged Unions. In some cases, a tagged union could speed up runtime type tests.
