The State of Whiley
The aim of this post is simply to list the main outstanding issues with the design and implementation of Whiley. This is primarily for my own purposes, in order to help me focus my efforts and to ensure I don’t forget something important.
Syntax
Deciding upon the language syntax is obviously the highest priority for me right now. Over the last six months, a lot of the syntax has been locked down. However, there remains a number of issues (some of which are thorny).
Message Send Syntax. Currently, the syntax for sending messages to actors is not finalised. This post provides all of the interesting details.
Implicit Coercions. These are causing me something of a headache as, in the context of structural subtyping, they’re quite tricky. This post provides some insight into the problem, as does this post.
Default Imports. Currently, like Java, every Whiley source file implicitly imports
whiley.lang.*
which (I think) is a bad idea. The reason for this is that it doesn’t make sense to unnecessarily clutter the namespace of every source file. On the other hand, it is annoying to have to include that import explicitly, as most files will probably want it.Imports. Currently, when you import a module, all names from that module are imported directly. For example, suppose a module
String
contains a functionindexOf()
. The statementimport String
will import the nameindexOf
directly into our namespace. This is convenient, but opens up potential for name clashes. It might be better forimport String
to import the nameString.indexOf
rather than justindexOf
(like Python).Encapsulation. Currently, there is no real support for encapsulation modifiers (e.g.
public
,private
, etc). I plan to supportpublic
andprivate
in the obvious ways. However, I also want a third option. Consider the definitionpublic define Rec as {int x}
. This makes the name and its contents visible to clients. But, I also want to be able to export the name without its contents. The two options I’m considering are: usingprotected
for this to be used in place ofpublic
; or, providing a specialexport
declaration. The idea of the latter is that, for aprivate
type, we can declare something likeexport Rec
— this just makes the nameRec
visible to clients, and nothing else. I’m leaning towardsprotected
as it’s simpler, but it would cause some confusion with the meaning ofprotected
in other languages.Interface Syntax. Currently, there is no way to declare an
interface
, such as e.g.Reader
in Java. This is a serious deficiency which I will rectify as soon as I figure out the best way of doing it.Bindings. Currently, every function or method compiled to JVM Bytecode has its name mangled … which makes it impossible to call such functions/methods from native Java. I am considering providing a “binding” modifier for methods/functions. This would ensure a version of the method/function existed which was unmangled.
Entry Point. Currently, the entry point for a Whiley program is
void System::main([string] args)
. This is problematic as it means that, until the main method exits, no actor can send a message to theSystem
process. One option is to support “headless” methods in Whiley — then, the signature would become e.g.void ::main(``System sys, ``[string] args)
. In fact, the command-line arguments could be held insys
to further simplify this.Currying. Currently, you can take the address of a function (e.g.
&f
). However, you cannot curry functions. So, I’d like to support syntax such as e.g.&f(2,_)
, where the underscore indicates the actual parameter.Actor Fields. Currently, you do not need to provide
this
when reading a field fromthis
actor (unless a local variable of the same name already exists). However, when assigning fields you do need to providethis
(this post discusses why) Likewise, you currently need to providethis
when making an internal message send (however, I could infer it in most cases). I’m wondering whether or not requiringthis
in all cases would be more a uniform approach, and give better readability.String Conversions. Currently, the system does not implicitly convert data types into strings. Instead, you must explicitly call
str()
. For example, to print an integeri
, you would writeout.println(str(i))
. This seems a little cumbersome, and somewhat unnecessary.
Implementation
The current implementation targets the Java Virtual Machine and, for the most part, is in good shape. However, there are a number of outstanding issues.
Type System. My work on formalising Whiley’s type system has thrown up a number of interesting issues. In particular: types are not currently represented in a canonical fashion; they do not support distributivity across records/function types; and, ideally, they would include intersection and negation as first-class components.
Dictionary and String Iteration. You cannot at the moment iterate over dictionaries or strings using the for loop. This is a minor issue to fix.
Break/Continue Statements. Break and continue statements are only semi-working right now. Again, only a minor issue.
Exceptions. At the moment you can
throw
exceptions, but you cannotcatch
them! This should be easy to fix. I need to consider whether or not afinally
block is required.Actors. At the moment, Actors are currently implemented using a single
Thread
per actor. This is highly inefficient, and eventually we will support green threads.Native function/method declarations. I will provide a
native
modifier for functions/methods to indicate they are provided by the system.Whiley Bytecode File Format. There is no binary file format for wyil files. Thus, all signature information is currently stored directly in the Java classfile. Eventually, however, such information will be stored in a binary wyil file (similar, in some ways, to a header file in C). Then, the WHILEYPATH will only be searched for wyil files, not Java classfiles. The primary reason for this is to support compilation to non-JVM targets, such as machine code.****
Record Access Bytecodes. Currently a record access requires a specific type (i.e.
Type.Record
) which is generated fromeffectiveRecordType()
. This is a problem as the back-end loses knowledge of the actual representation of types. For example,{int x}|{[int] x}
is converted byeffectiveRecordType()
into{int|[int] x}
to be stored in aFieldAccess
bytecode. This presents a problem if {int x} is represented differently from{int|[int] x}
(which it might be in the case ofint
being implemented using a nativeint
).
And, of course, there are quite a few outstanding bugs yet to be fixed!
Performance
Performance is a significant issue in Whiley at the moment. It’s unlikely that this will change in the near future, however eventually I will address it. There are currently three main techniques that I will use to address this:
Reference Counting. Here, the idea is that every compound structure (e.g. a list) maintains a reference count. Then, when an update is made to a compound structure, the count is checked. If its
> 1
the structure is cloned; otherwise, the update occurs in place. See this paper for more on how this might work.Integer Range Analysis. A significant challenge in Whiley (from a performance perspective) is that the
int
type represents unbounded integers which are currently implemented as aBigInteger
. However, when possible, we want to use native JVM ints. In order to do this, I propose using an integer range analysis to determine the lower and upper bounds of a given type. This would be performed intra-procedurally and, in conjunction with pre/post-conditions, it should be very effective. See this link for relevant papers.Native Lists/Sets. Currently, lists of e.g. bytes are implemented as e.g.
ArrayList<Byte>
. This is not efficient and I want to provide specific list implementations which (in this case) would usebyte[]
internally.Representation of Records. Currently records are represented by
HashMap
. However, it would be more efficient to represent them asObject[]
, giving guaranteed constant-time lookup.Tagged Unions. In some cases, a tagged union could speed up runtime type tests.