Whiley is a hybrid object-oriented and functional programming language. Whiley employs extended static checking to eliminate errors at compile time, including divide-by-zero, array out-of-bounds and null dereference errors. Extended static checking is made possible through the use of an automated theorem prover. Whiley compiles to the Java Virtual Machine and is fully inter-operable with existing Java applications.

A simple Whiley program illustrates the main ideas:

function abs(int x) -> (int r)
ensures r >= 0
ensures r == x || r == -x:
    if x >= 0:
       return x
       return -x

The above function has a post-condition requirement that the return value cannot be negative, and matches either the parameter x or its negation. The Whiley compiler would reject at compile-time any implementation which did not meet this specification.


Numerous important software systems have failed due to program bugs. Historic examples include the Therac-25 disaster where a computer-operated X-ray machine gave lethal doses to patients, the 1988 worm which wreaked havoc on the internet by exploiting a buffer overrun, and the (unmanned) Ariane 5 rocket which exploded shortly after launch because of an integer overflow (see this video and this list for more).

To address this, Prof. Sir Tony Hoare (ACM Turing Award Winner, FRS) recently proposed the creation of a verifying compiler as a grand challenge for computer science. A verifying compiler:

“uses mathematical and logical reasoning to check the correctness of the programs that it compiles”.

Whiley is an attempt to tackle the verifying compiler challenge. Whiley was developed from scratch because existing languages are already extremely complex, and verifying their programs automatically seems intractable. Indeed, while attempts to construct verifying compilers for existing languages (e.g. Spec# and ESC/Java) have had some success, they remain a long way from completion.


Modern programming languages, such as Java and C#, eliminate fairly simple classes of error (so-called type errors) through the process of type checking; however, they cannot detect more complex problems, such as the potential for a divide-by-zero error. Type checking has been used for a long time in programming languages, historical examples of which include: ALGOL, Pascal, Modula-2, C, C++, Java, C# and more.

Numerous attempts have been made to improve upon the type-checking paradigm. Perhaps most notable was the Extended Static Checking system for Modula-3 (ESC/Modula-3), developed originally at the Compaq Systems Research Center. By 1996 this had reached the point where it became usable in practice. By that time, however, use of Modula-3 was already in significant decline. So, the ESC/Java project was begun in an effort to attract the growing number of Java programmers. The critical component underlying both of these systems was the Simplify theorem prover. This was perhaps one of the first example of an efficient, practical SMT Solver. ESC/Java was described in a seminal paper, and has been used for a variety of real-world tasks, including the Gemplus Smart Card project. However, despite these successes there remains significant issues with the tool. In particular, it is unsound in the presence of multithreading, ignores modulo arithmetic and makes simplifying assumptions regarding loops, aliasing and object initialisation.

Around the same time that ESC/Java was becoming a reality a related project, called the Java Modeling Language (JML), was born. The aim here was to provide a standardised language for writing method pre- and post-conditions, class invariants and more. The advantage of this approach is that it allows a multitude of different tools to be brought to bear on a project. For example, the jmlrac tool does not check conditions at compile time, but instead converts them into runtime checks. Whilst this is not as desirable as performing checks at compile time, it does provide a useful intermediate position.

More recently, Microsoft has begun the Spec# project. This was influenced by JML and Eiffel, and provides a formal language for API contracts. Spec# is primarily intended to be used as a vehicle for research in the area of verifying compilers, and builds on Boogie and the Z3 SMT solver. At the time of writing, the released version only supported runtime checking of pre- and post-conditions although compile checking was performed for some specific cases (e.g. non-null types).

The notion of design-by-contract, where programmers define formal and precise specifications for their software components, was originally coined by Bertrand Meyer for the Eiffel language. Eiffel was probably the first example of a widely-used language that supported pre- and post-conditions. However, in this case, the contracts were always checked at runtime (although efforts have been made to extend this to compile-time checking). Perhaps one key issue here, was the lack of quantifiers in the language definition. Nevertheless, Eiffel remains an excellent example of what can be achieved in this area, and it is still supported by Eiffel Software and used in numerous industrial applications.

Finally, there are several other notable languages in this area. SPARK/Ada has been used by Altran Praxis to develop both commercial and military aircraft software; and, the X10 lanugage, developed by IBM, supports dependent types which allow fine-grained constraints on program variables.

Language Design

The design of Whiley attempts to resolve many outstanding issues related to the verification of object-oriented languages. In particular, the following design choices are seen as critical to its success:

  • Traditional object-oriented systems do not distinguish values from objects, which presents a significant challenge for software verification. For this reason, Whiley adopts a hybrid approach which distinguishes functions from methods. A function can only accept values as parameters and also returns a value; this ensures that it is side-effect free. Whiley includes standard primitive types, including int and boolean; however, Whiley also provides primitive sets, lists, maps and tuples. Thus, functions can perform complex computations, and can also be used safely within pre- and post-conditions.
  • Modular arithmetic has proved problematic for software verifiers and, hence, Whiley uses unbounded arithmetic. That is, a variable declared as int is not limited to a specific range of values as determined, for example, by 32-bit twos-complement representation. Thus, in theory at least, an int can take on any possible integer value (although there remain some limitations imposed by the amount of available RAM).
  • Object aliasing has been a difficult issue for program analysis and verification, and a significant amount of research exists on the problem of pointer analysis. Whilst solutions to this problem have proven invaluable for reasoning about traditional languages, such as Java, they suffer from an inherent lack of modularity. Whiley works around this issue by significantly limiting the situations where aliasing can arise; in particular, whilst aliasing can arise within a method, it can never arise within a function. Methods (aka messages) are used to change the state of an object, whilst functions simply compute information. In the Whiley model, the number of objects within a running system is significantly fewer than in a language like Java (where almost everything is an object).
  • Many existing languages, including JML, permit pre- and post-conditions which are non-computable. The most common situation where this arises is in conjunction with quantifiers. For example, one might have a condition such as this:
    forall x,y,z:int [!exists n:int [n>2 && x^n=y^n+z^n]]

    The astute reader will recognise this as Fermat’s Last Theorem which, indeed, has been proved correct. One cannot expect automated theorem proving technology to reason correctly about such conditions! Whiley addresses this by enforcing its prime directive: every condition must be convertable into a runtime check and, hence, must be computable. Thus, Whiley employs a form of three-valued logic when checking conditions: definitely correct, definitely incorrect and unknown. The first indicates there is definitely no error at that program point; the second indicates there definitely is an error at that program point; whilst the third indicates Whiley could not definitely determine an answer. When this latter state arises, Whiley inserts a runtime check at that program point and produces a warning to indicate this.

These decisions have had a significant impact upon the design of the Whiley language. To learn more about the language, check out the documentation.