As the Whiley compiler continues to evolve, certain aspects of its architecture are really starting to mature. One of the more recent pieces to take shape is the verification pipeline. This is the process by which a Whiley file is converted into a series of verification conditions, which are then checked by the automated theorem prover.
A high level overview of the architecture is the following:
Here, we see some of the main components in the Whiley compiler:
- The Whiley Compiler (WyC) — The compiler front-end, responsible for parsing and type checking whiley
sourcefiles and generating (binary)
- The Whiley Intermediate Language (WyIL) — A register-based intermediate language, similar in some ways to JVM bytecode. This essentially provides a simplified form of a
whileysource file which is more manageable, and where many intricate aspects are already completed (e.g. name resolution, typing, desugaring, etc).
- The Whiley Constraint Solver (WyCS) — The automated theorem prover responsible for checking whether a given formula in first-order logic is satisfiable or not. These verification conditions are represented as (binary)
wycsfiles. A human-readable file format is provided in the form of
wyalfiles, which can be “compiled” to
In this post, I’m just going to focus on the flow from
whiley source files to the
wycs files which are passed to the theorem prover. One of the really nice aspects of the compiler architecture is that we can see these files in a human-readable form at each step.
Compiling Whiley to WyIL
The first step of the process is to translate
whiley source files into binary
wyil files, which are analogous to JVM
class files. Here’s our simple Whiley function which we’re going to compile and verify:
int abs(int x) ensures $ >= 0: if x >= 0: return x else: return -x
This defines the absolute function which returns the positive image of any integer value. The
ensures clause defines a postcondition which states that the value returned (given by
$) is non-negative. If we compile this function into the WyIL form, we’ll get something like this:
int abs(int): ensures: const %3 = 0 : int assertge %0, %3 "postcondition not satisfied" : int body: const %2 = 0 : int iflt %0, %2 goto label0 : int return %0 : int .label0 neg %5 = %0 : int return %5 : int
As we can see, the Whiley Intermediate Language is a register-based bytecode with unstructured control-flow. Here, registers are prefixed with
const bytecode loads a constant value into a register; the
iflt bytecode branches to a label if its first operand is less than its second; the
neg bytecode negates its operand and assigns to a given register; finally, the
return bytecode returns its operand.
There are several advantages of the WyIL form over Whiley source code: firstly, all names have been resolved using their context (i.e. combination of enclosing function and imports); secondly, all types have been propagated through expressions (and checked); thirdly, the number of constructs at the WyIL level is far fewer than at the source level (e.g. conditional branches and goto bytecodes can encode a wide range of source-level control-flow constructs). This latter point is particularly important, as it helps decouple the language syntax from the compiler back-ends and other optimisation passes (i.e. changes to the syntax do not usually require changes to the WyIL).
Compiling WyIL to WyAL
The Whiley Assertion Language (WyAL) provides a useful interface between the WyIL bytecode programs and the automated theorem prover (WyCS). The
wyal source files provide a human-readable dialect of first-order logic with various additional operations (e.g. for arithmetic, sets, lists, etc). As an example, consider the following WyAL code:
assert: forall(int x): if: x > 0 then: x >= 0
Here, we see a very simple logical constraint which asserts that
x > 0 implies
x >= 0 (which we know is true). Putting this assertion through the constraint solver will not produce any assertion errors (indicating the theorem prover believes the assertion holds). The WyAL language bears some resemblances to the input languages for Z3 (i.e. SMT-LIB) and Yices.
To verify a Whiley source file is correct, it is first translated into a WyIL file, then into a WyAL file which, finally, is verified as correct (or not) by the WyCS theorem prover. Each Whiley function can generate multiple assertions (a.k.a verification conditions) in the generated WyAL file. For our running
abs(int) example two assertions are generated: one for the execution path going down the
true branch; and the other going down the
false branch. In fact, our simple WyAL assertion above represents the verification condition for the
true branch of our running example.
Hopefully this article has given a little insight into the mechanism for verifying a Whiley source file is correct. Of course, a lot of questions are left unanswered and, unfortunately, many aspects of the mechanism remain undocumented (for now). For example, the WyAL language is completely undocumented at this stage. However, over the coming weeks, months and years I hope to document more of the compiler and how it works.