Articles

  • No categories

Modular Purity Analysis for Java

Well, after an agonizing wait I finally heard the news that my paper on purity analysis was accepted to the Conference on Compiler Construction, 2011.  Obviously, I’m stoked!  The paper is:

  • JPure: a Modular Purity System for Java. David J. Pearce. In Proceedings of the Conference on Compiler Construction, 2011. [PDF]

A [[Pure function|pure function]] is one which has no observable [[Side effect (computer science)|side-effects]].  For example, it cannot modify any state that existed prior to the function being called.  Similarly, it may not read or write from I/O.  There are lots of advantages from knowing which functions are pure.  For example, it allows the compiler (or bytecode optimiser) to perform certain optimisations that would otherwise be considered unsafe (see e.g. this, this and this).  Purity information can also help with automatic parallelisation (e.g. this and this), and software verification (e.g. this and this).

Right, so pure functions are useful … but, how do they work?  A simple way of including purity in a language like Java is through annotations.  Here, we annotate methods with @Pure to signal they are pure functions.  Then, we want a type checker for the annotations, to ensure we use them correctly.  Typically this would operate by examining the body of every method marked @Pure, and checking:

  1. For every invocation, the static type of the method being invoked is marked @Pure.
  2. The method makes no field assignments.
  3. If the method overrides a method annotated @Pure, then it is also annotated @Pure.

These rules ensure that the method has no side-effects, provided the @Pure annotations on other methods are themselves correct.  However, these rules are extremely restrictive.  Consider a very simple Java method:

class Example {
 private ArrayList<String> items = ...;

 boolean contains(String item) {
    for(String s : items) {
       if(s.equals(item)) { return true; }
    }
   return false;
}}

This method cannot be annotated @Pure, even though it clearly is a pure function.  Why?  Well, because it uses an Iterator object to traverse items, and Iterator.next() cannot be annotated @Pure (since implementations of Iterator must be able update their internal state here).

However, whilst the contains() method above is not considered pure under rules (1-3), it does not break our original requirements for purity (given at the top of this article).  That’s because contains() doesn’t modify any state that existed before the method was called — only state created during its execution.  All we need to do is extend our rules for checking purity to allow for this …

… which is exactly what my paper is all about.  But, I won’t spoil it for you here … it’s all in the paper!!

8 comments to Modular Purity Analysis for Java

  • Congratulations Dave! – very nice stuff.

  • Jens

    Dosen’t you definition of purity miss a fundamental part? The result of a pure function can not depent on any mutable global state, so it can not read any mutable fields.

  • Hi Jens,

    The result of a pure function can not depent on any mutable global state, so it can not read any mutable fields.

    Yes and no. Yes, a pure function cannot depend on any mutable global state. But, yes, it can read mutable fields — provided it knows the object in question is not global state. Instead, the object must be freshly allocated within the function.

  • Jens

    Hm. But your rules above does not mention anything about this? How can they ensure that the method is pure? (you say that they ensure that the method is side-effect free, mayby in missunderstand)

  • Hey Jens,

    Well, I’ve really glossed over the details of how it works above. The system uses static analysis to check purity within a method, given that the programmer has annotated methods with @Pure, and also identified @Local state. The paper has much more details — I’d recommend looking at that. In the paper, I also discuss a mechanism for inferring the annotations directly from source code — which works pretty well. The idea is that to annotate a legacy code base, you’d start with the automatic inference. Then, you’d review what it produced and possibly tweak it. Finally, when you changed the code base, you’d then update the annotations by hand.

  • UPDATE — I fixed the broken link to the paper now.

  • Theodore R

    This is excellent work. I have always wondered about this, and asked about this in SO, and they pointed me in this direction. http://stackoverflow.com/questions/38672241/marking-methods-which-do-not-read-write-instance-state/38672836#38672836

    This is such an important feature, that it probably should be a core Java language feature. Thanks so much for the work. Hope it gets into the language at some point.

    Might you be looking at anything like hierarchical package privacy? I would love it if 90% of the classes inside com.xyz.orders are visible only to classes in sub-packages. For example com.xyz.orders.bo would be able to see com.xyz.orders.dao. But, all the classes in com.xyz.orders.dao would be inaccessible to classes not under com.xyz.orders.

  • Theodore R

    There is a related question which I posted here, although it is not that very clear: http://stackoverflow.com/questions/23995683/improvement-of-package-private-classes-in-java

Leave a Reply

 

 

 

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>