Friday, 18 January 2008

Supercompiltation

I've lately been working on my disertation concering a program transformation technique known as Distillation. Distillation is a technique inspired by supercompilation.

Supercompilation involves mapping terms from a particular language to other terms of the same language. It can be though of as a semantics preserving rewriting system, or a function which maps terms to other terms which are equal modulo some semantic interpretation relation. A really strait-forward description of supercompilation is given by XXX on his paper on supercompilation for Haskell. This work leverages work by Gluck and Sorenson but presents generalisation in a very clear and declarative way, which is a notable exception to most of the literature in this field. Generalisation is actually quite a tricky problem. Generalisation has some of the same problems as the old problem with standards. That is, the thing about standards is that there are so many to chose from. It takes quite a bit of effort to come up with not only a reasonable formulation of a generalisation. But one that has the characteristics that you want. One that doesn't exhibit over-generalisation.

Supercompilation is a particular meta-compilation technique. In fact there are loads of meta-compilation techniques and they can be quite powerful. Partial-evaluation is the most well known technique and one of the simplest. Deforestation is another well known technique. Geoff Hamilton described a much more sophisticated technique known as Distillation.

Turchin noted that supercompilation had the capacity to prove conjectures through the transformation of semantically equivalent programs. Basically by reducing applications of predicates to their truth value. More sophisticated techniques of program transformation can lead to even more sophisticated automated solutions to conjectures.

In my research I've decided to leverage the automated proof capacity of the particular metacompilation strategy "distillation" to solve problems in the domain of reactive systems. Reactive systems are particularly important since they represent a class of systems that is immanently useful, yet much less studied. They are programs which must respond to stimulae. They basically encompass all programs which deal with actions provided by an external environment.

The particular objective is to be able to make specifications in some temporal logic (posibly LTL initially, but maybe the mu-calculus depending on circumstances) and prove correctness of the specifications using program transformation in an automated fashion. Some work on this has already been done. Most notably by Leuschel and XXX (paper on crypto). There are a number of obvious sticking points that I've run into.

The representation of runs of a machine is crucial to the notion of verification using some temporal logic. For LTL you need only have a stream of possible runs. Because the specifications that will be given to the compiler will represent programs with a precondition, that act as state tranformers given some particular satisfaction, it is easiest to encode this as a state transformer for each operation that satisfies the precondition. That is, the "stream" of values becomes a "tree" of values where each possible output is contingent on the satisfaction of some precondition predicate.

This representation mirrors the types of proofs that are done over inductive predicates in a language such as coq for temporal properties, and this proof representation was inspirational.

The tree of possible values contingent on the state and predicate over states is a coinductive function. It presents an infinite number of possible states over which proofs must function. This should present no difficulties in a system which incorporates inductive and coinductive types.

The difficulties in the representation become clear when one tries to prove safety properties over this representation. Safety properties over a coinductively defined instruction sequence are necessarily coinductive. Supposing we create a coinductive list-type, such that we have a cons constructor, in addition to nil. Now if we have the function:

from x = x:(from (S x))

If (S x) is the successor function, we have a program that computes the largest set of members of the co-lists starting from x. Supposing we have a predicate All : (P : Set) -> co-list -> bool, which returns true when ever P is true on every element of co-list. This naturally corresponds to a safety property. We can now ask things like:

All (x > 3) (from 4)

That statement requires an infinite proof! We need to unfold "from" an infinite number of times. All needs unfolding an infinite number of times as well. Fortunately we can procede using coinduction.