Skip to main content


I've lately been working on my disertation concering a program transformation technique known as Distillation. Distillation is a technique inspired by supercompilation.

Supercompilation involves mapping terms from a particular language to other terms of the same language. It can be though of as a semantics preserving rewriting system, or a function which maps terms to other terms which are equal modulo some semantic interpretation relation. A really strait-forward description of supercompilation is given by XXX on his paper on supercompilation for Haskell. This work leverages work by Gluck and Sorenson but presents generalisation in a very clear and declarative way, which is a notable exception to most of the literature in this field. Generalisation is actually quite a tricky problem. Generalisation has some of the same problems as the old problem with standards. That is, the thing about standards is that there are so many to chose from. It takes quite a bit of effort to come up with not only a reasonable formulation of a generalisation. But one that has the characteristics that you want. One that doesn't exhibit over-generalisation.

Supercompilation is a particular meta-compilation technique. In fact there are loads of meta-compilation techniques and they can be quite powerful. Partial-evaluation is the most well known technique and one of the simplest. Deforestation is another well known technique. Geoff Hamilton described a much more sophisticated technique known as Distillation.

Turchin noted that supercompilation had the capacity to prove conjectures through the transformation of semantically equivalent programs. Basically by reducing applications of predicates to their truth value. More sophisticated techniques of program transformation can lead to even more sophisticated automated solutions to conjectures.

In my research I've decided to leverage the automated proof capacity of the particular metacompilation strategy "distillation" to solve problems in the domain of reactive systems. Reactive systems are particularly important since they represent a class of systems that is immanently useful, yet much less studied. They are programs which must respond to stimulae. They basically encompass all programs which deal with actions provided by an external environment.

The particular objective is to be able to make specifications in some temporal logic (posibly LTL initially, but maybe the mu-calculus depending on circumstances) and prove correctness of the specifications using program transformation in an automated fashion. Some work on this has already been done. Most notably by Leuschel and XXX (paper on crypto). There are a number of obvious sticking points that I've run into.

The representation of runs of a machine is crucial to the notion of verification using some temporal logic. For LTL you need only have a stream of possible runs. Because the specifications that will be given to the compiler will represent programs with a precondition, that act as state tranformers given some particular satisfaction, it is easiest to encode this as a state transformer for each operation that satisfies the precondition. That is, the "stream" of values becomes a "tree" of values where each possible output is contingent on the satisfaction of some precondition predicate.

This representation mirrors the types of proofs that are done over inductive predicates in a language such as coq for temporal properties, and this proof representation was inspirational.

The tree of possible values contingent on the state and predicate over states is a coinductive function. It presents an infinite number of possible states over which proofs must function. This should present no difficulties in a system which incorporates inductive and coinductive types.

The difficulties in the representation become clear when one tries to prove safety properties over this representation. Safety properties over a coinductively defined instruction sequence are necessarily coinductive. Supposing we create a coinductive list-type, such that we have a cons constructor, in addition to nil. Now if we have the function:

from x = x:(from (S x))

If (S x) is the successor function, we have a program that computes the largest set of members of the co-lists starting from x. Supposing we have a predicate All : (P : Set) -> co-list -> bool, which returns true when ever P is true on every element of co-list. This naturally corresponds to a safety property. We can now ask things like:

All (x > 3) (from 4)

That statement requires an infinite proof! We need to unfold "from" an infinite number of times. All needs unfolding an infinite number of times as well. Fortunately we can procede using coinduction.


Popular posts from this blog

Decidable Equality in Agda

So I've been playing with typing various things in System-F which previously I had left with auxiliary well-formedness conditions. This includes substitutions and contexts, both of which are interesting to have well typed versions of. Since I've been learning Agda, it seemed sensible to carry out this work in that language, as there is nothing like a problem to help you learn a language.

In the course of proving properties, I ran into the age old problem of showing that equivalence is decidable between two objects. In this particular case, I need to be able to show the decidability of equality over types in System F in order to have formation rules for variable contexts. We'd like a context Γ to have (x:A) only if (x:B) does not occur in Γ when (A ≠ B). For us to have statements about whether two types are equal or not, we're going to need to be able to decide if that's true using a terminating procedure.

And so we arrive at our story. In Coq, equality is som…

Formalisation of Tables in a Dependent Language

I've had an idea kicking about in my head for a while of making query plans explicit in SQL in such a way that one can be assured that the query plan corresponds to the SQL statement desired. The idea is something like a Curry-Howard in a relational setting. One could infer the plan from the SQL, the SQL from the plan, or do a sort of "type-checking" to make sure that the plan corresponds to the SQL.

The devil is always in the details however. When I started looking at the primitives that I would need, it turns out that the low level table joining operations are actually not that far from primitive SQL statement themselves. I decided to go ahead and formalise some of what would be necessary in Agda in order get a better feel for the types of objects I would need and the laws which would be required to demonstrate that a plan corresponded with a statement.

Dependent types are very powerful and give you plenty of rope to hang yourself. It's always something of…

Plotkin, the LGG and the MGU

Legend has it that a million years ago Plotkin was talking to his professor Popplestone, who said that unification (finding the most general unifier or the MGU) might have an interesting dual, and that Plotkin should find it. It turns out that the dual *is* interesting and it is known as the Least General Generalisation (LGG). Plotkin apparently described both the LGG for terms, and for clauses. I say apparently because I can't find his paper on-line.

The LGG for clauses is more complicated so we'll get back to it after we look at the LGG of terms. We can see how the MGU is related to the LGG by looking at a couple of examples and the above image. We use the prolog convention that function symbols start with lower case, and variables start with uppercase. The image above is organised as a DAG (Directed Acyclic Graph). DAGs are a very important structure in mathematics since DAGs are lattices.

Essentially what we have done is drawn an (incomplete) Hasse diagram f…