
The type says everything

I've mentioned a number of times that, given a sufficiently rich type theory, we can use a function's type as its specification, so that any function of that type is correct by construction. I'd like to give a simple example of how this works in the Coq proof assistant.

The following code is an illustration of how the type theory of Coq allows you to fully specify the behaviour of functions. Kragen Sitaker commented that there aren't many reasonable interpretations of a function with the type usually given to the assoc function:

A -> list (A * B) -> option B 

That is, a function that takes an element of type A and a list of pairs of A and B, and returns either a B or a failure. That got me wondering how hard it would be to tighten the noose so that there is only one way to interpret the type: there may still be many implementations that satisfy the type, but all of them will yield the same result.
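To see how loose the plain type is, here is a minimal sketch of three functions that all have the type A -> list (A * B) -> option B. The names lookup, always_none and first_value are hypothetical and only used for illustration; only the first one behaves like an assoc.

```coq
Require Import List.

Section WeakType.
  Variables A B : Set.
  Variable eq_dec : forall x y : A, {x = y} + {x <> y}.

  (* The intended behaviour: the value of the first pair keyed by x. *)
  Fixpoint lookup (x : A) (l : list (A * B)) : option B :=
    match l with
    | nil => None
    | (x', y) :: t => if eq_dec x x' then Some y else lookup x t
    end.

  (* Same type, useless behaviour: always reports failure. *)
  Definition always_none (x : A) (l : list (A * B)) : option B := None.

  (* Same type, wrong behaviour: ignores the key entirely. *)
  Definition first_value (x : A) (l : list (A * B)) : option B :=
    match l with
    | nil => None
    | (_, y) :: _ => Some y
    end.
End WeakType.
```

The type checker accepts all three, so the type alone cannot tell them apart.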

The following code illustrates that this is indeed possible to do in Coq.

Require Import List. 
Parameter A : Set.
Parameter B : Set. 
Definition assoc_list := list (A * B)%type. 
Parameter eq_dec : forall (x:A) (y:A),{x=y}+{x<>y}.

Theorem assoc : 
  forall (x:A) (l: assoc_list), 
    {y | In (x,y) l} + {forall (y:B), ~In (x,y) l}.
Proof.
  refine
    (fix assoc (x:A) (l:assoc_list) {struct l} 
      : {y | In (x,y) l} + {forall (y:B), ~In (x,y) l}
      := match l 
           return {y | In (x,y) l} + {forall (y:B), ~In (x,y) l}
         with
           | nil => inright _ (fun y => (fun p : In (x,y) nil => _))
           | ((x',y)::t) => 
             match eq_dec x x' with
               | left lft => inleft _ (exist _ y _)
               | right rgt => match assoc x t with 
                                | inleft (exist y p) => inleft _ (exist _ y _)
                                | inright inrgt => inright _ _
                              end
             end
         end) ; clear assoc ; eauto ; subst. constructor. reflexivity.
  firstorder. intros. intro. inversion H ; clear H. firstorder. congruence.
  apply (inrgt y0). assumption.
Defined.

This isn't the most beautiful code, but what it lacks in beauty of implementation it makes up for in ease of reading. You only need to understand a couple of lines, the statement of the theorem, to be assured that this is in fact a correct implementation.

Theorem assoc : 
  forall (x:A) (l: assoc_list), 
    {y | In (x,y) l} + {forall (y:B), ~In (x,y) l}.

The type says that the function takes an "x" of type "A" and a list of pairs, and supplies a value "y" of type "B" together with a proof that y came from a pair in which the x provided occurs *AND* that this pair exists in the list l. If the function can't find a pair from which to supply a y, it must instead supply a proof that no such pair exists in the list.

First, a bit of notation so you can read this theorem properly:

{ y | P y }

Can be read: "there exists a y such that the property P holds of y".

The notation:

A + {B}

Means that we must supply an A or a proof of B.
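Both notations can be tried out on something smaller than assoc. The following is a minimal sketch; the names a_three and pred_or_zero are hypothetical and not part of the development above.

```coq
(* {y | P y} is notation for the standard library type sig:
   a witness y packaged with a proof of P y. *)
Definition a_three : {n : nat | n = 3}.
Proof. exists 3. reflexivity. Defined.

(* A + {B} is notation for the standard library type sumor:
   either a value of type A (inleft) or a proof of B (inright). *)
Definition pred_or_zero (n : nat) : {m : nat | n = S m} + {n = 0}.
Proof.
  destruct n as [| m].
  - right. reflexivity.           (* n = 0: supply the proof *)
  - left. exists m. reflexivity.  (* n = S m: supply m and its proof *)
Defined.

(* Show the underlying inductive definitions. *)
Print sig.
Print sumor.
```

In pred_or_zero the caller gets either a predecessor that is guaranteed to be one, or a proof that there is none, which is the same shape as the type of assoc.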

The "In" predicate specifies membership in a list and is provided by the Coq List library, where it is defined by recursion over the list.
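As a small illustration, assuming only the standard List library, membership in a concrete list unfolds to a chain of equalities. The example names below are hypothetical.

```coq
Require Import List.

(* In a l computes to a disjunction of equalities, ending in False. *)
Example in_unfolds :
  In 2 (1 :: 2 :: nil) <-> (1 = 2 \/ 2 = 2 \/ False).
Proof. simpl. tauto. Qed.

(* Proving membership of a pair means pointing at its position. *)
Example pair_is_member :
  In (2, true) ((1, false) :: (2, true) :: nil).
Proof. simpl. right. left. reflexivity. Qed.
```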

With a little practice reading types, we can see that the two lines are correct. We can then run Coq on this code to check that the proof is correct and we can be assured of the correctness of the implementation WITHOUT EVER READING THE CODE! This means we have reduced the problem of correctness to two simple lines of completely declarative statements.

This function is more complicated than a normal assoc function because it carries all of these proof terms along with it. However, thanks to the clever people involved in writing Coq, we are actually mixing two different type universes: one (called Set) for values and one (called Prop) for proofs. Coq uses this distinction to do something very clever. The {y|P y} existential type and the A+{B} type are designed such that the "P y" proof and the {B} proof disappear when we extract (compile) our code. As an illustration of this amazing fact, look at the following OCaml code, which has been automatically extracted from our definition:

let rec assoc x = function
  | Nil -> Inright
  | Cons (p, t) ->
      let Pair (x', y) = p in
      (match eq_dec x x' with
         | Left -> Inleft y
         | Right -> assoc x t)

Voila! Not only are the proof terms gone, but this is pretty much the code you probably would have written yourself. Notice that "Inright" is a constructor with no information (other than failure), basically implementing the "Nothing" of a Maybe type, and the "Inleft" as the "Just" of a Maybe type.
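The correspondence with a Maybe type can also be written down inside Coq. This is a minimal sketch; forget_proofs is a hypothetical helper, not something from the Coq library.

```coq
(* Drop both proofs and keep only the computational content:
   inleft becomes Some, inright becomes None. *)
Definition forget_proofs {A : Type} {P : A -> Prop} {Q : Prop}
  (r : {y : A | P y} + {Q}) : option A :=
  match r with
  | inleft s => Some (proj1_sig s)  (* keep the witness, discard its proof *)
  | inright _ => None               (* discard the proof of failure *)
  end.
```

Applied to the result of assoc, a function like this would give back the original option B type, at the price of losing the guarantees.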

We have extracted a provably terminating, well specified, totally correct function that runs in OCaml! And to top it off, we can also extract to Haskell!!

assoc :: A -> Assoc_list -> Sumor B
assoc x l =
  case l of
    Nil -> Inright
    Cons p t ->
      (case p of
         Pair x' y ->
           (case eq_dec x x' of
              Left -> Inleft y
              Right -> assoc x t))
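Extractions like the two above are requested with Coq's extraction commands. Here is a minimal sketch on a smaller, hypothetical function (head_or_empty). As far as I know, the Require Extraction line is needed in recent Coq releases, while older ones made the commands available without it.

```coq
Require Import List.
Require Extraction.  (* assumption: a recent Coq release *)

(* Returns the head of the list with a proof that it is a member,
   or a proof that the list is empty. *)
Definition head_or_empty (l : list nat) : {x : nat | In x l} + {l = nil}.
Proof.
  destruct l as [| x t].
  - right. reflexivity.
  - left. exists x. simpl. left. reflexivity.
Defined.

(* OCaml is the default target; this also prints the types it depends on. *)
Recursive Extraction head_or_empty.

(* Switch the target language and extract again. *)
Extraction Language Haskell.
Extraction head_or_empty.
```

The proofs built with reflexivity live in Prop, so they should not appear in either output.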

This isn't without some cost. There is a huge learning curve with Coq and I'm by no means an expert. The proofs can be arduous, and although this one only took me a couple of minutes, it took far longer than it would have taken to implement the function directly in OCaml. However, as I gain experience with Coq, I increasingly believe the approach will scale better than traditional approaches to software design.

In addition to these caveats, there is more than one function that satisfies this type. Namely, when the list holds several pairs with the same key, there is an assoc which returns the last matching pair instead of the first. We could get rid of this problem by specifying which of the elements should be returned, by returning all of them, or by restricting the formation of association lists so that they are mappings (each key occurs at most once). The latter is very elegant, but somewhat more involved than what we have done.
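Two of these fixes can at least be stated as types. This is a sketch under assumptions: first_match_spec and is_mapping are hypothetical names, and no function of the stronger type is constructed here.

```coq
Require Import List.

Section TighterSpecs.
  Variables A B : Set.

  (* Pin down WHICH pair is returned: y belongs to the first pair keyed
     by x, because no pair keyed by x occurs in the prefix l1. *)
  Definition first_match_spec (x : A) (l : list (A * B)) : Type :=
    {y : B | exists l1 l2 : list (A * B),
               l = l1 ++ (x, y) :: l2 /\ forall y' : B, ~ In (x, y') l1}
    + {forall y : B, ~ In (x, y) l}.

  (* Alternatively, restrict the input: an association list is a mapping
     when its keys contain no duplicates. *)
  Definition is_mapping (l : list (A * B)) : Prop :=
    NoDup (map fst l).
End TighterSpecs.
```

With first_match_spec as the result type, an implementation that returns the last matching pair would no longer type check on lists with duplicate keys.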

Functional programming reduces the difficulty of writing correct software by reducing the ways in which bugs can occur. However, most functional programming languages are unable to express complex constraints like "this is a sorted list" or "implements an assoc function" in the language itself. When other functions rely on such unchecked behaviour, all kinds of funny things can happen.
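A constraint like "this is a sorted list" can be written in Coq as a type. Here is a minimal sketch, assuming the Sorted predicate from the standard Sorting library; sorted_list, smallest and one_two_three are hypothetical names.

```coq
Require Import List Sorted.

(* A list of naturals bundled with a proof that it is sorted by <=. *)
Definition sorted_list : Type := {l : list nat | Sorted le l}.

(* A consumer that can only be called with evidence of sortedness. *)
Definition smallest (s : sorted_list) : option nat :=
  hd_error (proj1_sig s).

(* Building a value means discharging the proof obligation. *)
Definition one_two_three : sorted_list.
Proof.
  unfold sorted_list.
  exists (1 :: 2 :: 3 :: nil).
  repeat constructor.
Defined.
```

A caller holding an unsorted list cannot build a sorted_list, so a function like smallest never sees one.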

I recently had a nasty non-termination bug in SML because a function *required* that the input list be sorted; otherwise the function wouldn't terminate. My sort routine made use of a bogus less-than-or-equal predicate over a data-type, which I had neglected to write properly. This caused hours of very difficult debugging.

I'm acting on the thesis that I can avoid these problems and I am rewriting a large program in Coq as an experiment. I'll let you know how it goes.

