
More Questions than Answers

Code transformation, or meta-compilation as it is sometimes called (the general notion covering techniques such as Partial Evaluation, Supercompilation, Deforestation, and my advisor's Distillation), is a powerful technique in computer programming. Its benefits (and drawbacks) have almost certainly not been studied sufficiently.

I was just talking with my room-mate Tom about meta-compilation, and I suggested that meta-compilers are somewhat like the lathe. A huge number of technologies require a lathe in order to be produced efficiently. A lathe can be viewed as a major nexus in the dependency graph of machining technology. A lathe is an almost completely fixed precondition for the mill, and the mill is the crux of modern machining: it allows you to construct almost any currently available machined part. Without the mill we really wouldn't have the industrial age at all. Do such things exist in computer programming?

Metacompiler technology is incredibly powerful. It is usually considered to be a superset of partial evaluation. It is a compiler technique that starts in the source language and ends in the source language, rather than in some target language as a standard compiler does. While this might at first sound trivial or irrelevant, a few examples can convince one that it is actually a very useful tool. (2*2) can be coded in most languages, but really it is just the literal 4. Partial evaluation will reduce this computation at compile time, eliminating its cost from the final executable. The power doesn't stop there though. One particularly convincing example that I found was the partial evaluation of a fairly simple grammar recogniser (parser), which reduced the problem directly from an NDFA to a DFA. This is basically the compilation process used for regexps.
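To make the (2*2) example concrete, here is a minimal sketch of a source-to-source partial evaluator in Python. The tuple-based expression language and the name `peval` are my own assumptions for illustration, not taken from any real system. The point is only that the output is an expression in the same language as the input, with everything computable from the known data already computed.

```python
# Expressions are plain tuples (a hypothetical mini-language for illustration):
#   ("num", 3)   ("var", "x")   ("add", e1, e2)   ("mul", e1, e2)

def peval(expr, env):
    """Partially evaluate expr given the statically known variables in env.

    The result is again an expression in the same language
    (the "residual" program).
    """
    tag = expr[0]
    if tag == "num":
        return expr
    if tag == "var":
        name = expr[1]
        # Known variables become literals; unknown ones stay in the residual.
        return ("num", env[name]) if name in env else expr
    if tag in ("add", "mul"):
        left = peval(expr[1], env)
        right = peval(expr[2], env)
        if left[0] == "num" and right[0] == "num":
            # Both operands are known, so do the arithmetic at "compile time".
            value = left[1] + right[1] if tag == "add" else left[1] * right[1]
            return ("num", value)
        # Otherwise rebuild the node around the simplified operands.
        return (tag, left, right)
    raise ValueError(f"unknown expression: {expr!r}")


two_times_two = ("mul", ("num", 2), ("num", 2))
print(peval(two_times_two, {}))  # ('num', 4)
```

With an unknown variable in the mix, `peval(("add", ("var", "x"), two_times_two), {})` leaves `x` alone and folds the rest, giving `("add", ("var", "x"), ("num", 4))`. Real partial evaluators and supercompilers do far more than this (unfolding function calls, specialising loops), but the shape is the same.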

The Futamura projections give us some idea of just how powerful the technique is. If we have a metacompiler, we can metacompile an interpreter with respect to a program written in the source language of the interpreter to arrive at an executable in the language of the metacompiler. In fact, if we metacompile the metacompiler itself with respect to the interpreter, we generate a compiler!
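Here is a rough sketch of what the first projection produces, in Python. Everything in it is hypothetical: the one-register language, `interp`, and `specialize_interp` are made up for illustration. Note also that `specialize_interp` is written by hand for this one interpreter, whereas a real metacompiler would derive the same residual program automatically from the text of `interp`.

```python
# A hypothetical one-register language: a program is a list of (op, constant).

def interp(program, x):
    """Plain interpreter: walks the program every time it is run."""
    for op, k in program:
        if op == "add":
            x = x + k
        elif op == "mul":
            x = x * k
        else:
            raise ValueError(f"unknown op: {op}")
    return x


def specialize_interp(program):
    """Specialise interp to a known program, emitting Python source text.

    The loop and the dispatch on op depend only on the program, so they are
    executed here; only the arithmetic on the unknown input x is left behind.
    (Hand-written stand-in for what a metacompiler would do to interp.)
    """
    lines = ["def residual(x):"]
    for op, k in program:
        if op == "add":
            lines.append(f"    x = x + {k!r}")
        elif op == "mul":
            lines.append(f"    x = x * {k!r}")
        else:
            raise ValueError(f"unknown op: {op}")
    lines.append("    return x")
    return "\n".join(lines)


program = [("add", 1), ("mul", 3)]
source = specialize_interp(program)

namespace = {}
exec(source, namespace)            # load the residual program
residual = namespace["residual"]

# The residual program agrees with the interpreter, minus the interpretive overhead.
assert residual(4) == interp(program, 4) == 15
```

The residual source has no loop and no dispatch on `op` left in it, just two assignments and a return. Notice too that `specialize_interp` has exactly the shape of a compiler (program in, executable out); the second projection says this function is what you would get for free by specialising the metacompiler to the interpreter.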

So I have a *lot* of questions about metacompilation. It sounds almost too good to be true (but there are good reasons to believe that it isn't). Some of the questions are very technical, and I will probably save those for tomorrow's post. The following question, though, is more philosophical and practical (can those two happen at the same time?).

Why aren't supercompilers/partial evaluators used as general compilation systems? If you can write a supercompiler in some nice high-level language like OCaml, and then all you have to do to produce a compiler is write an interpreter for your language of choice, why isn't this done?

This seems like the holy grail of leverage, or code re-use. You could write one really good compiler for a good language for specifying languages (which is what ML, the ancestor of OCaml, was originally designed for), and one really good metacompiler. At that point, supporting any other language (a front end, in the terminology of GCC) is simply a matter of writing an interpreter. Writing an interpreter is *radically* simpler than making a sophisticated compiler. It is basically equivalent to writing a specification for the language. The process of language design can hardly be facilitated more than this, since an interpreter is pretty much the minimal requirement for specifying the operational semantics of a language!

My question is: why isn't this general procedure really carried out in practice? Are metacompilers not good enough in practice to produce high-quality, performant programs? Has it just not been tried? If not, I'd like to see some effort expended on this, since it seems like a crucial technology that could be leveraged far more than any of the "shared VM" projects, like the one behind C#, at minimal cost to language implementors.

