Tuesday, 20 March 2018

Generating etags automatically when needed

Have you ever wanted M-. (the Emacs command which finds the definition of the term under the cursor) to just "do the right thing" and go to the most current definition site, but were in a language that didn't have an inferior process set up to query about source locations correctly (as is done in Lisp, OCaml and some other languages with sophisticated Emacs interfaces)?

Well, fret no more. Here is an approach that will let you save the appropriate files and regenerate your TAGS file automatically when things change, ensuring that M-. takes you to the appropriate place.

You will have to reset the tags-table-list, or set it when you first use M-., and you'll want to change the language given to find and etags in the create-prolog-tags function (as you're probably not using Prolog), but otherwise it shouldn't require much customisation.

And finally, you will need to run etags once manually, or run 'M-x create-prolog-tags' in order to get the initial TAGS file so that the code knows where you think the appropriate tags file should be.
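
If you would rather build that initial TAGS file by hand, the pipeline that create-prolog-tags shells out to can be run directly from the project root (assuming your sources end in .pl, as in the code below):

find . -type f -name "*.pl" | etags -l prolog -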

;; Automatic Etags for prolog
;; Placeholder path: point this at wherever your TAGS file should live.
(setq tags-table-list '("~/my-project/TAGS"))

(defun create-prolog-tags (dir-name)
  "Create prolog tags file."
  (interactive "DDirectory: ")
  (shell-command
   (format "cd %s; find %s -type f -name \"*.pl\" | etags -l prolog -"
           dir-name dir-name)))

(defadvice xref--find-definitions (around refresh-etags activate)
  "Rerun etags and reload tags if tag not found and redo find-tag.
If buffer is modified, ask about save before running etags."
  (let ((extension (file-name-extension (buffer-file-name))))
    (condition-case err
        ad-do-it
      (error (and (buffer-modified-p)
                  (not (ding))
                  (y-or-n-p "Buffer is modified, save it? ")
                  (save-buffer))
             (er-refresh-etags extension)
             ad-do-it))))

(defun er-refresh-etags (&optional extension)
  "Run etags on all peer files in current dir and reload them silently."
  (let ((my-tags-file (locate-dominating-file default-directory "TAGS")))
    (when my-tags-file
      (message "Loading tags file: %s" my-tags-file)
      (let ((tags-dir (file-name-directory my-tags-file)))
        (create-prolog-tags tags-dir)
        (visit-tags-table my-tags-file)))))

Friday, 28 April 2017


I was ironing my shirt today, which I almost never do. Because of this I don't have an ironing board, so I tried to make a makeshift ironing board on my floor using a towel and some books. I grabbed the heaviest books on the nearest shelf, which happened to be Organic Chemistry, Stalingrad and an annotated study bible containing the old and new testament.

As I pulled out the bible, a flower fell out which had been there for over 17 years now. I know that because it was put there by my first wife, Daniel, who killed herself in April about 17 years ago.

It fell from Thessalonians, to which it had been partially opened. Highlighted was the passage: "Ye are all sons of the light and sons of the day."

I guess the passage gave her solace. Daniel was a complicated woman. She had serious mental health issues which plagued her for her entire life. None of them were her fault. She was dealt an absolutely awful hand in life, some truly nasty cards.

She had some considerable cards to play despite this. She was fiercely intelligent and stunningly beautiful. She didn't always play her cards well though, and I think it is because she didn't want to, as she felt like she should be made to suffer.

I loved her tremendously even if she was hard to love. She didn't love herself and didn't take very kindly to others doing it. If she thought you might love her she'd test you, and the tests could be pretty awful.

Despite this, she loved others as intensely as she tested them, and sometimes she could be a truly beautiful and generous person. She was a bundle of contradictions.

She was incredibly intense. She was hot as hell fire, a searing intelligence and a fierce biting anger.

I remember vividly the day she died. She was working as tech support in a university computer lab. That day she had brought a liter of Vodka in to work in a plastic water bottle and I guess had nearly passed out from drinking. They checked the bottle and found it full of alcohol. They promptly fired her on the spot.

I came home from my after-school job to find my friend there, and her in a serious state of what seemed to be drunkenness, anger and irritation. She told me she was fired and was screaming at me, I think, that I should send my friend home. I ignored her.

It was a terribly hot day in Albuquerque, the kind where the sun just will not leave you be, and leaving did not appeal. I was huddled in my living room on the shabby couch playing guitar while my friend used my computer, both of us trying to avoid the sunlight and pretending not to hear Daniel's protestations.

She said, "Fuck it! I'll just kill myself!" I looked up and shouted, "Do it then and stop screaming at me!". She went directly into the next room and shot herself in the head.

The shot fired and I shot up off the couch, threw the guitar and screamed at the top of my lungs a scream of terror. I ran into the bathroom, and then into the bedroom where she was and saw her with half her head missing. I screamed again several times and then began running around the house turning over tables and trying to turn a bookshelf over on myself with my friend trying to restrain me. He eventually dragged me physically from the house, slapping me repeatedly saying that I couldn't destroy the house or I'd be likely to be charged with murder. The whole thing made no sense.

I spent the next day in a stupor. I think I wasn't even drinking, I was just in shock. I couldn't stay in the home where she had been shot, so I went to my father's. I remember doing group theory homework. My dad kept asking me if she was involved with drugs, which I just ignored. I could hear my dad and her father trying to make sense of it by recourse to the causes: the wrong friends or drugs or drug dealing or alcoholism or something.

I couldn't stand their talk. Alcohol was involved. Maybe drugs were. Those were just symptoms anyhow, not causes. I left for a friend's house, where I stayed playing video games and drinking for days until my mother came to collect me and we flew home to Alaska for the funeral.

The funeral was a disaster. Her mother turned it into the most absurd evangelical Baptist affair imaginable with the most reactionary pastor in Anchorage. People asked me if I wanted to push for something different. I just ignored it. Her mother could have what she wanted. And her father, an atheist, I had no idea what to do for him given that he'd lost his only child.

Daniel sometimes believed in god and when she did she hated him for what he allowed to happen in the world. And I really mean hate with a visceral screaming bitter hatred, and by the look of it, it was more extreme than anything I've ever felt. Sometimes she didn't believe in anything, and I think this made her feel better. And sometimes she believed that there must be some kind of way out, a loophole, and maybe she could find it in the bible.

People would say things to me. Things that I guess were meant to help, but one thing I heard a number of times really stuck with me. "That was a really terrible thing she did to you Gavin" and "That was so selfish". It was a surprisingly common condolence.

I'll be honest, it made me want to beat the shit out of them. How the fuck was anyone to say something like that. Nobody knows what she was going through. It was worse than I can imagine. I've felt like offing myself before, but I know I've never had the protracted intense debilitating pain that she experienced.

It took me years to talk about it. I spent years being really reckless and drinking far too much. The first few years were filled with me starting unprovoked fights in bars or places I can't even remember I was so drunk. And I only talked about it while drunk. Over the last 10 years or so I've been able to say things about it to people sober. First occasionally, then a bit more casually.

I guess there should be some sort of moral. I don't know what it is. I don't know what I should have done to help her, probably something. Probably others should have and didn't either. And probably there is something which should be said with how to cope with suicide. If there is, I have no idea what it is. It does seem to get easier with time.

This is the first time I've written down the experience. I'm glad she put that flower there.

We'll miss you teagrey.

Saturday, 2 July 2016

Some thoughts on the meaning of equality and cyclic proof - can CTT help where ITT hinders?

The CTT (Computational Type Theory) approach to type theory, as represented in NuPRL and described in [1][2][3], provides a contrasting methodology to what appears to be in vogue at the moment - an approach known as ITT (Intensional Type Theory) as represented in modern type-theory based proof assistants such as Coq and Agda.

There is a very good exposition by Jon Sterling which describes the more fundamental philosophical differences here [4]. It's somewhat difficult to convey the subtleties without such a complete exposition, but since it's precisely this that I want to talk about, I'll try to recapitulate the essential differences.

When we talk about the elements of a type in ITT, we give the process of their construction in such a way that the elements which inhabit the type are made entirely explicit in the definition of the type.  So, for instance, in defining the data type for the natural numbers we give zero and succ explicitly as the elements of this type.  The type relation (n : ℕ) is then not an abstract description of what would fit into this type, but a description of the syntax which allows valid terms of the type to be formed.

data ℕ : Set where
  zero : ℕ
  succ : ℕ → ℕ

However, there is more to the story as well. As I mentioned in a previous blogpost here, the meanings of computation and equality in ITT are intertwined and somewhat hidden from view. We say in ITT something along the lines of: "we mean for two things to be equal when they are the same thing, and the same thing is understood to mean identification of proof-trees under the reduction relation".

But why do we allow identification of proofs according to this reduction and not others, and how do we know which reductions to accept and under what circumstances? And where exactly do we define the way that the natural numbers are treated in terms of their behaviour? It turns out that the elimination rules are derived from the above datatype automatically, and then wrapped up nicely for us behind the scenes.

The approach in CTT does these two things very differently. When defining the type ℕ, we instead describe what would constitute evidence that something is a natural number, and what would constitute evidence of equality between two natural numbers. Here, we have to explicitly describe the big-step evaluation judgement which acts on our elements, so that we know what constitutes a canonical element of the type.

There are a number of ramifications of this approach. One outcome which is of particular interest to me (due to my current work on RDF, which I'll describe more later) is that subtyping can be much more direct, as data can be a canonical element of more than one type. In ITT this is impossible, as the syntax itself describes what constitutes a type.

Circularity and Identity

Which leads me to something which has made me uncomfortable for quite a long time regarding what constitutes a proof. Syntax in ITT is designed (especially after elaboration) to be the proof. There is virtually no difference between the proof-tree and a fully elaborated proof term, and if there is, it's only up to some decidable procedure which fills in gaps the designer of the syntax didn't think were worth bothering with, as they could be elaborated just-in-time. Identification of proofs is done under the reduction relation. Unlike in CTT, two proofs can have precisely the same abstract shape and yet be utterly distinct due to naming (of, for instance, constructors) alone.

This might not seem like it constitutes much of a problem, but when it comes to cyclic proof, it turns out to be fairly irritating. The reason is that we have decided a priori that there is no possible way to identify two proofs, as they will never have the same syntax, because induction itself comes equipped with names. Since each proof step is in fact a syntactic step, terms in our term language act as descriptions of how to build the rest of the proof tree. These names will get all messed up when we start manipulating cyclic proofs, and our terms stop corresponding to the naming of the rest of the proof.

To see this we can look at the dpu (double-plus-unbad) function, its proof tree, and another potential cyclic proof which we will call dpg (double-plus-good) - which is semantically equivalent. By semantically equivalent we mean that they cannot be distinguished within the theory.

In order to define dpu and dpg we first need to write down a simple addition function. First we'll show these functions as they would be presented in Agda.
_+_ : ℕ → ℕ → ℕ
zero + m = m
succ n + m = succ (n + m)

double-plus-unbad : ℕ → ℕ → ℕ → ℕ
double-plus-unbad x y z = (x + y) + z

double-plus-good : ℕ → ℕ → ℕ → ℕ
double-plus-good zero y z = y + z
double-plus-good (succ x) y z = succ (double-plus-good x y z)
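
As a quick sanity check of this claim (not in the original development; it assumes the propositional equality from Agda's standard library), the pointwise equality of dpu and dpg is provable by ordinary structural induction - though note that this is a propositional identification of the results, not an identification of the two proofs, which is the point at issue in what follows:

open import Relation.Binary.PropositionalEquality using (_≡_; refl; cong)

dpu≡dpg : ∀ x y z → double-plus-unbad x y z ≡ double-plus-good x y z
dpu≡dpg zero     y z = refl
dpu≡dpg (succ x) y z = cong succ (dpu≡dpg x y z)
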
We are going to use an imaginary programming language to express our cyclic proofs whose syntax should hopefully be clear enough to the reader. In particular, we steal Agda's pattern lambda syntax to describe case distinction on inductively defined types (λ {zero → M ; succ n → P}).

                                      ─────────────────────────(*)
                                      n : ℕ , m : ℕ ⊢ n + m : ℕ
                  ──────────────var   ─────────────────────────────────succ
                  m : ℕ ⊢ m : ℕ       n : ℕ , m : ℕ ⊢ succ (n + m) : ℕ
─────────────var  ─────────────var  ────────────────────────────────────────────────────────────case(†)
n : ℕ ⊢ n : ℕ     m : ℕ ⊢ m : ℕ     n : ℕ , m : ℕ ⊢ λ { zero → m ; succ n → succ (n + m)} n : ℕ
──────────────────────────────────────────────────────────────────────────────────────────────────
          n : ℕ , m : ℕ ⊢ (λ n m → λ { zero → m ; succ n → succ (n + m)} n) n m : ℕ
──────────────────────────────────────────────────────────────────────────────────────unfold (*)
                              n : ℕ , m : ℕ ⊢ n + m : ℕ

We here demonstrate a cyclic proof presentation[5]. Under usual conditions we would have instead made appeal to some least-fixedpoint operator or have put _+_ in a function-variable-context or the like. Additionally, in order to show that this construction was a well defined (total) function we would have cause to make an appeal to some termination argument. Here we have structural recursion on our side. We can reuse this argument from the usual setting without alteration.

In addition, the circularity is intentionally made more explicit in this first example by virtue of an over-clever naming strategy and the shadowing behaviour of lexical binding, but in real practice of cyclic proof (and in the examples below) there would need to be a substitution applied to obtain the original expression. We will leave this substitution implicit as it's not important here (but it would be very important in a more formal description).

You also may have noticed that I leave out a proof rule for the scrutinee of the "case"-rule in these proofs as it should be apparent from context (elaboration here should be performed by the reader).

But why should we want to display this proof as a cyclic proof instead of the more usual least fixed-point approach which is less obscure? Hopefully this will become more apparent with the inclusion of our dpu / dpg functions. We give the dpu function's cyclic proof here:

────────────────────────(*)  ─────────────var  ──────────────────────────────────────────────────────────(†)
x : ℕ , y : ℕ ⊢ x + y : ℕ    z : ℕ ⊢ z : ℕ     n : ℕ , m : ℕ ⊢ λ { zero → m ; succ n → succ (n + m)} n : ℕ
───────────────────────────────────────────────────────────────────────────────────────────────────────────
      x : ℕ , y : ℕ , z : ℕ ⊢ (λ n m → λ { zero → m ; succ n → succ (n + m)} n) (x + y) z : ℕ
───────────────────────────────────────────────────────────────────────────────────────────────────────────
                          x : ℕ , y : ℕ , z : ℕ ⊢ (x + y) + z : ℕ
────────────────────────────────────────────────────────────────────────────────────────────────────unfold
                          x : ℕ , y : ℕ , z : ℕ ⊢ dpu x y z : ℕ

In this proof we make reference to two different entry points into the proof of plus. These correspond with the two different uses of plus which are syntactically apparent in the term, so it is not surprising that there should be two different cyclic references. However, in dpg, we only refer to _+_ once.
                                                ──────────────────────────────────────(§)
                                                x : ℕ , y : ℕ , z : ℕ ⊢ dpg x y z : ℕ
                     ───────────────────────(*)  ──────────────────────────────────────────────succ
                     y : ℕ , z : ℕ ⊢ y + z : ℕ   x : ℕ , y : ℕ , z : ℕ ⊢ succ (dpg x y z) : ℕ
───────────────────var  ─────────────────────────────────────────────────────────────────────case
x,y,z : ℕ ⊢ x,y,z : ℕ   x : ℕ , y : ℕ , z : ℕ ⊢ λ { zero → y + z
                                                  ; succ x → succ (dpg x y z)} x : ℕ
──────────────────────────────────────────────────────────────────────────────────────────────
  x : ℕ , y : ℕ , z : ℕ ⊢ (λ x y z → λ { zero → y + z
                                        ; succ x → succ (dpg x y z)} x) x y z : ℕ
──────────────────────────────────────────────────────────────────────unfold (§)
                           x : ℕ , y : ℕ , z : ℕ ⊢ dpg x y z : ℕ

This dpg, I claim, is semantically equivalent to dpu. That is, it always gives the same result, but does so in one pass instead of two. However, not only is it semantically equivalent, but it is derivable according to a number of manipulations of a cyclic proof tree. The term dpg is the term which would be arrived at by carrying out a process of super-compilation [6].

To see this, let's watch how dpu would "unfold". In order to do this properly we are going to need an extended reduction relation. Let us extend reduction with "case-distribution" which is given in the following rule.
(λ { zero → M ; succ x → N }) ((λ { zero → P ; succ x' → Q }) R)
  ─case↝─
(λ { zero → (λ { zero → M ; succ x → N }) P ; succ x' → (λ { zero → M ; succ x → N }) Q }) R

This yields semantically equivalent terms for any closed terms, even in the case of non-termination.
─────────────────────────────────────(†)          n : ℕ , y : ℕ , z : ℕ ⊢ (n + y) + z : ℕ
n : ℕ , m : ℕ ⊢ λ { zero → m                      ──────────────────────────────────────────────succ
                  ; succ n → succ (n + m)} n : ℕ  n : ℕ , y : ℕ , z : ℕ ⊢ succ ((n + y) + z) : ℕ
────────────────────────────────────────────────────────────────────────────────────────────────────
x : ℕ , y : ℕ , z : ℕ ⊢ λ { zero → (λ { zero → z
                                      ; succ n → succ (n + z)} y)
                          ; succ n → succ ((n + y) + z)} x : ℕ
────────────────────────────────────────────────────────────────────────────────────────────────────
x : ℕ , y : ℕ , z : ℕ ⊢ λ { zero → (λ { zero → z
                                      ; succ n → succ (n + z)} y)
                          ; succ n → (λ { zero → z
                                        ; succ n → succ (n + z)} (succ (n + y)))} x : ℕ
────────────────────────────────────────────────────────────────────────────────────────────────────
x : ℕ , y : ℕ , z : ℕ ⊢ λ { zero → z
                          ; succ n → succ (n + z)} (λ { zero → y
                                                      ; succ n → succ (n + y)} x) : ℕ
────────────────────────────────────────────────────────────────────────────────────────────────────
                x : ℕ , y : ℕ , z : ℕ ⊢ λ { zero → z
                                          ; succ n → succ (n + z)} (x + y) : ℕ
────────────────────────────────────────────────────────────────────────────────────────────────────
      x : ℕ , y : ℕ , z : ℕ ⊢ (λ n m → λ { zero → m ; succ n → succ (n + m)} n) (x + y) z : ℕ
────────────────────────────────────────────────────────────────────────────────────────────────────
                           x : ℕ , y : ℕ , z : ℕ ⊢ (x + y) + z : ℕ
──────────────────────────────────────────────────────────────────────────────────────────────unfold
                           x : ℕ , y : ℕ , z : ℕ ⊢ dpu x y z : ℕ
Ok, so this isn't precisely the same, we return to (†) instead of (*) and there is an intermediate lambda application which is reduced (and could have been reduced in the original cyclic proof). However, modulo these few unobservable actions, they are the same, and they can be made precisely the same computationally, if we squint our eyes and ignore the inconsequential (↝)-rules.

The reductions that take place in our proof (the (↝)-rules) are "unobservable". This is because finite computations which take place without emitting constructors are not observable in type theory. What we have done is introduced a more flexible reduction relation, which always has the same semantics on closed terms, and then utilised this to fold the proof in a new way. Our folds are performed with simple substitutions of variables. It turns out that if you fold the proof in a new way, using variable substitution, in such a way that the final term terminates (or co-terminates)[7], you have absolute freedom in folding your proof and you can't go wrong. Here we've created a term which is structurally recursive and so we are safe from evil.

But now we have two strange things going on. The first is that the terms, which are supposed to tell us what the rest of the proof tree looks like (as they do in ITT), now tell us very little about what the rest of the proof tree is going to look like. The second is that we have cycles with totally different "names" at each step. That is, the terms at our cycles look pretty difficult to inter-relate, and even when we have the same observable "edges" (such as succ) we have lots of intermediate cruft with apparently different terms. In fact, the only way to know whether one term name is like another is to see how it unfolds and refolds. But this is in general something which can't possibly have a decision procedure.

Should the fact that it doesn't have a general decision procedure stop us though? Perhaps there is a reasonable decision procedure that would recognise lots of "names" as being the same. Some process by which we could unfold a lot, but not too much that we don't terminate, and fold in a safe way. Since supercompilation terminates, this could give us our cycles, allow us to compare proofs and admit a larger class of equivalences. In fact, the interested reader might note that the above process will reduce x + (y + z) to essentially the same proof, giving us a proof of associativity for free.

But this leads us to further questions. When we started with the reduction relation in ITT we identified different proofs (such as (λ x. f x) y ≈ f y) but we consider this simple enough that we hope not to do anything dodgy. The rest of the proof we make really explicit with all of the steps written out for us. When we get to supercompilation we have something which is very complex indeed. How are we to trust the terms that come out the other side?

Well, it is possible to prove that case-↝ is a conservative extension of ↝. Similarly, it is conceivable that one could show that there is a semantic equivalence for folding, in the event that we can provide a termination argument.

In CTT we talk about canonical value elements by talking about their reduction behaviour under big-step evaluation. The above arguments feel morally similar and this syntactic cruft is noted in [7]. I'm not yet familiar enough with CTT to know precisely how to formulate things (or to what extent the same problems arise) in this setting. What is clear to me however, is that this argument is a very uncomfortable one to make in ITT without breaking the fundamental design principle which insists on the bureaucracy of syntax everywhere excepting in the original unalterable reduction scheme.

Is there a trick with which we can do this comfortably in ITT? Is there a cleaner way to explain these things in CTT?

EDIT: In general, we don't want proof irrelevance. We'd be quite unhappy in a computational setting to see our proofs of ℕ → ℕ → ℕ → ℕ being replaced arbitrarily (for instance, with one that returns zero in all cases). However, it seems to me that our notion of proof relevance is sometimes too strong. A more flexible notion of proof relevance would be one of observational equivalence - in which case we are at least assured that we compute to the same canonical forms in all closed contexts.

[1] Naïve Computational Type Theory, Robert L. Constable 2003
[2] The NuPRL Book, The NuPRL Group
[3] Constructing Type Systems over an Operational Semantics, Robert Harper, 1992
[4] Type Theory and its Meaning Explanations, Jon Sterling, 2015
[5] An Introduction to Cyclic Proof, James Brotherston
[6] Introduction to Supercompilation, Morten Heine B. Sørensen and Robert Glück
[7] Development of the Productive Forces
[8] Proof Term Assignments and Intuitionistic Truth, Jon Sterling

Friday, 13 March 2015

How to implement transactions for a triple store in Prolog

For the current research project I'm working on, we're using an RDF triple store and need to be able to store a large number of triples, and ensure that after any update a number of constraints on the structure of the triples are satisfied according to our ontology.

In order to implement inserts, updates and deletes, they need to be packaged up inside of a transaction that only commits if the constraints are satisfied in the final image of the triple store.  This means we need a way to do a post-condition driven transaction.

One easy way to implement this is to simply copy the entire database, run the insert/update/delete statements and see if the constraints are satisfied on the result.  If we are changing enormous numbers of triples, this might indeed be the fastest and most reasonable approach.  If we are only changing one triple, however, it seems a totally mad procedure.

There is another approach and I'll describe it as follows.

Let's assume that our constraint predicate 'constraint/0' is already defined.  What we want to do is keep two alternative layers over the actual triple store, one negative, and one positive.  We then carry out something along the lines of the painter's algorithm.  How this works will become more obvious as we write our predicates.  If the constraint is satisfied on the final version, we then "commit" by deleting everything in the negative graph from the underlying store and inserting everything from the positive graph.

 :- module(transactionGraph,[rdf1/4, insert/4, update/5, delete/4]).  
 :- use_module(library(semweb/rdf_db)).  
 % How to implement post-condition transactions  
 neg(A1,A2) :- atom_concat('neg-', A1, A2).   
 pos(A1,A2) :- atom_concat('pos-', A1, A2).  

What we are going to do is name two new graphs that are derived from our current graphs.  To do this properly we'd want to use a gensym/uuid (which most prologs provide).

 :- rdf_meta rdf1(r,r,t,?).
rdf1(X,Y,Z,G) :- (pos(G,GP), rdf_db:rdf(X,Y,Z,GP)     % If in positive graph, return results
                  *-> true                            % (the soft-cut commits us to this branch)
                  ; (neg(G,GN), rdf_db:rdf(X,Y,Z,GN)  % Otherwise, check the negative graph
                     *-> false                        % If it *is* negative, fail
                     ; rdf_db:rdf(X,Y,Z,G))).         % otherwise bind the rdf result.

This is the meat of the idea.   Our RDF triple store keeps triples in the predicate rdf/4, along with the name of the graph.  We are going to mimic this predicate with some additional logic.  You can safely ignore the rdf_meta directive, as it has nothing to do with the logic.

What we want is to check whether our triple is in the positive graph.   If not, we check to see if it is in the negative graph - negation will be seen as failure.  Notice that if it is in the negative graph we fail: the soft-cut (*->) commits us to that branch, which keeps us from falling through to the clause that consults the underlying database.

 % you can safely ignore rdf_meta for understanding this programme  
 % it only affects namespace prefix handling.
:- rdf_meta insert(r,r,t,?).
insert(X,Y,Z,G) :- 
    pos(G,G2),
    rdf_assert(X,Y,Z,G2),   % assert into the positive graph
    % retract from the negative graph, if it exists.
    (neg(G,G3), rdf_db:rdf(X,Y,Z,G3), rdf_retractall(X,Y,Z,G3) ; true).

Here we have our insert statement.  What we do is assert ourselves into the positive graph and make sure we retract any deletes which may have gone into the negative graph.

:- rdf_meta delete(r,r,t,?).
delete(X,Y,Z,G) :- 
    pos(G,G2), % delete from pos graph
    rdf_db:rdf(X,Y,Z,G2),
    rdf_retractall(X,Y,Z,G2),
    fail.      % fail over to the next clause
delete(X,Y,Z,G) :- % assert negative
    rdf_db:rdf(X,Y,Z,G), % only if the triple is visible in the underlying graph
    neg(G, G2),
    % write('Asserting negative fact: '),
    % write([X,Y,Z,G2]),nl,
    rdf_assert(X,Y,Z,G2).

Here we have the delete statement, which is a little unusual.  First, we delete from the positive graph if there is anything there.  We then fail, after side-effecting the database, because we have to do something further, and we want to run another clause.

We then check to see if we are in the underlying graph, and if we are, we need to insert a negative statement to ensure that the triple from the underlying database will not be visible.

:- rdf_meta update(r,r,o,o,?).
update(X,Y,Z,G,_) :-  
    rdf_db:rdf(X,Y,Z,G), % the old triple exists in the underlying graph
    neg(G,G3),
    rdf_assert(X,Y,Z,G3), fail. % mask the old value with a negative, then try the update
update(X,Y,Z,G,Action) :- 
    pos(G,G2),
    (rdf_db:rdf(X,Y,Z,G2) -> rdf_update(X,Y,Z,G2,Action) % exists in pos graph 
     ; new_triple(X,Y,Z,Action,X2,Y2,Z2),
       rdf_assert(X2,Y2,Z2,G2)). % doesn't yet exist in pos graph (insert). 

Finally, we get to the update.  Updates take an action which specifies which part of the triple they are meant to change; you can see it being used in the call to new_triple/7, which computes the new triple from the old one and the action.  The first clause has to remove the old value from the underlying graph by asserting a negative - it then 'fails', similarly to the delete statement, so that the second clause can deal with the positive graph: if the triple is already there we update it, and if it is not, we insert the new value, since the old value has been negated from the underlying graph.

Now we can query rdf1/4 and we will get behaviour that mimics the actual inserts and updates happening on the underlying store.  We can run our constraints on this rdf1 instead of rdf.  If it succeeds we bulk delete from the negative graph and bulk insert from the positive graph.
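
The commit step itself isn't shown above.  A minimal sketch of what it might look like (my own, not part of the original code; it assumes the pos/2 and neg/2 predicates defined earlier and SWI-Prolog's semweb library) is:

 %% commit(+G): apply the negative layer to the base graph, copy the
 %% positive layer in, then clear both layers.
 commit(G) :-
     neg(G, GN),
     forall(rdf_db:rdf(X,Y,Z,GN), rdf_retractall(X,Y,Z,G)),
     pos(G, GP),
     forall(rdf_db:rdf(X,Y,Z,GP), rdf_assert(X,Y,Z,G)),
     rdf_retractall(_,_,_,GN),
     rdf_retractall(_,_,_,GP).

 %% rollback(+G): discard both layers, leaving the base graph untouched.
 rollback(G) :-
     neg(G, GN), rdf_retractall(_,_,_,GN),
     pos(G, GP), rdf_retractall(_,_,_,GP).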

Now, if you're clever you might see how this trick could be used to implement nested transactions with post conditions!

Tuesday, 10 March 2015

Judgemental Equality, Homotopy Type Theory and Supercompilation

Ok, time for some incredibly preliminary thoughts on judgemental equality and what it means for Supercompilation and what supercompilation in turn means for judgemental equality.

The more that I explore equality the more incredibly strange the role it plays in type theory seems to be.  There seems intuitively to be nothing more simple than imagining two things to be identical, and yet it seems to be an endless source of complexity.

This story starts with my dissertation, but I'm going to start at the end and work backwards towards the beginning.  Currently I've been reading Homotopy Type Theory, now of some niche fame.  I put off reading about it for a long time because I generally don't like getting too bogged down in foundational principles or very abstract connections with other branches of mathematics (except perhaps on Sundays).  Being more of an engineer than a mathematician, I like my types to work for me, and dislike it when I do too much work for them.

However, the book is trendy, and friends were asking me questions about it, so I picked it up and started working through it.  I haven't gotten very far but I've already had a greater clarity about equality and what it means in reference to computation.

In type theory we start with a couple of notions of equality.  I'm going to stick with judgemental equality first.  Judgmental equality plays a crucial role in type theory as it is really what drives our notion of computation.  If we want to know whether two judgements are judgementally equal, we want to know if they reduce to the same thing.  Generally this is done with a number of reduction rules which give us normalisation to the same term.  The reduction rules are the key element of computation.  The simplest example is cut-elimination for the type A → B.

(λ x : S. t) s ─β→ t [x ↦ s]  (replace all x in t with s)

corresponds with removing implication elimination (or modus ponens):

   (J1)      (J2)
    ⋮         ⋮                        (J1 using J2 as leaves)
  S → T       S                                  ⋮
─────────────────         ─β→                    T
        T

Or in a more explicit type theoretic fashion we can write:

         (J1)
          ⋮
  Γ , x : S ⊢ t : T                            (J2)
─────────────────────────                       ⋮
Γ ⊢ (λ x : S. t) : S → T                    Γ ⊢ s : S
──────────────────────────────────────────────────────
              Γ ⊢ (λ x : S. t) s : T

                    (J1) [x ↦ s]
                         ⋮
 ─β→       ─────────────────────
             Γ ⊢ t [x ↦ s] : T

In somewhat operational terms, we have taken a proof and removed some of the branching structure, replacing any mention of the variable at the leaves of J1 with our right branch (J2).  We can say that in some sense these two proofs are the same, and this is essentially where the computation happens in type theory.  We are engaging in a form of proof normalisation - when you run a programme, you're really just trying to find the simplest (cut free?) proof of the proposition given the data.

Now, things get interesting when you don't have all the data.  Then proofs which are intuitively very similar, end up being fairly distinct just because you can't drive the computation effectively using the reduction rules given for cut elimination.

A straight-forward example of this can be given by the associative property of addition.  First, let's give a program so we can constrain our problem to the bureaucracy of our syntax, such that we can later try to unshackle ourselves from it.

data ℕ : Set where
  zero : ℕ
  suc : ℕ → ℕ

_+_ :  ℕ → ℕ → ℕ
zero + y = y
suc x + y = suc (x + y)

This program in Agda gives us natural numbers formalised as being either 0, or some number +1 (suc), from which we can build all the finite numbers, along with a way to perform addition.

Given this definition we might wonder whether (x + y) + z = x + (y + z) is true.  Well, if = is judgemental equality, we're in for a rough ride, because it definitely will not be.  Now, things aren't so bad: given a notion of propositional equality we can go ahead and make a similar proof work.  In Agda we might write:

infix 4 _≡_
data _≡_ {A : Set} : A → A → Set where
  refl : (x : A) → x ≡ x

cong : ∀ {A B : Set} {a b : A} → (f : A → B) → a ≡ b → f a ≡ f b
cong f (refl x) = refl (f x)

associative : ∀ (n m o : ℕ) →  n + (m + o) ≡ (n + m) + o
associative zero m o = refl (m + o)
associative (suc x) m o = cong suc (associative x m o)

This proof makes reference to a second kind of equality, a propositional equality _≡_ which is "inside" the proof system.  We have simply represented it as a data type.   By showing two things are equal in this world, we've shown that there is a way to decompose arguments until we have a judgemental equality.
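
To see the contrast concretely: for closed numerals no induction is needed at all, since both sides already reduce to the same normal form, so judgemental equality does all the work and refl alone suffices.  A small sketch under the definitions above (the names one, two and three are mine):

one : ℕ
one = suc zero

two : ℕ
two = suc one

three : ℕ
three = suc two

assoc-closed : (one + two) + three ≡ one + (two + three)
assoc-closed = refl (one + (two + three))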

However, if we had some way to make it a judgemental equality, then we would not have needed two cases at all!  Instead we could immediately just give the proof refl, since both sides would reduce to the same thing.  But is this type of reduction possible?   Certainly not in the general case.  However, there is a large lurking universe in which it could be done.

Enter supercompilation!  The sorts of reductions that supercompilers give you will in fact reduce the associative addition statement to identical proofs.  This means that the normal form will be identical on both sides, and we have a judgemental equality.

But what is going on?  First, I think we need to think about cyclic proofs for a minute.   When we use the function _+_, we are actually introducing a cyclic proof.   We don't know how big the proof is going to be until we get the data.  Instead we have a stand-in cycle that represents the way that we will unfold the proof as more data comes in.

The reason we were blocked before is because recursive functions are uncooperative.  What they actually mean to say is that they are going to unfold themselves again to produce new proof leaves as long as you keep feeding them data.   The unfolding leads to new possible cuts to eliminate; more reductions using our rewriting rules.  However they will never cooperate with their neighbour recursive function in trying to produce a new cycle which produces the same infinitely unfolded proof.  They essentially limit the kinds of cycles that can exist in the proof tree to the ones that existed when you wrote down the definitions.

A supercompiler's fundamental trick is in allowing us to abstract away the function names and just look at the proof as it unfolds, creating cycles as we see fit (subject to rules similar to those that total functions must abide by: you can't have cycles unless something is always decreasing).  This means that we can create a cycle in the proof (and perhaps a new named function) which has the same infinite unfolding of the proof tree.

This expansion of judgemental equality can make more proofs easier.  However, it can also make more proofs possible - since it is sometimes the case that we can take a program whose cycles look dubious, and turn them into cycles which look perfectly fine, since they now respect some simple decreasing or increasing order.  This was the subject of my dissertation, but unfortunately I don't think I knew enough type theory at the time to make the story convincing.

There are a lot of ideas that follow on from this.  The ideas of homotopy type theory think a lot about what kinds of proofs can identify two elements in propositional equality.  With a broader notion of judgemental equality, it's interesting to wonder how the structure of identifications would change...

Wednesday, 8 May 2013

Monads and Asynchronous Ajax with Javascript

Monads are often seen as a very basic part of the language for Haskellers and something completely obscure for programmers who use other, especially non-functional, languages. Even Lispers and Schemers tend not to use them consciously to any great extent, even though they can help with a number of common problems that occur - perhaps most obviously with functions returning nil in a chain of compositions.

Monads, or Kleisli triples, have a formal mathematical meaning which requires satisfaction of a few basic axioms. However, we can understand them fairly simply as a pattern which involves a bind operator, which allows us to compose our monad objects, and a unit operator, which allows us to lift a value up into one of our monad objects.

MonadPlus extends this pattern with a new type of composition, +, and a new unit, called mzero, which behaves somewhat like a zero for +: composing anything with mzero leaves it unchanged.
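
To make the pattern concrete before getting to vows, here is a tiny, purely illustrative Maybe-style object in JavaScript (the names MAYBE, unit, bind, nothing and plus are mine, not Crockford's): unit lifts a value, bind only applies the next function when there is a value, nothing plays the role of mzero, and plus is the 'or'-like composition.

var MAYBE = {
    unit: function (value) {                 // lift a plain value into the monad
        return { hasValue: true, value: value };
    },
    nothing: { hasValue: false },            // the mzero of this MonadPlus
    bind: function (m, f) {                  // compose, short-circuiting on nothing
        return m.hasValue ? f(m.value) : MAYBE.nothing;
    },
    plus: function (m1, m2) {                // 'or': nothing is the identity for plus
        return m1.hasValue ? m1 : m2;
    }
};

// Chaining functions that might otherwise return null at any step:
function head(xs) { return xs.length > 0 ? MAYBE.unit(xs[0]) : MAYBE.nothing; }
var first = MAYBE.bind(head([[1, 2], []]), head);   // { hasValue: true, value: 1 }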

MonadPlus turns out to be an excellent setting for conveniently dealing with concurrency. So it was that after playing with ajax for a while I found myself irritated with the way in which I had to wrap up continuations in order to get the sequencing behaviour I wanted. It occurred to me that what I really wanted was a MonadPlus.

I started writing one myself and realised that since my knowledge of JavaScript arcana was somewhat thin (one needs not simply implement them - they also have to be convenient to use!), I'd be better off finding out if someone else had already done it, and hence came to Douglas Crockford's video on JavaScript monads.

In this video Crockford demonstrates how to roll up a MonadPlus object which he calls a vow. In his framework, he has a single composition function when which acts as a combination of bind and +. In order to lift a value into the monad, we use vow.keep(data). Since he doesn't use a strongly typed language, he's able to be a bit more flexible in what actually gets put in his when. In the example functions below, I always return a promise, which can be used by the next function in the chain - this is the standard way that bind works, composing functions (f : a → M b). However, the plumbing written by Crockford will also automatically lift a function (f : a → b) into the monad, in the event that you have an immediate value and not a promise. Similarly, instead of composition with alternatives, we have our failure function as the right-hand argument.

This conveniently expresses a combination of "and/or"-type asynchronous operators in a single framework. We don't need to worry about when the asynchronous calls return - we are guaranteed that if they succeed, the next positive function in the chain will be called, and if they fail, we fall over to our first error handler.

I wrote a quick application that demonstrates how you might use Crockford's vows:

var iphost = "http://api.hostip.info/get_json.php";
  var countryhost = "http://api.hostip.info/get_json.php?ip=";

  function getIP(){ 
      var vow = VOW.make(); 

      var ajs = { 
          type : 'GET', 
          url : iphost, 
          dataType: 'json', 
          success : function(data){ vow.keep(data.ip); },   // field name assumed from the hostip.info JSON
          error: function(x,e){ vow['break'](e); }          // bracket notation: 'break' is a reserved word
      }; 
      $.ajax(ajs);                                          // jQuery performs the actual request
      return vow.promise;
  }

  function countryOf(ip){ 
      var vow = VOW.make(); 

      var ajs = { 
          type : 'GET', 
          url : countryhost + ip, 
          dataType: 'json', 
          success : function(data){ vow.keep(data.country_name); },  // field name assumed
          error: function(x,e){ vow['break'](e); } 
      }; 
      $.ajax(ajs); 
      return vow.promise;
  }

  function display(data){ console.log(data);}
  function showError(data){console.log("Error "+data);}

  $(document).ready(function (){
      getIP()
          .when(countryOf, showError)
          .when(display, showError);
  });

UPDATE: Much to my surprise, all of this behaviour is already implemented with deferred ajax calls, using "then". It even manages to lift functions which don't return a promise into a promise automatically.
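
For comparison, the same chain written with jQuery's deferreds (jQuery 1.8 or later) might look something like the following sketch; the endpoints are the same as above, and the ip and country_name field names are assumptions about the hostip.info JSON rather than anything guaranteed here.

function getIP(){
    return $.ajax({ url: iphost, dataType: 'json' })
            .then(function (data) { return data.ip; });            // field name assumed
}

function countryOf(ip){
    return $.ajax({ url: countryhost + ip, dataType: 'json' })
            .then(function (data) { return data.country_name; });  // field name assumed
}

$(document).ready(function (){
    getIP().then(countryOf).then(display, showError);
});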

Friday, 19 October 2012

Formalisation of Tables in a Dependent Language

I've had an idea kicking about in my head for a while of making query plans explicit in SQL in such a way that one can be assured that the query plan corresponds to the SQL statement desired. The idea is something like a Curry-Howard in a relational setting. One could infer the plan from the SQL, the SQL from the plan, or do a sort of "type-checking" to make sure that the plan corresponds to the SQL.

The devil is always in the details however. When I started looking at the primitives that I would need, it turned out that the low-level table joining operations are actually not that far from primitive SQL statements themselves. I decided to go ahead and formalise some of what would be necessary in Agda in order to get a better feel for the types of objects I would need and the laws which would be required to demonstrate that a plan corresponded with a statement.

Dependent types are very powerful and give you plenty of rope to hang yourself. It's always something of a struggle for me to figure out what I want to live in the type world as opposed to what I want to live in the term world.

First things first. We're going to need rows if we want tables. In dynamically typed languages (a.k.a. languages with a [potentially open] universal type) you can easily make a heterogeneous collection of objects (numbers, strings etc). Modelling a row in lisp or python is very straight forward. However, in a strongly typed setting heterogeneous lists are more complicated. The solution I have written in Agda below uses a vector (really just a list with a guaranteed size) of types which lives in the type universe, and which is constructed as we push things into our row. This is basically just a heterogeneous list where the type advertises what kind of row it is.

module RowList where

open import Level renaming (suc to lsuc; zero to lzero)
open import Data.Nat hiding (_⊔_)
open import Data.Vec renaming (_∷_ to _∷v_; _++_ to _++v_; [] to []v; [_] to [_]v)
open import Data.List 
open import Data.Fin renaming (_+_ to _+f_)
open import Data.Nat.Properties
open import Relation.Binary.PropositionalEquality as PropEq
  using (_≡_; refl; sym; subst; trans; cong; cong₂; inspect)
open import Data.Bool

infixr 4 _,_
infixr 5 _∷v_

data Row {α : Level} : (n : ℕ) → Vec (Set α) n → Set (lsuc α) where 
  · : Row zero []v
  _,_ : ∀ {j} {A : Set α} {Aⱼ : Vec (Set α) j} → A → Row j Aⱼ → Row (suc j) (A ∷v Aⱼ)

In the data definition above we use (·) to terminate a row, and return a row which has an empty vector in the row's type signature. In the (,) case, we extend the row by giving an element of some type A and stick its type in our row signature vector. This will allow us to ensure that tables are constructed of homogeneous row types, and will allow us to present a lookup function for rows which knows what its return type will be.

You may notice all the bits about (Set α) etc. The α is just book keeping to avoid Russell-type paradoxes creeping in. They are otherwise not particularly enlightening and the above table definition might be more easily read if you ignore them.

example : Row 2 (ℕ ∷v (ℕ ∷v []v))
example = (1 , 2 , ·)

pick : ∀ {α i j A} → (v : Vec (Fin j) i) → Vec {α} A j → Vec A i
pick []v ys = []v
pick (x ∷v xs) ys = lookup x ys ∷v pick xs ys 

_!_ : ∀ {α j Aⱼ} → (r : Row {α} j Aⱼ) → (i : Fin j) → lookup i Aⱼ 
· ! ()
(y , ys) ! zero = y
(y , ys) ! suc i = ys ! i 

We call our row selection operation (r ! i) for a row (r) and index i. Notice that the return type is dependent on the type signature vector which is carried around in the type of the row. This selector will be useful for writing predicates over columns, which will have to know what their input type is. We use a finite type (Fin j) to describe our index, since we need to make sure we don't index outside of the row. The finite type assures us that whatever the index is, it's no greater than the size of the vector. These operations are now guaranteed to be type safe and cannot lead to run-time out-of-range errors.

_++r_ : ∀ {α i j Aᵢ Bⱼ} → Row {α} i Aᵢ → Row j Bⱼ → Row (i + j) (Aᵢ ++v Bⱼ)
· ++r r = r
(x , xs) ++r r = x , (xs ++r r)

We of course need to be able to combine rows from different tables in joins, and so we'll need a row concatenation operator, which we call (++r).

pickr : ∀ {α i j Aⱼ} → (v : Vec (Fin j) i) → Row {α} j Aⱼ → Row i (pick v Aⱼ)
pickr []v r = · 
pickr (x ∷v xs) r = (r ! x) , pickr xs r

Additionally, it's useful if we can specify a list of indices we want to pick out of a row to construct a new row. This will allow us to make a "select" like statement where we can duplicate or ignore indices in order to create a new row, which when done over a list of rows will become more like the familiar table select.

Table : ∀ {α} (m : ℕ) (Aⱼ : Vec (Set α) m) → Set (lsuc α)
Table {α} m Aⱼ = List (Row m Aⱼ)

We define tables as simply a list of rows, and keep the indices and types as parameters to the type. The table header is now correct by construction and exists only as a type index.

Example table: The header is specified as the type
exampleTable : Table 3 (ℕ ∷v (ℕ ∷v (ℕ ∷v []v)))
exampleTable = (1 , 2 , 3 , ·)
             ∷ ((3 , 4 , 5 , ·)
             ∷ ((5 , 6 , 7 , ·)
             ∷ []))

_∪_ : ∀ {α i j Aᵢ Bⱼ} → Table {α} i Aᵢ → Table j Bⱼ → Table (i + j) (Aᵢ ++v Bⱼ)
[] ∪ [] = []
[] ∪ _  = []
_  ∪ [] = []
(x ∷ xs) ∪ (x' ∷ xs') = (x ++r x') ∷ (([ x ] ∪ xs') ++ (xs ∪ (x' ∷ xs')))
-- concatenate row, union row with the rest of the table

The familiar join operation. As you can see here, our primitive join operation does not take a restriction. In practice the primitive join would have to use the restriction in order to reduce the amount of row consing. In order to have a pragmatic optimiser, I'll have to change this later.

Example join
result : Table 6 (ℕ ∷v (ℕ ∷v (ℕ ∷v (ℕ ∷v (ℕ ∷v (ℕ ∷v []v))))))
result = exampleTable ∪ exampleTable

-- Restriction 
_▹_ : ∀ {α j Aⱼ} → Table {α} j Aⱼ → (Row j Aⱼ → Bool) → Table j Aⱼ
[] ▹ p = []
(x ∷ xs) ▹ p = if p x then x ∷ (xs ▹ p) else xs ▹ p

Restrictions are just row predicates. They simply take a row and specify which rows to keep and which to throw away. This allows for most of the usual SQL WHERE clause shenanigans (minus aggregation of course) and gives us a fairly complete method of doing joins subject to restrictions [we don't deal with nulls at all].

Example restriction
_≤b_ : (n m : ℕ) → Bool
zero ≤b m = true
suc n ≤b zero = false
suc n ≤b suc n' = n ≤b n'

restrict : Table 6 (ℕ ∷v (ℕ ∷v (ℕ ∷v (ℕ ∷v (ℕ ∷v (ℕ ∷v []v))))))
restrict = result ▹ (λ v → (v ! (# 1)) ≤b 1)
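
Returning to the remark made after the join: here is a sketch (mine, not part of the original development, and it may need tweaking to type-check) of a join which takes the restriction as an argument and applies it as each combined row is built, so that rejected rows are never consed at all:

joinRow : ∀ {α i j Aᵢ Bⱼ} → Row {α} i Aᵢ → (Row (i + j) (Aᵢ ++v Bⱼ) → Bool)
        → Table j Bⱼ → Table (i + j) (Aᵢ ++v Bⱼ)
joinRow x p [] = []
joinRow x p (y ∷ ys) = if p (x ++r y) then (x ++r y) ∷ joinRow x p ys else joinRow x p ys

_∪[_]_ : ∀ {α i j Aᵢ Bⱼ} → Table {α} i Aᵢ → (Row (i + j) (Aᵢ ++v Bⱼ) → Bool)
       → Table j Bⱼ → Table (i + j) (Aᵢ ++v Bⱼ)
[] ∪[ p ] ys = []
(x ∷ xs) ∪[ p ] ys = joinRow x p ys ++ (xs ∪[ p ] ys)
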
Finally we have selection, which allows us to pick and choose columns by specifying their index. It would of course be more natural to have names for columns and allow that as a means of addressing the columns of interest, and I may in fact add that to rows in order to simplify the debugging, but it's not of much interest logically speaking.
-- Selection 
_∥_ : ∀ {α i j Aⱼ} → (v : Vec (Fin j) i) → Table {α} j Aⱼ → Table i (pick v Aⱼ)
xs ∥ [] = []
xs ∥ (y ∷ ys) = (pickr xs y) ∷ (xs ∥ ys)

Example selection 
-- Syntax of a query: (selector ∥ t1 ∪ t2 ∪ t3 ▹ restriction)

selection : Table 2 (ℕ ∷v (ℕ ∷v []v))
selection = ((# 1) ∷v ((# 3) ∷v []v)) ∥ (result ▹ (λ v → (v ! (# 1)) ≤b 4))

From this point we will need some form of relation which allows us to establish query equivalence. We will then want a way to compute whether two tables are equivalent up to this relation and we can continue on with the more pragmatic concerns of pushing the restrictions into the actual table joins.