Skip to main content

How to implement transactions for a triple store in Prolog

For the current research project I'm working on, we're using and RDF triple-store and need to be able to store a large numbers of triples, and assure that after any update a number of constraints are satisfied on the structure of the triples according to our ontology.

In order to implement inserts, updates and deletes, they need to be packaged up inside of a transaction that only commits if the constraints are satisfied in the final image of the triple store.  This means we need a way to do a post-condition driven transaction.

One easy way to implement this is to simply copy the entire database, run the insert/update/delete statements and see if we satisfy the transaction.  If we are changing enormous numbers of triples, it might indeed be the fastest and most reasonable approach.   If we are only changing one triple however, this seems a totally mad procedure.

There is another approach and I'll describe it as follows.

Let's assume that our constraint predicate 'constraint/0' is already defined.   What we want to do is keep two alternative layers over the actual triple store, one negative, and one positive.  We then carry out something along the lines of the painters algorithm.  How this works will become more obvious as we write our predicates.  If the constraint C is satisfied on the final version, we then "commit" by deleting everything from the negative graph, and insert everything from the positive graph.

 :- module(transactionGraph,[rdf1/4, insert/4, update/5, delete/4]).  
 :- use_module(library(semweb/rdf_db)).  
 % How to implement post-condition transactions  
 neg(A1,A2) :- atom_concat('neg-', A1, A2).   
 pos(A1,A2) :- atom_concat('pos-', A1, A2).  

What we are going to do is name two new graphs that are derived from our current graphs.  To do this properly we'd want to use a gensym/uuid (which most prologs provide).

 :- rdf_meta xrdf(r,r,t,?).
xrdf(X,Y,Z,G) :- (pos(G,GP), rdf_db:rdf(X,Y,Z,GP)    % If in positive graph, return results
                  *-> true                           % 
    ; (neg(G,GN), rdf_db:rdf(X,Y,Z,GN) % If it's not negative
              *-> false                       % If it *is* negative, fail
       ; rdf_db:rdf(X,Y,Z,G))).        % otherwise bind the rdf result.

This is the meat of the idea.   Our rdf triple store keeps triples in the predicate rdf/4, along with the name of the graph.  We are going to mimic this predicate with some additional logic.  You can safely ignore the 'rdf_meta' clause as it's nothing to do with the logic.

What we want is to check if our triple is in the positive graph.   If not we check to see if it is in the negative graph - negation will be seen as failure.  Notice, if it's in the negative graph we perform a cut using the ! symbol.  This means that we are not going to search any other clauses in rdf1/4 and this keeps us from using the underlying database.

 % you can safely ignore rdf_meta for understanding this programme  
 % it only affects namespace prefix handling.
:- rdf_meta insert(r,r,t,?).
insert(X,Y,Z,G) :- 
    pos(G,G2),
    % positive pos graph
    rdf_assert(X,Y,Z,G2), 
    % retract from the negative graph, if it exists.
    (neg(G,G3), rdf_db:rdf(X,Y,Z,G3), rdf_retractall(X,Y,Z,G3) ; true).

Here we have our insert statement.  What we do is assert ourselves into the positive graph and make sure we retract any deletes which may have gone into the negative graph.


:- rdf_meta delete(r,r,t,?).
delete(X,Y,Z,G) :- 
    pos(G,G2), % delete from pos graph
    rdf_db:rdf(X,Y,Z,G2),
    rdf_retractall(X,Y,Z,G2), 
    false.
delete(X,Y,Z,G) :- % assert negative
    rdf_db:rdf(X,Y,Z,G),
    neg(G, G2),
    % write('Asserting negative fact: '),
    % write([X,Y,Z,G2]),nl,
    rdf_assert(X,Y,Z,G2).

Here we have the delete statement, which is a little unusual.  First, we delete from the positive graph if there is anything there.  We then fail, after side-effecting the database, because we have to do something further, and we want to run another clause.

We then check to see if we are in the underlying graph, and if we are, we need to insert a negative statement to ensure that the triple from the underlying database will not be visible.

 new_triple(_,Y,Z,subject(X2),X2,Y,Z).  
 new_triple(X,_,Z,predicate(Y2),X,Y2,Z).  
 new_triple(X,Y,_,object(Z2),X,Y,Z2).
:- rdf_meta update(r,r,o,o,?).
update(X,Y,Z,G,_) :-  
    rdf_db:rdf(X,Y,Z,G), 
    neg(G,G3),
    rdf_assert(X,Y,Z,G3), fail. % delete previous subject if it exists and try update
update(X,Y,Z,G,Action) :- 
    pos(G,G2),
    (rdf_db:rdf(X,Y,Z,G2) -> rdf_update(X,Y,Z,G2,Action) % exists in pos graph 
     ; new_triple(X,Y,Z,Action,X2,Y2,Z2),
       rdf_assert(X2,Y2,Z2,G2)). % doesn't yet exist in pos graph (insert). 

Finally, we get to the update.  Updates take an action to define which part of the triple they want to impact.  You can see the function symbol in new_triple/7.  This predicate has to remove the old value from the underlying graph by asserting a negative - it then 'fails', similarly to the delete statement in order to see if we need to update the triple in the positive graph.  If it is not in the positive graph, we need to insert it there as the old value has been negated from the underlying graph.

Now we can query rdf1/4 and we will get behaviour that mimics the actual inserts and updates happening on the underlying store.  We can run our constraints on this rdf1 instead of rdf.  If it succeeds we bulk delete from the negative graph and bulk insert from the positive graph.

Now, if you're clever you might see how this trick could be used to implement nested transactions with post conditions!

Comments

Popular posts from this blog

Generating etags automatically when needed

Have you ever wanted M-. (the emacs command which finds the definition of the term under the cursor) to just "do the right thing" and go to the most current definition site, but were in a language that didn't have an inferior process set-up to query about source locations correctly (as is done in lisp, ocaml and some other languages with sophisticated emacs interfaces)?

Well, fret no more. Here is an approach that will let you save the appropriate files and regenerate your TAGS file automatically when things change assuring that M-. takes you to the appropriate place.

You will have to reset the tags-table-list or set it when you first use M-. and you'll want to change the language given to find and etags in the 'create-prolog-tags function (as you're probably not using prolog), but otherwise it shouldn't require much customisation.

And finally, you will need to run etags once manually, or run 'M-x create-prolog-tags' in order to get the initia…

Decidable Equality in Agda

So I've been playing with typing various things in System-F which previously I had left with auxiliary well-formedness conditions. This includes substitutions and contexts, both of which are interesting to have well typed versions of. Since I've been learning Agda, it seemed sensible to carry out this work in that language, as there is nothing like a problem to help you learn a language.

In the course of proving properties, I ran into the age old problem of showing that equivalence is decidable between two objects. In this particular case, I need to be able to show the decidability of equality over types in System F in order to have formation rules for variable contexts. We'd like a context Γ to have (x:A) only if (x:B) does not occur in Γ when (A ≠ B). For us to have statements about whether two types are equal or not, we're going to need to be able to decide if that's true using a terminating procedure.

And so we arrive at our story. In Coq, equality is som…

Formalisation of Tables in a Dependent Language

I've had an idea kicking about in my head for a while of making query plans explicit in SQL in such a way that one can be assured that the query plan corresponds to the SQL statement desired. The idea is something like a Curry-Howard in a relational setting. One could infer the plan from the SQL, the SQL from the plan, or do a sort of "type-checking" to make sure that the plan corresponds to the SQL.

The devil is always in the details however. When I started looking at the primitives that I would need, it turns out that the low level table joining operations are actually not that far from primitive SQL statement themselves. I decided to go ahead and formalise some of what would be necessary in Agda in order get a better feel for the types of objects I would need and the laws which would be required to demonstrate that a plan corresponded with a statement.

Dependent types are very powerful and give you plenty of rope to hang yourself. It's always something of…