What follows are some notes about a minimal proof-of-concept for a stochastic simulator in Prolog via a meta-interpreter.
META-INTERPRETER
Here is a Prolog meta-interpreter which supports probabilistic clause heads through the p/2 predicate:
% prove/1: a vanilla meta-interpreter extended with a probabilistic
% gate on clause heads, driven by p/2.
prove(true) :- !.
prove((A,B)) :- !,prove(A),prove(B).
prove(Head) :-
    clause(Head,Body),
    (p(Head,P)->(random(X),1-P<X);true),
    prove(Body).

% sim/4: run prove/1 N times and count the successful proofs.
sim(_,0,S,S) :- !.
sim(Goal,N0,Acc0,S) :-
    (prove(Goal)->Acc is Acc0+1;Acc is Acc0),
    N is N0-1,
    sim(Goal,N,Acc,S).

% sim/3: estimate the probability that Goal succeeds from N trials.
sim(Goal,N,P) :-
    sim(Goal,N,0,S),
    P is S/N.
Here is an example program:
red(a).
red(b).
heavy(X) :- red(X).
p(heavy(_),0.5).
p(red(a),0.25).
?- findall(X-P,(heavy(X),sim(heavy(X),10000,P)),Sols).
Sols = [a-0.1267, b-0.4882].
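For reference, the analytic values follow directly from the two p/2 facts: proving heavy(a) has to pass both the heavy/1 gate and the red(a) gate, so it should succeed with probability \(0.5*0.25=0.125\), while heavy(b) only has to pass the heavy/1 gate, giving \(0.5\). The simulated estimates above are close to these values.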
IMPLEMENTATION
The implementation consists of two parts, described in separate sections below.
Part 1/2: prove(Goal)
Here is the standard vanilla pure Prolog interpreter:
prove(true) :- !.
prove((A,B)) :- !,prove(A),prove(B).
prove(A) :- clause(A,B),prove(B).
The gist is that any Prolog goal is either a conjunction of goals or something that resolves against a clause of the form Head :- Body. I can therefore recursively break any given goal down until it bottoms out in clauses of the form Head :- true, at which point prove(B) in the final clause resolves against the first clause, prove(true).
The interpreter works fine with user-defined predicates, but it won’t work with built-ins or the optimised predicates which ship with SWI Prolog, since clause/2 cannot inspect them. With that caveat, ?- prove(Goal). should behave just like ?- Goal. for pure Prolog.
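For example, with a small made-up program (separate from the examples in these notes), prove/1 and a direct call should enumerate the same answers:
% Illustrative program, used only to check prove/1 against a direct call.
parent(tom,bob).
parent(bob,ann).
ancestor(X,Y) :- parent(X,Y).
ancestor(X,Y) :- parent(X,Z),ancestor(Z,Y).

?- ancestor(tom,Who).
Who = bob ;
Who = ann ;
false.

?- prove(ancestor(tom,Who)).
Who = bob ;
Who = ann ;
false.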
The first step is to make prove/1 stochastic, which I do by adding one line to the final clause:
prove(true) :- !.
prove((A,B)) :- !,prove(A),prove(B).
prove(Head) :-
    clause(Head,Body),
    (p(Head,P)->(random(X),1-P<X);true),   % the one added line
    prove(Body).
Whenever a clause head is selected during resolution, I check whether a p/2 fact applies to it; if one does, the proof is allowed to continue through that clause with probability P and made to fail with probability 1-P, as given by p(Head,P).
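The gate can be seen in isolation with a throwaway helper (hypothetical, not part of the interpreter): random(X),1-P<X succeeds exactly when X lands in the top P fraction of the unit interval, i.e. with probability P.
% Hypothetical helper: succeeds with probability P, fails otherwise.
gate(P) :- random(X),1-P<X.

% Over 10000 trials, roughly 2500 successes are expected for P=0.25.
?- aggregate_all(count,(between(1,10000,_),gate(0.25)),Successes).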
Part 2/2: sim(Goal,N,P)
The simulator performs a stochastic simulation of the proving process, parameterised by the p/2 facts. It amounts to running the prover N times and reporting the fraction of runs in which the proof succeeds:
sim(_,0,S,S) :- !.
sim(Goal,N0,Acc0,S) :-
    (prove(Goal)->Acc is Acc0+1;Acc is Acc0),
    N is N0-1,
    sim(Goal,N,Acc,S).

sim(Goal,N,P) :-
    sim(Goal,N,0,S),
    P is S/N.
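As a quick sanity check with the red/heavy program from earlier, the estimate should track the assigned probability (the exact figure varies from run to run):
% p(red(a),0.25) gates the red(a) fact, so the estimate should be near 0.25.
?- sim(red(a),10000,P).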
SYNTAX
The interpreter is limited to pure Prolog: no meta predicates, built-ins or “optimised” predicates (e.g. member/2) because they all break clause/2. It is straightforward to expand the coverage of the interpreter, but not necessarily desirable.
Probabilities are assigned using p(Goal,P).
SEMANTICS
p/2 specifies the probability that a matching clause is allowed to be used during the proof process. Each instance of p/2 is applied independently, every time it matches. Therefore, to interpret the results, the programmer must take into account both the assigned probabilities and the shape of the proof process.
Here is a transitive meeting example to illustrate the point.
meets_(sarah,lucy).
meets_(lucy,steve).
meets_(lucy,kolmogorov).
meets_(steve,chris).
meets_(steve,kolmogorov).
meets_(chris,kolmogorov).
meets(X,Y) :- meets_(X,Y).
meets(X,Y) :- meets_(X,Z),meets(Z,Y).
p(meets_(sarah,_),0.25).
p(meets_(steve,chris),0.5).
p(meets_(_,kolmogorov),0.1).
Who could possibly meet Kolmogorov?
?- setof(X,meets(X,kolmogorov),People).
People = [chris, lucy, sarah, steve].
How likely are they to meet him? The p/2 assignments above should be interpreted as follows:
- P=0.25 of Sarah meeting any particular person.
- P=0.5 of Steve and Chris meeting.
- P=0.1 of any particular person meeting Kolmogorov directly.
Taking Lucy as an example, she might meet Kolmogorov in 3 ways:
- Directly, P=0.1.
- Via Steve: Lucy -> Steve -> Kolmogorov, P=0.1 (only the final hop is gated).
- Via Steve and Chris: Lucy -> Steve -> Chris -> Kolmogorov, P=0.5*0.1=0.05.
One may therefore expect P for Lucy meeting Kolmogorov to be \(0.1+0.1+0.05=0.25\), but that is not the case: the three proof paths are independent events rather than mutually exclusive ones, so their probabilities cannot simply be summed:
?- sim(meets(lucy,kolmogorov),10000,P).
P = 0.231071.
What I’m actually approximating is the probability that the prover manages to prove meets(lucy,kolmogorov) in at least one of the three ways above. The probability that all three fail is \[(1-0.1)*(1-0.1)*(1-0.05) = 0.7695,\] so sim/3 should converge to \(1-0.7695 = 0.2305\) in this example; just as it does.
Now, look what happens if I assign probabilities to the meets/2 predicate rather than meets_/2:
...
p(meets(sarah,_),0.25).
p(meets(steve,chris),0.5).
p(meets(_,kolmogorov),0.1).
?- sim(meets(lucy,kolmogorov),10000,P).
P = 0.109619.
Certainly not what I want. Attaching p/2 to the recursive meets/2 predicate means that p(meets(_,kolmogorov),0.1) now gates every clause selection for the top-level goal itself, as well as for each recursive meets/2 call, so most proof attempts are throttled at 0.1 before the underlying facts are even reached. The result is unintuitive backtracking behaviour and ultimately incorrect answers.
CONCLUSIONS
A full probabilistic logic system is difficult to write, and even more difficult to make generally performant, so I was pleasantly surprised that just a little bit of meta-programming in Prolog could go this far. In the setting of a non-trivial program, the inherent problems with the naive approach above – especially its dependence on backtracking – are obvious. Yet I’m encouraged that even in complicated cases a more purpose-specific meta-interpreter could do the trick.
Further, I like that the semantics are not strict. There are many benefits to a proper model which yields a true joint probability distribution over the parameters, possible assignments and so on, and which provides theoretical guarantees, but such a model also forces significant structure and restriction onto the final result. I spent a lot of time attempting to Bayesian all-the-things, and in my experience restrictions on the form of a model can be an absolute show-stopper.
A real requirement from not too long ago was a simulation-based sales forecast for a physical-product company, built from historical sales information, a whole set of possible events and a set of judgement uncertainties about what could happen at pivotal points during the year. Thinking it over, even this naive implementation may be enough to produce a formidable model combining logic, numerical methods and stochastic simulation.
Just a first foray; to be continued.