Part 1 / ?
MicroKanren Hemann & Friedman, 2013 is a minimal relational programming language in the miniKanren minikanren.org family. It is designed to be simple and easy to extend, but more importantly, easy to understand. This post is going over a python implementation of the interpreter microKanren-py with lots of type signatures and descriptive variable names that are (usually) longer than a single character.
What This Post Is & Isn’t
This is an explanation of the mechanics of a microKanren implementation. It’s going to outline the code of an implementation step by step with examples.
This post isn’t an introduction to relational programming Introduction to Logic Programming with Clojure - Ambrose Bonnaire-Sergeant , a tutorial on any type of Kanren minikanren.org/#writtenTutorials , or an explanation of why one might choose relational programming If you’ve ever written SQL, you’ve already sort of done some relational programming without realizing it. .
An example we’d like to be able to compute in our microKanren implementation is as follows:
This example covers the four primitives available in microKanren:
freshis used to create new variables
eqis used to compare variables
conjis conjunction - the “and” operation
disjis disjunction - aka “or”
Here, we are asking microKanren to compute all the possible states where: the variable
a is equal to 1 AND (the variable
b is equal to 2 OR the variable
b is equal to 3). This has two possible results, the first where
a is 1 and
b is 2, and the second where
a is 1 and
b is 3. These are listed out on lines 8 & 9 in the result assertion.
Variables, Terms, Substitutions & States
The first things to define are variables, terms, substitutions, and streams. Variables are a lot like the variables you’re probably used to, except that here, once a variable is “defined” as something, that value can never change. New variables are created by calls to
fresh. We will represent our variables as strings. In the above example, the two variables are
Variables can be “defined” as pointing to terms: this can be either another variable, or an arbitrary value. This collection of mappings of
terms is stored in a dictionary of type
sub_dict. Above, we’re only using values: 1, 2, and 3, but we also could have defined a variable as being equal to another. In our example, the
result is a list of states.
State contains a a
sub_dict to keep track of substitutions and a
Counter to keep track of what variables have already been created. This
Counter is to resolve collisions - remember,
fresh always returns a new variable, even if the variable name has already been used. That’s why there a
#1 after our variable names in the
result. For convenience, we add a
empty to create an empty
__eq__ to compare
States to each other, and
__str__ for better representations for debugging.
The usage of a dictionary for substitutions deviates from the original paper, but it makes the job of extending and walking substitutions much easier as well as making solutions easier to read. Extending a substitution is exactly what is sounds like - returning a new dictionary where we’ve added a new mapping of a
Var to a
term to the existing dictionary of substitutions:
The next thing we need is a way to
walk variables. This means looking up variables in the substitution dictionary. If it points to another variable, look up that variable and repeat until you either get to a term or it no longer points to anything.
We’ve already seen that a problem could have multiple solutions, where each solution is represented as a
State object. A
stream is that list of solutions:
stream = list[State]
With all that out of the way, we’re ready to start talking about the structure of the four primitives we saw earlier:
disj. We want all four of these to be composable so we can arrange them however we’d like. The output of each of these functions also needs to be callable, so they can be called with different
States. The output of that callable should be a
stream of results. A possible definition of the goal type is:
goal = Callable[[State], stream]
This will hopefully make sense as we get into definitions of the four primitives and try to compose them.
Eq & Unify
The first primitive to look at is
eq. The job of
eq is to tell us if we can
unify two terms, that is to say if we can declare that they can be equal given our current knowlege of the substitutions. For example
Don’t worry about the calls to
conj for now, just remember
fresh creates variables and
conj is the conjuntion, aka “AND”
On lines 1-2 we asked for the stream of results where the variable
q is equal to
'hello'. This has one solution, where we declare that
q = 'hello'.
On lines 5-7 we ask for the stream of results where
q is equal to
'hello', but also to
'fail me'. In this case, it is impossible to reconcile the two;
q cannot be equal to
'fail me' when it is already equal to
'hello'. There are no possible substitutions that satisfy the constraints we’ve given, so we expect the empty stream as a result.
Our definition of
eq is as follows:
Remember, we want our primitives to return
goals, which are functions that accept a
State and return a
eq returns its inner function
_eq that accepts a
State object as an argument, and returns a new stream containing the
State with the result of attempting to
unify the two terms with the
sub_dict. The real comparison happens in the
We are going to use a very simple implementation of
unify. For a more in depth explanation, Peter Norvig
has written about a more sophisticated Prolog
implementation in PAIP
“Paradigms of Artificial Intelligence Programming”, 1992 - see chapter 11.2 for unification
. Others have elaborated on it
Here is our definition of
unify, followed by an explanation of what’s going on:
The first thing we do is
walk both terms. If they have already been defined as being equal to something else, this will tell us. We now want to see if the
terms they walked to can be equal, so we check the the 5 possible cases:
vboth point to the same
Var. This is fine, our
sub_dictis good as it is and we don’t need to extend it. Return the
upoints to a
vpoints to either a different
Varor some arbitrary
term. This is also fine, but we need to extend the
sub_dictwith the knowledge that
vare now equal.
- This is the same as (2) above, but with
vboth walk to the same arbitrary
term. Like (1), this is fine and we can again return the
sub_dictas it is.
- If none of the above are true, it means that
vboth walked to different
terms. This is a failure, and means that there is no possible way to equate
v. In this case, return None as there are no solutions.
fresh is our second primitive, it is how we create new variables. Our definition of
fresh is another place where we deviate from the original paper; in the paper,
fresh only creates one variable at a time and stores them as numeric IDs. This implementation allows you to define multiple variables at once and stores them as human readable names for ease of understanding the outputted solutions.
fresh before, but here’s a new example:
Here, we’ve asked
fresh to create two variables for us,
y, so it’s extracted the variable names from the lambda and used them as strings. One thing to note is that
fresh returns new variables everytime it’s called, so what happens if you use the same variable name multiple times? In the papers implementation, this isn’t a problem because it’s uses an auto incrementing counter as the variable, but we might have name collisions. We’d like our implementation of
fresh to be able to account for this by tweaking the variable name to avoid collisions:
There were two calls to
fresh asking to create a variable
x, but the second call should see that there’s already an
x and update the name to
Here is the implementation of
Line 3 uses pythons
to get the names of the arguments of the
lambda to use as variable names. Lines 3 & 4 check every variable name for collisions with existing variables in our
State, update the counter for the new variables, and create the new
Vars with updated names. We then call the lambda with our new terms. The body of the lambda should return a
goal. We then call that
goal with the updated
State, and that completes the job of
disj is the third of our primitives. It is the disjunction, the logical
OR of two or more goals. Unlike the usual boolean
OR we’re used to, we can’t have falsy states, we can only have valid states that meet the constraints we’ve set up or we can have no states at all. Since any state that goes into a
disj is valid, all we need to do is return all the states that
disj computes by applying the state to the
goals it received.
The original paper only accepted two arguments for
conj, we support an arbitrary number here that we
disj takes two or more goals as arguments. First it checks that it really did receive enough goals as arguments, then it returns its inner
goal function. That inner functio evaluates all the goals it’s been passed with the current state, then combines all the lists of solutions using the
The last primitive is
conj, the conjunction of two of more goals
This is the part of this paper that confused me the most by far. Hopefully I can do a better job explaining this, but it didn’t really make sense to me until I sat down and stepped through a bunch of examples on paper until I could convince myself this really works so if it’s not clear by the end of this explanation I’d encourage you to do the same.
. What we want this to mean with
streams of results is that the
conj of several
streams succeeds if there is a
State in each
stream such that they can all be valid at the same time. We’re interested in finding every possible set of
States where that holds true.
The way we want to accomplish this is by taking our first goal and computing all of it’s results. We then want iterate over those results, and try to evaluate our second goal using each result as the state.
This implemenatation is a bit trickier.
conj starts off like
disj by making sure it has enough goals. The inner function is again reducing over all the goals, but this time we start by taking the first goal and computing its output stream (that’s the third argument of
reduce on line 6). We then apply
bind to that
stream of results and the next
goal, and reduce all the
goals over this until we’re left with a single
stream of results.
bind is where the actual
conjing happens. First we check if the
stream has any values in it. You can think of conjunction like a set intersection here, the conjunction / intersection of an empty set of results with any other set of results is still the empty set, so we return an empty list
Then on line 13, we calculate
g(s), the goal applied to the first result in the
bind(s[1:], g) which recursively calls
bind again with all the results in stream except the first one and the same goal
g. We then combine these lists of results with
mplus. This way, we run goal
g against all the results in
s to get all the possible pairs of results.
If we had three or more goals to
reduce will take that
stream of results and continue to
bind them with the rest of the
goals, and return the final
Conclusion & Future Posts
With those four primitives, you should be able to get some very basic examples running. Unfortunately, we’ve only covered about half the paper so far - I was hoping to go over the entire thing but this was getting a little long and if I kept going I was afraid I’d have to take “Gentle” out of the title.
In an eventual part two, I’d like to go over the rest of the paper and cover the changes you have to make to able to handle infinite streams and stream interleaving, as well as modifying this interpreter to use pythons iterators
Iterator Type documentation
so the entire interpreter can be lazy.