# A Gentle Introduction to MicroKanren

Part 1 / ?

Contents

## Overview

MicroKanren Hemann & Friedman, 2013 is a minimal relational programming language in the miniKanren minikanren.org family. It is designed to be simple and easy to extend, but more importantly, easy to understand. This post is going over a python implementation of the interpreter microKanren-py with lots of type signatures and descriptive variable names that are (usually) longer than a single character.

### What This Post Is & Isn’t

This is an explanation of the mechanics of a microKanren implementation. It’s going to outline the code of an implementation step by step with examples.

This post isn’t an introduction to relational programming Introduction to Logic Programming with Clojure - Ambrose Bonnaire-Sergeant , a tutorial on any type of Kanren minikanren.org/#writtenTutorials , or an explanation of why one might choose relational programming If you’ve ever written SQL, you’ve already sort of done some relational programming without realizing it. .

## Example

An example we’d like to be able to compute in our microKanren implementation is as follows:

 `````` 1 2 3 4 5 6 7 8 9 10 `````` `````` goal = fresh(lambda a, b: conj( eq(a, 1), disj( eq(b, 2), eq(b, 3)))) result = goal(State.empty()) assert result == [ State({'a#1': 1, 'b#1': 2}, Counter(a=1, b=1)), State({'a#1': 1, 'b#1': 3}, Counter(a=1, b=1)), ] ``````

This example covers the four primitives available in microKanren: `fresh`, `eq`, `conj`, and `disj`.

• `fresh` is used to create new variables
• `eq` is used to compare variables
• `conj` is conjunction - the “and” operation
• `disj` is disjunction - aka “or”

Here, we are asking microKanren to compute all the possible states where: the variable `a` is equal to 1 AND (the variable `b` is equal to 2 OR the variable `b` is equal to 3). This has two possible results, the first where `a` is 1 and `b` is 2, and the second where `a` is 1 and `b` is 3. These are listed out on lines 8 & 9 in the result assertion.

## The code

### Variables, Terms, Substitutions & States

The first things to define are variables, terms, substitutions, and streams. Variables are a lot like the variables you’re probably used to, except that here, once a variable is “defined” as something, that value can never change. New variables are created by calls to `fresh`. We will represent our variables as strings. In the above example, the two variables are `a` and `b`.

Variables can be “defined” as pointing to terms: this can be either another variable, or an arbitrary value. This collection of mappings of `Var`s to `term`s is stored in a dictionary of type `sub_dict`. Above, we’re only using values: 1, 2, and 3, but we also could have defined a variable as being equal to another. In our example, the `result` is a list of states.

 `````` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 `````` ``````class Var(str): pass term = Union[Var, Any] sub_dict = dict[Var, term] class State(): def __init__(self, subs: sub_dict, var_counter: Counter): self.subs = subs self.var_counter = var_counter @classmethod def empty(cls): return cls({}, Counter()) def __eq__(self, other): return isinstance(other, State) and \ self.subs == other.subs and \ self.var_counter == other.var_counter def __repr__(self): return f"State({self.subs}, {self.var_counter})" def __str__(self): return self.__repr__(self) ``````

`State` contains a a `sub_dict` to keep track of substitutions and a `Counter` to keep track of what variables have already been created. This `Counter` is to resolve collisions - remember, `fresh` always returns a new variable, even if the variable name has already been used. That’s why there a `#1` after our variable names in the `result`. For convenience, we add a `classmethod` `empty` to create an empty `State`, `__eq__` to compare `State`s to each other, and `__repr__` and `__str__` for better representations for debugging.

The usage of a dictionary for substitutions deviates from the original paper, but it makes the job of extending and walking substitutions much easier as well as making solutions easier to read. Extending a substitution is exactly what is sounds like - returning a new dictionary where we’ve added a new mapping of a `Var` to a `term` to the existing dictionary of substitutions:

 ``````1 2 `````` ``````def extend_subs(source: Var, target: Any, subs: sub_dict) -> sub_dict: return subs | {source: target} ``````

### Walk

The next thing we need is a way to `walk` variables. This means looking up variables in the substitution dictionary. If it points to another variable, look up that variable and repeat until you either get to a term or it no longer points to anything.

Examples:

 ``````1 2 3 4 `````` ``````walk('a', {}) == 'a' # empty, return the original Var walk('a', {'a': 5}) == 5 # return the term it points to walk('a', {'a': 'b', 'b': 7}) == 7 # follow the pointers and return the term it points to walk('a', {'a': 'b', 'b': 'c'}) == 'c' # this one points to another Var ``````

Implementation:

 ``````1 2 3 4 `````` ``````def walk(t: term, subs: sub_dict) -> term: if isinstance(t, Var) and t in subs: return walk(subs[t], subs) return t ``````

### Streams

We’ve already seen that a problem could have multiple solutions, where each solution is represented as a `State` object. A `stream` is that list of solutions:

``````stream = list[State]
``````

### Goals

With all that out of the way, we’re ready to start talking about the structure of the four primitives we saw earlier: `fresh`, `eq`, `conj`, and `disj`. We want all four of these to be composable so we can arrange them however we’d like. The output of each of these functions also needs to be callable, so they can be called with different `State`s. The output of that callable should be a `stream` of results. A possible definition of the goal type is:

``````goal = Callable[[State], stream]
``````

This will hopefully make sense as we get into definitions of the four primitives and try to compose them.

### Eq & Unify

The first primitive to look at is `eq`. The job of `eq` is to tell us if we can `unify` two terms, that is to say if we can declare that they can be equal given our current knowlege of the substitutions. For example Don’t worry about the calls to `fresh` and `conj` for now, just remember `fresh` creates variables and `conj` is the conjuntion, aka “AND” :

 ``````1 2 3 4 5 6 7 8 9 `````` `````` g = fresh(lambda q: eq(q, 'hello')) result = g(State.empty()) assert result == [State({'q#1': 'hello'}, Counter(q=1))] g2 = fresh(lambda q: conj( eq(q, 'hello'), eq(q, 'fail me'))) result2 = g2(State.empty()) assert result2 == [] ``````

On lines 1-2 we asked for the stream of results where the variable `q` is equal to `'hello'`. This has one solution, where we declare that `q = 'hello'`.

On lines 5-7 we ask for the stream of results where `q` is equal to `'hello'`, but also to `'fail me'`. In this case, it is impossible to reconcile the two; `q` cannot be equal to `'fail me'` when it is already equal to `'hello'`. There are no possible substitutions that satisfy the constraints we’ve given, so we expect the empty stream as a result.

Our definition of `eq` is as follows:

 ``````1 2 3 4 5 `````` ``````def eq(u: term, v: term) -> goal: def _eq(state: State) -> stream: subs = unify(u, v, state.subs) return [State(subs, state.var_counter)] if subs else [] return _eq ``````

Remember, we want our primitives to return `goal`s, which are functions that accept a `State` and return a `stream`. `eq` returns its inner function `_eq` that accepts a `State` object as an argument, and returns a new stream containing the `State` with the result of attempting to `unify` the two terms with the `sub_dict`. The real comparison happens in the `unify` function.

We are going to use a very simple implementation of `unify`. For a more in depth explanation, Peter Norvig norvig.com has written about a more sophisticated Prolog swi-prolog.org implementation in PAIP “Paradigms of Artificial Intelligence Programming”, 1992 - see chapter 11.2 for unification . Others have elaborated on it inaimathi.ca/posts/how-unification-works as well.

Here is our definition of `unify`, followed by an explanation of what’s going on:

 `````` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 `````` ``````def unify(u: term, v: term, s: sub_dict) -> Optional[sub_dict]: u_target = walk(u, s) v_target = walk(v, s) if isinstance(u_target, Var) and isinstance(v_target, Var) and u_target == v_target: return s elif isinstance(u_target, Var): return extend_subs(u_target, v_target, s) elif isinstance(v_target, Var): return extend_subs(v_target, u_target, s) elif u_target == v_target: return s return None ``````

The first thing we do is `walk` both terms. If they have already been defined as being equal to something else, this will tell us. We now want to see if the `term`s they walked to can be equal, so we check the the 5 possible cases:

1. `u` and `v` both point to the same `Var`. This is fine, our `sub_dict` is good as it is and we don’t need to extend it. Return the `sub_dict` unchanged.
2. `u` points to a `Var`, `v` points to either a different `Var` or some arbitrary `term`. This is also fine, but we need to extend the `sub_dict` with the knowledge that `u` and `v` are now equal.
3. This is the same as (2) above, but with `u` and `v` swapped.
4. `u` and `v` both walk to the same arbitrary `term`. Like (1), this is fine and we can again return the `sub_dict` as it is.
5. If none of the above are true, it means that `u` and `v` both walked to different `term`s. This is a failure, and means that there is no possible way to equate `u` and `v`. In this case, return None as there are no solutions.

### Fresh

`fresh` is our second primitive, it is how we create new variables. Our definition of `fresh` is another place where we deviate from the original paper; in the paper, `fresh` only creates one variable at a time and stores them as numeric IDs. This implementation allows you to define multiple variables at once and stores them as human readable names for ease of understanding the outputted solutions.

We’ve seen `fresh` before, but here’s a new example:

 ``````1 2 3 4 5 6 `````` `````` g = fresh(lambda x, y: conj(eq(x, 'hello'), eq(y, x))) result = g(State.empty()) assert result == [State({'x#1': 'hello', 'y#1': 'hello'}, Counter(x=1, y=1))] ``````

Here, we’ve asked `fresh` to create two variables for us, `x` and `y`, so it’s extracted the variable names from the lambda and used them as strings. One thing to note is that `fresh` returns new variables everytime it’s called, so what happens if you use the same variable name multiple times? In the papers implementation, this isn’t a problem because it’s uses an auto incrementing counter as the variable, but we might have name collisions. We’d like our implementation of `fresh` to be able to account for this by tweaking the variable name to avoid collisions:

 ``````1 2 3 4 5 `````` ``````g = conj( fresh(lambda x: eq(x, 'hello')), fresh(lambda x: eq(x, 'bye'))) result = g(State.empty()) assert result == [State({'x#1': 'hello', 'x#2': 'bye'}, Counter({'x': 2}))] ``````

There were two calls to `fresh` asking to create a variable `x`, but the second call should see that there’s already an `x` and update the name to `x#2`.

Here is the implementation of `fresh`:

 ``````1 2 3 4 5 6 7 8 `````` ``````def fresh(f: Callable[[VarArg(term)], goal]) -> goal: def _fresh(state: State) -> stream: term_names = [str(term) for term in inspect.signature(f).parameters] state.var_counter.update(term_names) terms = [Var(f"{term}#{state.var_counter[term]}") for term in term_names] return f(*terms)(state) return _fresh ``````

Line 3 uses pythons `inspect` library `inspect` documentation to get the names of the arguments of the `lambda` to use as variable names. Lines 3 & 4 check every variable name for collisions with existing variables in our `State`, update the counter for the new variables, and create the new `Vars` with updated names. We then call the lambda with our new terms. The body of the lambda should return a `goal`. We then call that `goal` with the updated `State`, and that completes the job of `fresh`.

### Disj

`disj` is the third of our primitives. It is the disjunction, the logical `OR` of two or more goals. Unlike the usual boolean `OR` we’re used to, we can’t have falsy states, we can only have valid states that meet the constraints we’ve set up or we can have no states at all. Since any state that goes into a `disj` is valid, all we need to do is return all the states that `disj` computes by applying the state to the `goal`s it received.

The original paper only accepted two arguments for `disj` and `conj`, we support an arbitrary number here that we `reduce`.

Example:

 `````` 1 2 3 4 5 6 7 8 9 10 `````` ``````def test_disj(): g = fresh(lambda a: disj(eq(a, 'hello'), eq(a, 'hi'), eq(a, 12))) result = g(State.empty()) assert result == [ State({'a#1': 'hello'}, Counter(a=1)), State({'a#1': 'hi'}, Counter(a=1)), State({'a#1': 12}, Counter(a=1)) ] ``````

Implementation:

 `````` 1 2 3 4 5 6 7 8 9 10 11 `````` ``````def disj(*goals: goal) -> goal: if len(goals) < 2: raise Exception("disj needs 2 or more goals") def _disj(state: State) -> stream: streams = [g(state) for g in goals] return reduce(mplus, streams) return _disj def mplus(s1: stream, s2: stream) -> stream: return s1 + s2 ``````

`disj` takes two or more goals as arguments. First it checks that it really did receive enough goals as arguments, then it returns its inner `goal` function. That inner functio evaluates all the goals it’s been passed with the current state, then combines all the lists of solutions using the `mplus` function.

### Conj

The last primitive is `conj`, the conjunction of two of more goals This is the part of this paper that confused me the most by far. Hopefully I can do a better job explaining this, but it didn’t really make sense to me until I sat down and stepped through a bunch of examples on paper until I could convince myself this really works so if it’s not clear by the end of this explanation I’d encourage you to do the same. . What we want this to mean with `stream`s of results is that the `conj` of several `stream`s succeeds if there is a `State` in each `stream` such that they can all be valid at the same time. We’re interested in finding every possible set of `State`s where that holds true.

Example:

 ``````1 2 3 4 5 6 7 8 `````` ``````def test_conj(): g = conj( fresh(lambda a: eq(a, 'hi')), fresh(lambda b: eq(b, 42)), fresh(lambda c: eq(c, 'hello'))) result = g(State.empty()) assert result == [State({'a#1': 'hi', 'b#1': 42, 'c#1': 'hello'}, Counter(a=1, b=1, c=1))] ``````

The way we want to accomplish this is by taking our first goal and computing all of it’s results. We then want iterate over those results, and try to evaluate our second goal using each result as the state.

Implementation:

 `````` 1 2 3 4 5 6 7 8 9 10 11 12 13 `````` ``````def conj(*goals: goal) -> goal: if len(goals) < 2: raise Exception("conj needs 2 or more goals") def _conj(state: State) -> stream: return reduce(bind, goals[1:], goals(state)) return _conj def bind(s: stream, g: goal) -> stream: if not s: return [] else: return mplus(g(s), bind(s[1:], g)) ``````

This implemenatation is a bit trickier. `conj` starts off like `disj` by making sure it has enough goals. The inner function is again reducing over all the goals, but this time we start by taking the first goal and computing its output stream (that’s the third argument of `reduce` on line 6). We then apply `bind` to that `stream` of results and the next `goal`, and reduce all the `goal`s over this until we’re left with a single `stream` of results.

`bind` is where the actual `conj`ing happens. First we check if the `stream` has any values in it. You can think of conjunction like a set intersection here, the conjunction / intersection of an empty set of results with any other set of results is still the empty set, so we return an empty list `[]`

Then on line 13, we calculate `g(s)`, the goal applied to the first result in the `stream`, and `bind(s[1:], g)` which recursively calls `bind` again with all the results in stream except the first one and the same goal `g`. We then combine these lists of results with `mplus`. This way, we run goal `g` against all the results in `stream` `s` to get all the possible pairs of results.

If we had three or more goals to `conj`, the `reduce` will take that `stream` of results and continue to `bind` them with the rest of the `goal`s, and return the final `stream`.

### Conclusion & Future Posts

With those four primitives, you should be able to get some very basic examples running. Unfortunately, we’ve only covered about half the paper so far - I was hoping to go over the entire thing but this was getting a little long and if I kept going I was afraid I’d have to take “Gentle” out of the title.

In an eventual part two, I’d like to go over the rest of the paper and cover the changes you have to make to able to handle infinite streams and stream interleaving, as well as modifying this interpreter to use pythons iterators `Iterator Type` documentation so the entire interpreter can be lazy.