Prolog is often compiled in a two-step process similar to the
idea of byte-code or other intermediate representations in other
programming languages.
(i) Compile into an intermediate form designed to run on a
Warren Abstract Machine (WAM), which is described below.
This code is independent of the machine the "real" version
will eventually run on, significantly reducing portability
problems.
(ii) Compile the WAM code into object code for the actual target
platform.
This two-step process allows some separation between the prolog-translation
issues and the hardware-dependent issues of particular platforms.
WAM Overview
============
The WAM is a stack based machine, where
the stack is used for building and
storing terms during the process of
translating the prolog code into
the intermediate WAM code.
We'll define the HEAP as an array of cells
(to be used as our stack).
We have prolog variables and structures to process.
Variables
=========
Each variable is stored using its address,
which is its heap index, k
and we'll write this as (REF ,k)
For a bound variable k will be the address
of the thing it's bound to.
For an unbound variable k will be the variable's
own address
Structures
==========
Each structure has the form f(a1,...,aN),
where f/N is functor and ai is the index of
the subterm in the heap
We'll write this as (STR, k) where k is the address of
the functor cell representing f/N
The N heap cells immediately following k
are then the addresses for the N subterms
A structure thus takes N+2 consecutive cells:
- the initial "pointer" to where f/N is located,
- the f/N cell representing the functor,
- the N subterm cells (each of which could,
in turn, be either a variable or a structure)
Query Translation
=================
Queries are translated into sets of instructions,
using a collection of registers to track the
current choices for parts of the term being solved.
The registers are numbered sequentially, X1, X2, etc,
have the same format as the heap cells,
and are allocated in the order encountered.
For example, for p(Z,q(Z,B),f(B))
X1 for storing p
X2 for storing Z
X3 for storing q
X4 for storing B
X5 for storing f
This is basically a flattened version of the query,
with registers allocated in the order in which
items first appear.
To translate terms, there are three different kinds of actions:
1. put a new structure, f/N, on the heap
(f is the functor and N is the arity)
2. put a new variable on the heap
3. set a variable value for a variable already on the heap
The necessary steps for each type of action are as follows:
1. Putting a new structure on the heap
for structure f/N, and using register Xi
put_structure f/N, Xi:
1a. push a new cell onto the heap (STR ,j)
where j is the cell's address
(this cell represents the term)
heap[h] = (STR ,h+1)
heap[h+1] = f/N
1b. copy the cell address, j, into the next
available register, Xi
Xi = heap[h]
1c. update the top of heap index
h += 2
2. Pushing a new variable on the heap
set_variable Xi:
2a. push a new (REF ,k) cell onto the heap,
where k is it's own address
heap[h] = (REF ,h)
2b. copy the cell address, k, into register Xi
Xi = heap[h]
2c. update the top of heap index
h++
3. Setting the value of a variable on the heap
set_value Xi:
3a. push a new cell into the heap and copy it
into the register
heap[h] = Xi
3b. update top of heap
h++
Translation example
===================
For term p(Z,q(Z,B),f(B))
Registers used:
X1 references p
X2 references Z
X3 references q
X4 references B
X5 references f
Actions to take:
put_structure for each of p, q, f
set_variable for each of Z, B
set_value for each of X2, X3, X4, X5
The order in which we build things is to
go left to right, and build each term once
we have built all its subterms, refecting the
order in which things get resolved, so:
We put_structure for q first, then f, then p.
We set_variable in the first-built term using it,
in this case while building q for both.
We set_value for variables when they are used in the
terms built later, as well as for the "results" of terms.
This gives the resulting instruction sequence:
put_structure for q
set_variable for Z
set_variable for B
put_structure for f
set_value for B
put_structure for p
set_value for Z
set_value for q
set_value for f