Prolog is often compiled in a two-step process similar to the idea of byte-code or other intermediate representations in other programming languages. (i) Compile into an intermediate form designed to run on a Warren Abstract Machine (WAM), which is described below. This code is independent of the machine the "real" version will eventually run on, significantly reducing portability problems. (ii) Compile the WAM code into object code for the actual target platform. This two-step process allows some separation between the prolog-translation issues and the hardware-dependent issues of particular platforms. WAM Overview ============ The WAM is a stack based machine, where the stack is used for building and storing terms during the process of translating the prolog code into the intermediate WAM code. We'll define the HEAP as an array of cells (to be used as our stack). We have prolog variables and structures to process. Variables ========= Each variable is stored using its address, which is its heap index, k and we'll write this as (REF ,k) For a bound variable k will be the address of the thing it's bound to. For an unbound variable k will be the variable's own address Structures ========== Each structure has the form f(a1,...,aN), where f/N is functor and ai is the index of the subterm in the heap We'll write this as (STR, k) where k is the address of the functor cell representing f/N The N heap cells immediately following k are then the addresses for the N subterms A structure thus takes N+2 consecutive cells: - the initial "pointer" to where f/N is located, - the f/N cell representing the functor, - the N subterm cells (each of which could, in turn, be either a variable or a structure) Query Translation ================= Queries are translated into sets of instructions, using a collection of registers to track the current choices for parts of the term being solved. The registers are numbered sequentially, X1, X2, etc, have the same format as the heap cells, and are allocated in the order encountered. For example, for p(Z,q(Z,B),f(B)) X1 for storing p X2 for storing Z X3 for storing q X4 for storing B X5 for storing f This is basically a flattened version of the query, with registers allocated in the order in which items first appear. To translate terms, there are three different kinds of actions: 1. put a new structure, f/N, on the heap (f is the functor and N is the arity) 2. put a new variable on the heap 3. set a variable value for a variable already on the heap The necessary steps for each type of action are as follows: 1. Putting a new structure on the heap for structure f/N, and using register Xi put_structure f/N, Xi: 1a. push a new cell onto the heap (STR ,j) where j is the cell's address (this cell represents the term) heap[h] = (STR ,h+1) heap[h+1] = f/N 1b. copy the cell address, j, into the next available register, Xi Xi = heap[h] 1c. update the top of heap index h += 2 2. Pushing a new variable on the heap set_variable Xi: 2a. push a new (REF ,k) cell onto the heap, where k is it's own address heap[h] = (REF ,h) 2b. copy the cell address, k, into register Xi Xi = heap[h] 2c. update the top of heap index h++ 3. Setting the value of a variable on the heap set_value Xi: 3a. push a new cell into the heap and copy it into the register heap[h] = Xi 3b. update top of heap h++ Translation example =================== For term p(Z,q(Z,B),f(B)) Registers used: X1 references p X2 references Z X3 references q X4 references B X5 references f Actions to take: put_structure for each of p, q, f set_variable for each of Z, B set_value for each of X2, X3, X4, X5 The order in which we build things is to go left to right, and build each term once we have built all its subterms, refecting the order in which things get resolved, so: We put_structure for q first, then f, then p. We set_variable in the first-built term using it, in this case while building q for both. We set_value for variables when they are used in the terms built later, as well as for the "results" of terms. This gives the resulting instruction sequence: put_structure for q set_variable for Z set_variable for B put_structure for f set_value for B put_structure for p set_value for Z set_value for q set_value for f