Here we consider some key issues in von Neumann architectures and their relevance to data manipulation and program control structures.
Core components:
The core components of a computer system in a von Neumann architecture are:
Program execution cycles:
The core steps in the execution of a program are as follows:
The CPU also updates the program counter so that it will point to the next instruction
(For large instructions it might be necessary to run through several of these fetch cycles)
Note that there is always a program running on the computer: typically the operating system includes a program which runs the entire time the computer is active.
It is responsible for things like
Common memory addressing modes:
All the executable instructions and data for a program
are stored within the computer memory while the program executes
The machine code instructions supported by the computer control logic typically allow only a limited number of different methods to access data in memory (aka addressing modes).
Since the data accesses and control structures of higher level languages must eventually be carried out by sequences of such machine code instructions, it is worthwhile briefly reviewing the common data access methods:
Generally the compiler determines which data items (variables) are allocated to the available registers, but some languages (such as C) allow the programmer to encourage the compiler to allocate specific variables to registers
this is the common access method when reading/assigning static variable values that didn't get allocated to a register
this is a common access method when dealing with records (or structs or classes etc) where fields are located as an offset from the starting address of the structure
it is also a common access method to stack-dynamic data, i.e. variables and parameters whose addresses are recorded as offsets from the top of the stack
the slowest method yet: here we must make one memory fetch to obtain the address, then a second fetch (at that address) to obtain the desired data
this commonly occurs when pointers (and heap dynamic variables) are used, assuming the pointer variable isn't currently copied in a register
Being aware of the implementation issues associated with the language constructs you use can make you a more effective programmer, but also be aware of the relative importance of readability and efficiency for your project.
On a related topic, it is also useful to be aware of the relative execution times required for different kinds of operation.
The table below might give ball park figures for the relative speeds of different kinds of operation (this is highly dependent on the hardware and operating system):
When source code in a high-level language such as C, C++, Pascal, or Ada is compiled into machine code, the subroutine calls and returns are translated into sequences which include instructions to push data onto the stack and retrieve data from the stack.
Similarly, references to variables, constants, and parameters within the source code are translated into machine code sequences with appropriate accesses into the stack.
As we shall see, some of the actions which must be carried out by the machine code sequences are quite simple, while others are significantly more advanced.
We will start with the basic call and return features, then add parameter passing, return values, local variable allocation, and references to non-local variables.
Simple calls and returns
When the compiler translates a subroutine call, the resulting machine
code will carry out a set of actions similar to:
(NOTE: pushes and pops automatically adjust the top-of-stack pointer)
This will be used for cleaning up the stack once the subroutine completes, and for accessing ancestors when dealing with scoping issues.
In your text this is referred to as a dynamic link.
So, with execution now begun in the called routine, we can logically view the stack as follows (assuming the stack grows "upward"):
    |                             |
    +-----------------------------+<--- Top of stack pointer
    | old top-of-stack pointer    |
    +-----------------------------+
    | return address (old PC)     |
    +-----------------------------+
    | run time stack contents     |
    | from already-active         |
    | routines                    |

Eventually the called routine will complete, and the actions to be carried out at that point (again, compiled as a sequence of machine code instructions) include:
The stack, program counter, and top-of-stack pointer now look exactly as they should to continue with execution.
Adding parameter passing and return values
The actions above did not consider how to pass values between the
calling and called routines, so we now add some additional steps to
the process.
When the subroutine call is made the action sequence may look like:
The value pushed onto the stack depends on the reference type: pass-by-value may be a copy of the actual data, pass-by-reference may be a memory address, etc.
(Where a pass-by-result parameter is used, a default (garbage) value may be pushed.)
If the function call was something like
x = foo(MyArray, MiddleInitial);
the stack might look something like:
    |                             |
    +-----------------------------+<--- Top of stack pointer
    | copy of MiddleInitial       |
    +-----------------------------+
    |                             |
    | copy of all the contents    |
    | of the array MyArray        |
    |                             |
    +-----------------------------+
    | space for return value      |
    +-----------------------------+
    | old top-of-stack pointer    |
    +-----------------------------+
    | return address (old PC)     |
    +-----------------------------+
    | run time stack contents     |
    | from already-active         |
    | routines                    |

When the subroutine completes the same cleanup process is invoked, but now any necessary values must also be copied back:
As far as our stack manipulation is concerned, this has the effect of popping all the parameters off at once
(the data is still in the same memory locations, but as far as the stack is concerned it's just garbage that will be overwritten with the next push operations)
(Dynamic) local variables
This still doesn't allow for dynamic local variables, which must somehow be allocated along with the rest of the information for the current function call.
To do so, we again add more steps when the subroutine is called:
if there are default values or initialization values then those can be pushed, otherwise the stack pointer can be adjusted to create the needed space
Note that if the stack pointer is adjusted to create space then the value of the uninitialized variable is whatever happened to be sitting in that memory location previously - hence the danger in using uninitialized variables.
If the called subroutine has local variables y, z
then the stack after the call to foo(MyArray,MiddleInitial)
might look like:
    |                             |
    +-----------------------------+<--- Top of stack pointer
    | space for variable z        |
    +-----------------------------+
    | space for variable y        |
    +-----------------------------+
    | copy of MiddleInitial       |
    +-----------------------------+
    |                             |
    | copy of all the contents    |
    | of the array MyArray        |
    |                             |
    +-----------------------------+
    | space for return value      |
    +-----------------------------+
    | old top-of-stack pointer    |
    +-----------------------------+
    | return address (old PC)     |
    +-----------------------------+
    | run time stack contents     |
    | from already-active         |
    | routines                    |

Upon completion, the sequence looks the same as in our previous version:
Recursive calls
Note that the mechanism described above completely supports nested function
calls
and recursive function calls.
Suppose we have the following recursive factorial function:
    int factorial(int N)                 // line 1
    {                                    // line 2
        int result;                      // line 3
        if (N < 3) result = N;           // line 4
        else {                           // line 5
            result = factorial(N-1);     // line 6
            result = N * result;         // line 7
        }                                // line 8
        return(result);                  // line 9
    }                                    // line 10

If we call factorial(5), which in turn calls factorial(4), which calls factorial(3), then the stack might look something like:
    |                             |
    +-----------------------------+<--- Top of stack pointer
    | variable result             |
    +-----------------------------+
    | copy of value N == 3        |   Activation
    +-----------------------------+   record
    | return value (will be 6)    |   for
    +-----------------------------+   factorial(3)
    | old top-of-stack pointer    |--+
    +-----------------------------+  |  points
    | return (address of line 7)  |  |  to
    +-----------------------------+<-+  here
     ----------------
    | variable result             |
    +-----------------------------+
    | copy of value N == 4        |   Activation
    +-----------------------------+   record
    | return value (will be 24)   |   for
    +-----------------------------+   factorial(4)
    | old top-of-stack pointer    |--+
    +-----------------------------+  |  points
    | return (address of line 7)  |  |  to
    +-----------------------------+<-+  here
     ----------------
    | variable result             |
    +-----------------------------+
    | copy of value N == 5        |   Activation
    +-----------------------------+   record
    | return value (will be 120)  |   for
    +-----------------------------+   factorial(5)
    | old top-of-stack pointer    |--+
    +-----------------------------+  |  points
    | return (to original caller) |  |  to
    +-----------------------------+<-+  here
     ----------------
    | run time stack contents     |
    | from already-active         |
    | routines                    |

During each execution of the factorial routine, accesses to N and result are made via offsets from the current stack pointer, so even though the instruction sequence executed is the same, the data values being accessed are different.
Referencing non-local variables
One last issue to address is how we access non-local variables.
This can take two forms: blocks within subroutines, and nested subroutine declarations (where access to non-local (but possibly non-global) variables is determined by the static program structure).
The problem in the latter case is that, while the structure is known statically, we need to ensure that the specific instance referred to is the correct one.
For instance, suppose we have a language (such as Pascal) that allows nested
declarations of functions, and that function blah
is declared
within
function foo
.
Now, suppose foo
calls itself recursively, and then the more
recent call
to foo
calls blah
.
Within blah
we should have access to the (non-local) variables
of foo
,
but specifically to the most recent call of foo
.
This means that from a called subroutine we must be able to identify not only which routines are its static ancestors, but also the most recent activation record on the stack for each of those ancestors.
Consider the following skeleton for a program with nested declarations:
    program main
       procedure A-inside-main
          procedure B-inside-A
             // B statements
          end-B
          procedure C-inside-A
             procedure D-inside-C
                // D statements
             end-D
             // C statements,
             // including calls to D
          end-C
          // A statements,
          // including calls to B and C
       end-A
       // main statements,
       // including calls to A
    end-main

We will use an additional stack value with each subroutine to point to the static ancestor of the routine.
This might be pushed immediately after the copy of the old stack pointer.
Suppose procedure D is called from procedure C above; then the stack might look like:
    |                             |
    +-----------------------------+<--- Top of stack pointer
    | space for D's locals        |
    +-----------------------------+
    | space for D's parameters    |
    +-----------------------------+
    | space for D's return value  |
    +-----------------------------+     static link points to the most
    | ptr to D's static ancestor  |     recent activation for C, in this
    +-----------------------------+     case would be same as old t-o-s ptr
    | old top-of-stack pointer    |
    +-----------------------------+
    | return address (old PC)     |
    +-----------------------------+
    | run time stack contents     |
    | from already-active         |
    | routines                    |

When checking for non-local references:
In fact, the distance along the chain can be determined at compile time, which considerably simplifies the implementation
(and the offset can also be determined at compile time)
Creating scopes for blocks
One way to create a new scope for a block is simply to treat it as a subroutine call with no parameters and no return values - though this adds much of the overhead of subroutine calls without making that apparent to the programmer.
An alternative is, for each subroutine, to identify the maximum space that can be required for block variables at any one time, and to allocate the block space along with the rest of the local variable space.
The compiler can then determine appropriate offsets for block variables, knowing that conflicts cannot arise between requests in separate blocks.
For example, consider the code fragment
    int foo(int b)
    {
        int a;
        for (int x = 1; ...) {
            for (int y = 0; ...) {
            }
        }
        for (int z = 0; ...) {
        }
    }

In this, we only need space for two block variables, since z is never active at the same time as x and y.
As a result, the activation record might look something like:
    |                             |
    +-----------------------------+<--- Top of stack pointer
    | space for block var y       |
    +-----------------------------+
    | space for block vars x, z   |
    +-----------------------------+
    | space for local var a       |
    +-----------------------------+
    | space for parameter b       |
    +-----------------------------+
    | space for return value      |
    +-----------------------------+
    | ptr to foo's static ancestor|
    +-----------------------------+
    | old top-of-stack pointer    |
    +-----------------------------+
    | return address (old PC)     |
    +-----------------------------+
    | run time stack contents     |
    | from already-active         |
    | routines                    |