Names, types, binding, and scope

Names and named components

Within a language there are typically a number of symbols and keywords associated with language features and constructs.

Special words, the words used to name actions or control forms within a language, may be keywords, reserved words, or predefined words:

User-defined names: may be applied to:

In addition to the use to which a user-defined name is put, different languages give users different degrees of freedom in determining valid names (or identifiers).

Among the choices a language designer must make are:

Later in the semester we will consider the use of identifiers with labels and subroutines; first we will consider variables in some detail.

Variables: types, binding, and scope

A program variable encompasses a number of attributes, each of which is implicitly or explicitly defined according to the language characteristics:

Note that the variable address and value are sometimes referred to as its l-value and r-value, respectively.
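As a minimal sketch of the l-value/r-value distinction in C++: on the left of an assignment a variable name stands for its storage location, while on the right it stands for the value stored there. The function name `lvalue_rvalue_demo` and the variables `a`, `b`, and `p` are illustrative only and do not come from the notes.

```cpp
// Hypothetical helper illustrating l-values (addresses) and r-values (contents).
int lvalue_rvalue_demo()
{
   int a = 5;       // a's l-value is its storage location; its r-value is 5
   int b = 0;

   b = a;           // right side uses a's r-value (5); left side uses b's l-value
   int *p = &a;     // &a makes a's l-value (its address) explicit
   *p = b + 2;      // store through that address: a now holds 7

   return a;
}
```

Calling `lvalue_rvalue_demo()` returns 7, since the final store goes through the address of `a`.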

Binding of attributes to variables: depending on the language, the attributes mentioned might exist for the entire life of the variable, or might change over time.

To clarify when the attributes of a variable take effect, we use the concept of binding.

A binding is static if the variable attribute is fixed before run time, and is unchanged throughout program execution.

A binding is dynamic, on the other hand, if the attribute can change at some point during execution.

Consider the different variable attributes with respect to binding:

Bindings may be explicitly declared by the user, or may be implicitly declared through the rules and conventions of the language, applied to the usage of the variable in the program itself.

Consider the following C++ code segment:

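#include <iostream>
using namespace std;

int mysquare(int x);   // prototype: x is a parameter, local to mysquare

int y;                 // global variable, explicitly declared as int

int main()
{
   cout << "Please enter an integer" << endl;
   cin >> y;
   cout << "The square of " << y << " is ";
   y = mysquare(y);
   cout << y << endl;
   return 0;
}

int mysquare(int x)
{
   int result;         // local variable, exists only during the call
   result = x * x;
   return(result);
}

Variable x has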

While variable y has

Most programming languages require explicit declaration of variables - supplying at least the name, usually the type, and occasionally an initial value for the variable.

Some languages, such as Perl, FORTRAN, and BASIC, allow implicit declarations: when a variable is first used it is automatically or implicitly declared, and language rules are applied to attempt to derive the other attributes (value, type, etc.).

Explicit declarations guarantee the compiler has complete information with which to apply type and error checking, but place extra restrictions on the programmer.

Languages which use dynamic type binding do not assign a type to a variable until a value is assigned to the variable: the type that is bound is one appropriate to the value assigned.

In some cases, the type can also be dynamically changed - e.g. you assign a variable an integer value at one point, and a string value at some later point.
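C++ itself binds types statically, so the following is only a rough imitation of dynamic type binding, not how a dynamically typed language is actually implemented. It assumes a hand-written tagged record (`DynamicValue`, `assign`, `type_name`, and `dynamic_typing_demo` are all hypothetical names) in which the type currently bound to the variable is recorded at run time by the most recent assignment.

```cpp
#include <string>

// Hypothetical stand-in for a dynamically typed variable: the tag records
// which type is currently bound, and is only known at run time.
struct DynamicValue {
   enum Kind { Integer, Text };
   Kind kind;
   int number;
   std::string text;
};

// Assigning an integer binds the "integer" type to the variable.
void assign(DynamicValue &v, int n)
{
   v.kind = DynamicValue::Integer;
   v.number = n;
}

// Assigning a string rebinds the variable to the "string" type.
void assign(DynamicValue &v, const std::string &s)
{
   v.kind = DynamicValue::Text;
   v.text = s;
}

// The type can only be discovered by inspecting the tag during execution.
std::string type_name(const DynamicValue &v)
{
   return v.kind == DynamicValue::Integer ? "integer" : "string";
}

std::string dynamic_typing_demo()
{
   DynamicValue v;
   assign(v, 42);                        // v is bound to an integer
   std::string first = type_name(v);
   assign(v, std::string("forty-two"));  // the same variable now holds a string
   std::string second = type_name(v);
   return first + " then " + second;
}
```

The run-time tag check in `type_name` hints at one cost of dynamic typing: every use of the variable has to inspect its current type during execution.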

Dynamic typing makes a language much more flexible, but has several disadvantages:

In C++ variables are statically typed; however, an implicit type conversion is applied when the type of an evaluated value (e.g. the right-hand side of an assignment statement) does not match the expected type (i.e. the left-hand side of the statement). This causes some of the same complications as dynamic typing.
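A small sketch of such an implicit conversion, using a hypothetical function `truncated_average` (not from the notes): the right-hand side is a double, the left-hand side is an int, and the fractional part is silently dropped.

```cpp
// Hypothetical helper showing a silent double-to-int conversion.
int truncated_average(int total, int count)
{
   double exact = static_cast<double>(total) / count;  // e.g. 7 / 2 gives 3.5
   int stored = exact;   // double implicitly converted to int: fraction dropped
   return stored;
}
```

With `total` 7 and `count` 2 the function returns 3, not 3.5, and most compilers accept the assignment without an error.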

Variable lifetimes: the lifetime of a variable is typically defined as the period during which it has storage space allocated to it.
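To make the idea of lifetime concrete, here is a hedged C++ sketch contrasting a local variable whose storage lasts for the whole run with one whose storage lasts only for a single call. The function names are hypothetical, and the two cases are chosen for illustration only.

```cpp
// Hypothetical functions contrasting two storage lifetimes in C++.
int count_calls_static()
{
   static int calls = 0;   // storage allocated once; lives until the program ends
   calls = calls + 1;
   return calls;
}

int count_calls_automatic()
{
   int calls = 0;          // storage allocated on each call, released on return
   calls = calls + 1;
   return calls;
}
```

Successive calls to `count_calls_static()` return 1, 2, 3, and so on, because the variable outlives each call, whereas `count_calls_automatic()` returns 1 every time.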

We will consider three classifications of variables, based on the way in which storage locations are bound to the variables:

Type checking: the process of ensuring that the operands of an operator are of compatible types.

A compatible type is one that is legal for the operator, or one which may be implicitly converted (or coerced) into a legal type.

Types which are not compatible provoke type errors.

Type checking is most efficiently carried out prior to execution, but complete checking before run time is not possible when dynamic type binding is allowed, or in cases (such as C++ unions) where the same memory location is permitted to store values of different data types at different times during execution.

A programming language is strongly typed if type errors are always detected.

We consider two types of type compatibility: name type compatibility and structure type compatibility.
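As an illustration, C++ treats struct types by name: two structs with identical members are still distinct types, while a typedef merely gives a second name to the same type. The sketch below assumes C++11 or later for `std::is_same`; the type names `Celsius`, `Fahrenheit`, and `Temperature` are invented for the example.

```cpp
#include <type_traits>

struct Celsius    { double degrees; };
struct Fahrenheit { double degrees; };   // same structure, different name
typedef Celsius Temperature;             // an alias, not a new type

// Identical structure is not enough: C++ compares struct types by name.
bool same_structure_is_same_type()
{
   return std::is_same<Celsius, Fahrenheit>::value;
}

// An alias names the very same type, so the two are compatible.
bool alias_is_same_type()
{
   return std::is_same<Celsius, Temperature>::value;
}
```

Here `same_structure_is_same_type()` is false and `alias_is_same_type()` is true; under structure type compatibility the first comparison would instead succeed.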

In fact, object-oriented languages also face the issue of object compatibility, but this will be addressed later in the semester.

Variable scopes: the scope of a variable is the range of program statements in which the variable is "visible".

For example, the scope of a variable declared within a C++ function is local to that function - it cannot be referenced from outside the function.

The local variables of a program unit or block are the variables which are visible within the block and which are also declared within that block.

The nonlocal variables of a program unit or block are the variables which are visible within the block but which are not declared within it.

Static scoping means the scopes of variables are identified prior to run time, whereas dynamic scoping means variable scopes are identified during execution.

In languages like Pascal, subprograms can be nested, creating a hierarchy of scopes.

If a language uses static scoping, then it is possible prior to execution time (e.g. by the compiler) to determine which variable is referenced by the use of an identifier at any point in the code.

In the case of C++, if an identifier matches a local variable (or parameter) then that variable is the one being referenced; otherwise the identifier is matched to a global variable of the same name. (Though you can force a match to the global variable, bypassing the local, by preceding the identifier name with "::".)

For example:

int result;    // global variable

int AddThree(int y)
{
   int result;        // local variable (hides the global inside this function)

   result = y + 3;    // assign y+3 to local variable

   // now add contents of global variable to local variable
   result = result + ::result;

   return(result);
}

In addition to allowing variables to be local to a subroutine, some languages allow variables to be local to a block.

In C++, for example, a variable can be declared within a compound statement such as a for loop, while loop, or if statement:

 // assorted code
 while (x < 3) {
    int i; // i is visible only in the while loop
    // assorted code
 }
 // and more assorted code

(Actually, you'll find some non-standard C++ compilers fail to support these scoping rules.)
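A further sketch of block-level scope, assuming a standard-conforming compiler: a variable declared inside a loop body hides an outer variable of the same name only within that block. The function `block_scope_demo` and its variables are hypothetical.

```cpp
// Hypothetical function showing a block-local variable hiding an outer one.
int block_scope_demo()
{
   int total = 0;
   int i = 100;              // outer i, visible throughout the function

   for (int k = 0; k < 3; k++) {
      int i = k;             // inner i hides the outer i within this block
      total = total + i;     // uses the inner i: 0 + 1 + 2
   }

   return total + i;         // outer i is visible again: 3 + 100
}
```

The function returns 103: the loop sums the inner `i` values, and the outer `i` is untouched by anything inside the block.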

Dynamic scoping is supported in some versions of APL, LISP, and SNOBOL.

In dynamic scoping, the scope rules are based on the calling sequence of subroutines, not on the way in which they are "structurally" nested.

If an identifier does not match any variables in the current function, we search the function which called it to find any matching variables, then the function which called that, and so on, until a match is found.

Thus, an identifier in a particular function statement can refer to completely different variables in different executions of the same program!

Obviously this can raise significant concerns with respect to readability and reliability, and can require significantly more run time checking.

However, dynamic scoping can eliminate a great deal of parameter passing, since the relevant variables (and hence values) are implicitly visible to the called routine!
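C++ is statically scoped, so dynamic scoping can only be imitated. The sketch below does so with an explicit stack of name-to-value tables, one per active call, searched from the most recent call back towards the oldest. Everything here (`Frame`, `call_stack`, `lookup`, `show_x`, `caller_one`, `caller_two`, and the use of -1 for "not found") is an assumption made for illustration, not a description of how APL, LISP, or SNOBOL implement it.

```cpp
#include <map>
#include <string>
#include <vector>

// One table of name -> value bindings per active function call (hypothetical).
typedef std::map<std::string, int> Frame;
std::vector<Frame> call_stack;

// Dynamic-scope lookup: search the current frame, then the caller's, and so on.
int lookup(const std::string &name)
{
   for (int i = static_cast<int>(call_stack.size()) - 1; i >= 0; i--) {
      Frame::const_iterator found = call_stack[i].find(name);
      if (found != call_stack[i].end())
         return found->second;
   }
   return -1;   // assumption for this sketch: -1 signals "no binding found"
}

// show_x declares no x of its own, so the x it sees depends on who called it.
int show_x()
{
   call_stack.push_back(Frame());   // enter show_x with an empty frame
   int seen = lookup("x");
   call_stack.pop_back();           // leave show_x
   return seen;
}

int caller_one()
{
   Frame frame;
   frame["x"] = 1;                  // caller_one's own x
   call_stack.push_back(frame);
   int seen = show_x();             // show_x finds caller_one's x
   call_stack.pop_back();
   return seen;
}

int caller_two()
{
   Frame frame;
   frame["x"] = 2;                  // caller_two's own x
   call_stack.push_back(frame);
   int seen = show_x();             // show_x finds caller_two's x
   call_stack.pop_back();
   return seen;
}
```

The same reference to `x` inside `show_x` yields 1 when reached through `caller_one` and 2 when reached through `caller_two`, with no parameter passed in either case. The search in `lookup` is also the extra run-time work that dynamic scoping requires.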