There are two kinds of abstraction we are concerned with:
process abstraction: this was considered in the previous sections on control structures and subprograms,
data abstraction: in which different data entities can be logically grouped by their set of common attributes, and distinguished by their differing attributes.
For example, we can group all stacks together as a data abstraction with common operations (pop, push, etc), and only address their differences (such as the data types which can be stored on the stack) when and where appropriate.
Both process and data abstraction contribute to our ability to modularize code:
An abstract data type is a group of program units and data items that includes only the data representation for one specific type of data and the subprograms that provide the operations for manipulating that data type.
The distinction between this and the traditional view of ADTs is that the data type interface is provided in the high level language, while the language implementation and underlying hardware and software conceal the ADT implementation details.
One of the key design issues in programming languages is determining the level of support the language supplies for user-defined abstract data types.
Object-oriented languages are a natural extension of the ADT concept.
Formally, an abstract data type satisfies the following conditions:
The program units which use objects of the defined type are called clients of the type
ADTs provide all the benefits discussed earlier for abstractions:
ADTs also provide increased reliability by enforcing the use of defined operations to manipulate the data types: programmers are not free to access the data directly, and can use it only in accordance with the access processes put in place by the ADT designer.
Note that in most languages the programmer has the ability to simulate the use of ADTs, but not the ability to enforce them.
In C, for instance, the programmer may create a data type and a set of access routines which - if used - give the desired behavior. However, it is very difficult to prevent programmers from circumventing the access routines if they so desire.
Thus, under our formal definition, C has poor support for ADTs.
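As a rough illustration of the point about simulated ADTs, the sketch below uses C-style code (written here so that it compiles as C++). The names IntStack, stack_init, stack_push, stack_size and the capacity MAXSIZE are hypothetical, chosen only for this example. It shows a client that uses the access routines and a client that bypasses them, with nothing in the language to prevent the second.

```cpp
// C-style "simulated" ADT: the representation is visible to every client.
const int MAXSIZE = 4;   // hypothetical capacity, chosen for this sketch

struct IntStack {
    int items[MAXSIZE];  // stack contents
    int count;           // number of items currently on the stack
};

// The intended access routines.
void stack_init(IntStack *s) { s->count = 0; }

void stack_push(IntStack *s, int n) {
    if (s->count < MAXSIZE) s->items[s->count++] = n;   // refuses to overflow
}

int stack_size(const IntStack *s) { return s->count; }

// A well-behaved client uses only the access routines.
int polite_client() {
    IntStack s;
    stack_init(&s);
    stack_push(&s, 10);
    stack_push(&s, 20);
    return stack_size(&s);
}

// Nothing stops a client from bypassing the routines entirely.
int rude_client() {
    IntStack s;
    stack_init(&s);
    s.count = 99;            // compiles, but the stack is now inconsistent
    return stack_size(&s);   // reports items that were never pushed
}
```

Here polite_client reports two items, while rude_client reports 99 even though nothing was pushed: the access routines give the desired behavior only if every client chooses to use them.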
In particular this must address the visibility of both the implementation details for the ADT's operations and the underlying data types used to implement the current ADT.
Operations to create and destroy instances are likely candidates (carrying out appropriate memory allocation and deallocation, as well as any appropriate initialization?)
Tests for equality/inequality are also reasonable possibilities.
C++ supports abstract data types through its class construct.
A class is defined with associated operations, or member functions, and data fields, or data members.
Once a class is defined and a name assigned to the class type, variables can be declared to be instances of that type.
For example, if a stack class is defined, then variables firststack and secondstack might be declared as two separate stack instances.
In implementation terms:
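the code for the member functions is shared by all instances of the class (e.g. the code for function push is not duplicated for each individual stack)
the data members are allocated separately for each instance (e.g. each stack gets its own set of data fields)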
if the new operator is used to explicitly create a class instance during execution then the created instance is heap-dynamic, as with any other data type (e.g. a variable has been declared as a pointer to a class instance, and during execution new is used to construct the class instance and assign its address to the pointer)
(Of course, the data members within the class could contain variables which are used for heap-dynamic allocation also)
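The implementation points above can be sketched with a small class. Counter and allocation_demo are hypothetical names introduced only for this illustration; the sketch assumes nothing beyond standard C++.

```cpp
// Hypothetical class used only to illustrate the two allocation styles.
class Counter {
public:
    Counter() : count(0) { }
    void increment() { count++; }
    int value() const { return count; }
private:
    int count;   // each instance has its own copy of this data member
};

int allocation_demo() {
    Counter first;                  // stack-dynamic: released at the end of the scope
    Counter *second = new Counter;  // heap-dynamic: created explicitly with new

    first.increment();              // the same increment code serves both instances,
    second->increment();            // but each call updates a different count
    second->increment();

    int total = first.value() * 10 + second->value();
    delete second;                  // heap-dynamic instances are released explicitly
    return total;
}
```

The two counters end with different values (1 and 2) even though only one copy of increment exists, which is the sharing of code and separation of data described above.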
Example: a declaration for an array-based C++ stack of integers:
    #include <iostream>
    using namespace std;

    const int MAXSIZE = 100;   // assumed capacity; the original does not give a value

    class stack {
       public:
          stack();            // constructor: initializes stack
          ~stack();           // destructor: cleans terminated stack
          void push(int n);   // push n onto stack
          void pop();         // remove top item from stack
          int top();          // return value of top item on stack
          int isempty();      // return 1 if stack empty, 0 otherwise
          int isfull();       // return 1 if stack full, 0 otherwise
          int getsize();      // return current number of elements on stack
       private:
          int stackarray[MAXSIZE];   // stack contents
          int elements;              // number of items currently on stack
    };

    stack::stack()            // body of the stack constructor
    {
       elements = 0;
       for (int i = 0; i < MAXSIZE; i++) stackarray[i] = 0;
    }

    stack::~stack()           // body of the stack destructor
    {
       // no real cleanup necessary
    }

    void stack::push(int n)   // body of the stack push routine
    {
       if (isfull() == 0) {
          stackarray[elements++] = n;
       }
    }

    // etc with the rest of the stack member functions

    // Declaring and using stacks
    int main()
    {
       stack s1, s2;                // s1 and s2 are instances of stacks
       s1.push(1);                  // push value 1 onto stack 1
       cout << s1.top() << endl;    // print the top value of stack 1
       s2.push(s1.top());           // push (a copy of) the top value
                                    //    from stack 1 onto stack 2
       // etc ...
    }

Note the following declaration and usage features for C++ classes:
(also note that neither has a return type, and neither uses the return statement)
For the stack example, suppose we replaced all the member functions with inline functions:
    class stack {
       public:
          stack() {
             elements = 0;
             for (int i = 0; i < MAXSIZE; i++) stackarray[i] = 0;
          }
          ~stack() { }
          void push(int n) { if (isfull() == 0) stackarray[elements++] = n; }
          void pop() { if (isempty() == 0) elements--; }
          int top() {
             if (isempty() == 0) return(stackarray[elements-1]);
             return(0);   // stack empty: no valid item, 0 is returned as a placeholder
          }
          int isempty() { if (elements == 0) return(1); else return(0); }
          int isfull() { if (elements == MAXSIZE) return(1); else return(0); }
          int getsize() { return(elements); }
       private:
          int stackarray[MAXSIZE];   // stack contents
          int elements;              // number of items currently on stack
    };
In C++ this is achieved by the use of friend functions.
Suppose we have a client function that can be implemented much more effectively given access to a class' private data, e.g. a peek function for the stacks considered above.
Within the declaration of our stack class, we must explicitly list peek as a friend function:
    class stack {
       friend int peek(const stack &s, int d);
       // ... and the rest of the usual stack declaration
    };

    // ...

    int peek(const stack &s, int d)   // peek at the d'th element from the top of stack s
    {
       // body of peek function
    }
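A fuller sketch of the friend mechanism follows. It assumes that peek receives the stack to inspect as a reference parameter, that depth 0 means the top item, and that -1 is returned when no such item exists; these choices, the capacity MAXSIZE, and the helper peek_demo are assumptions made for this illustration rather than details given above.

```cpp
const int MAXSIZE = 10;   // assumed capacity for this sketch

class stack {
    friend int peek(const stack &s, int d);   // peek may read the private members
public:
    stack() : elements(0) { }
    void push(int n) { if (elements < MAXSIZE) stackarray[elements++] = n; }
private:
    int stackarray[MAXSIZE];   // stack contents
    int elements;              // number of items currently on stack
};

// Returns the item d positions below the top (d == 0 is the top itself),
// or -1 if there is no such item (a sentinel chosen for this sketch).
int peek(const stack &s, int d) {
    if (d < 0 || d >= s.elements) return -1;
    return s.stackarray[s.elements - 1 - d];
}

int peek_demo() {
    stack s;
    s.push(5); s.push(6); s.push(7);
    // Read the top three items without popping anything.
    return peek(s, 0) * 100 + peek(s, 1) * 10 + peek(s, 2);
}
```

Without the friend declaration, the references to s.elements and s.stackarray inside peek would be rejected by the compiler, since both members are private.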
And, since garbage collection is carried out implicitly in Java, there are no destructors
A package can contain multiple related classes, and these classes can access variables and member functions from one another as long as they have public or protected modifiers, or no modifier at all.
Thus this has many of the same benefits of the friend functions of C++ (which are not supported in Java)
Parameterized ADTs in C++
Earlier we discussed generic functions in C++, using template functions to handle a variety of possible data types for parameters.
Templates are also useful in creating parameterized ADTs - for example stacks where a generic data type is used for the types of elements which can be pushed on the stack.
Here we modify the inline version of our stack as follows:
    template <class Type> class stack {
       public:
          stack() {
             elements = 0;
             stackarray_ptr = new Type [MAXSIZE];
             for (int i = 0; i < MAXSIZE; i++) stackarray_ptr[i] = Type();
          }
          ~stack() { delete [] stackarray_ptr; }   // array form of delete, to match new []
          void push(Type n) { if (isfull() == 0) stackarray_ptr[elements++] = n; }
          void pop() { if (isempty() == 0) elements--; }
          Type top() {
             if (isempty() == 0) return(stackarray_ptr[elements-1]);
             return(Type());   // stack empty: a default value is returned as a placeholder
          }
          int isempty() { if (elements == 0) return(1); else return(0); }
          int isfull() { if (elements == MAXSIZE) return(1); else return(0); }
          int getsize() { return(elements); }
       private:
          Type *stackarray_ptr;   // stack contents
          int elements;           // number of items currently on stack
    };
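To show how a parameterized ADT is used by clients, here is a compact, self-contained sketch. SmallStack, CAPACITY, and the two demo functions are hypothetical names for this illustration, and a fixed-size array is assumed in place of heap allocation to keep the example short.

```cpp
#include <string>

const int CAPACITY = 8;   // hypothetical fixed capacity

template <class Type>
class SmallStack {
public:
    SmallStack() : elements(0) { }
    void push(const Type &item) { if (elements < CAPACITY) items[elements++] = item; }
    void pop() { if (elements > 0) elements--; }
    Type top() const { return items[elements - 1]; }   // caller must check isempty() first
    bool isempty() const { return elements == 0; }
    int getsize() const { return elements; }
private:
    Type items[CAPACITY];   // stack contents
    int elements;           // number of items currently on stack
};

// The element type is supplied when the stack is declared.
int int_stack_demo() {
    SmallStack<int> s;
    s.push(3); s.push(7);
    s.pop();
    return s.top();
}

std::string string_stack_demo() {
    SmallStack<std::string> s;
    s.push("alpha"); s.push("beta");
    return s.top();
}
```

The compiler generates a separate version of the class for each element type that is actually used, so the same source serves both the integer stack and the string stack.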
This is ideally suited to modelling, simulating, or controlling real-world systems which can be regarded as sets of communicating entities, each with its own internal processes that are invoked as a result of communications with the outside world.
The central point to object-oriented programming languages is the ability to take abstractions of abstract data types:
the commonality of similar ADTs is extracted and used as a base type, which the variants can build upon through the concept of inheritance
True object oriented languages are (in a formal definition) required to support three key features: abstract data types, inheritance, and dynamic binding.
ADTs were discussed in the previous section, so we now focus on the concepts of inheritance and dynamic binding.
Inheritance is the process by which one class can be created as a special category of another class - inheriting (although possibly redefining) the variables and methods of the original class.
One of the key practical goals of including inheritance in a language is to improve the rate of software reuse - if a "reasonable" set of underlying classes is defined then much of the work of creating specialized instances is reduced.
TERMINOLOGY:
However, some languages support the concepts of
The goal here is to allow parent classes to use variables of (limited?) generic data types, and define methods which act on those variables.
These generic variables should be able to reference any of the subclasses, and the methods may be overridden or customized by those subclasses to handle the different data types appropriately.
When the (generic) variable calls the (overridden) method the call is dynamically bound to the proper method in the proper class.
This facilitates long term development and maintenance of software systems, where all the possible (specific) data types may not be known at the time of initial development.
For instance, in a pure OO system, ALL data types would be treated as classes - from bits and Booleans through floats and strings and all user defined types.
In such a system there should be no distinction between predefined types and user defined types - they are all classes and all are handled through messages.
This would be the purest method, but loses some of the efficiency one could obtain through allowing hardware manipulation of the most common, simplest data types and operations
Unfortunately, treating the simple scalar data types differently than objects is likely to lead to problems when one begins mixing the use of objects and non-objects (we will return to this discussion when we consider wrapper classes in Java)
Suppose the parent class defines a LIST type, and the derived class defines a SORTEDLIST type.
When evaluating type compatibility, if x is a LIST and y is a SORTEDLIST, under what conditions should they be considered type-compatible?
One possibility is to rule that a derived class is only a subtype (i.e. type compatible) if it only adds variables and methods and overrides inherited methods in "compatible" ways.
I.e. the overriding method can only replace the overridden method in a manner which has no possibility of generating type errors
(e.g. for absolute safety: the overriding method might be required to have identical numbers, types, and orders of parameters, and an identical return type)
These two versions of inheritance are referred to as (naturally) interface inheritance and implementation inheritance
Implementation inheritance makes the subclass somewhat dependent on the implementation of the parent (i.e. changes to the parent implementation directly impact the subclass).
On the other hand, interface inheritance can cause a loss in efficiency since the subclass cannot directly access the data variables in the same manner as the parent did - it must work through the publicly-available interface.
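The trade-off between the two forms of inheritance can be sketched in C++, where protected members give subclasses direct access and private members force them through the public interface. Account, AuditedAccount, and inheritance_demo are hypothetical names used only for this illustration.

```cpp
class Account {
public:
    Account() : visits(0), balance(0) { }
    void deposit(int amount) { balance += amount; }
    int getbalance() const { return balance; }
protected:
    int visits;    // visible to subclasses (implementation inheritance)
private:
    int balance;   // hidden from subclasses (interface inheritance only)
};

class AuditedAccount : public Account {
public:
    void audited_deposit(int amount) {
        visits++;          // direct access: depends on how Account stores visits
        deposit(amount);   // balance is reachable only through the public interface
    }
    int getvisits() const { return visits; }
};

int inheritance_demo() {
    AuditedAccount a;
    a.audited_deposit(50);
    a.audited_deposit(25);
    return a.getbalance() * 10 + a.getvisits();
}
```

If Account later changed how it records visits, AuditedAccount would have to change with it, whereas a change to how balance is stored would leave the subclass untouched.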
Single inheritance is much simpler to work with in terms of program maintenance and readability - when multiple inheritance is permitted it is possible to create much more complex (and confusing) dependencies between classes.
On the other hand, multiple inheritance allows for more flexible use (and re-use?) of existing classes and combinations thereof.
One of the practical issues with multiple inheritance is that of name collisions: suppose classes A and B each have a field named Initial and class C inherits from both A and B - how should the conflict between the two names be handled?
Should they only be allocated by the compiler (e.g. stack-dynamic), or only during execution using commands like new (e.g. heap-dynamic), or should both be allowed?
If objects can be heap-dynamic, then is garbage collection explicit or implicit?
We specified earlier that when a generic variable in a parent class references a method which is inherited (and possibly overridden) by the subclasses, the binding of the call to the correct subclass method takes place dynamically.
Given that, how do we carry out type checking?
If the language is intended to be strongly typed then type checking needs to be carried out statically, and this significantly restricts the ways in which polymorphic messages and methods can be used.
We need to check the actual vs the formal parameters for the method used, and the actual vs the expected return type.
The nature of the coercions carried out by the language determines our flexibility in the matching of protocols between the methods defined in the parent class and the overrides applied in the derived classes.
(The template classes of C++ provide considerable flexibility while still retaining reasonable type checking compared to pure OO languages such as Smalltalk)
If so, efficiency is improved since the cost of dynamic binding is typically much higher.
(This is supported in C++, where the use of the virtual keyword distinguishes the possible need for dynamic binding - and in fact it has been demonstrated that even with dynamic binding in C++ only five more memory references are required than with static binding)
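The difference between the two kinds of binding in C++ can be seen in a short sketch. Base, Derived, and the two helper functions are hypothetical names for this illustration.

```cpp
#include <string>

class Base {
public:
    virtual ~Base() { }
    // Non-virtual: the call is bound statically, from the declared type.
    std::string plain() const { return "Base::plain"; }
    // Virtual: the call is bound dynamically, from the actual object.
    virtual std::string dynamic() const { return "Base::dynamic"; }
};

class Derived : public Base {
public:
    std::string plain() const { return "Derived::plain"; }   // hides, does not override
    virtual std::string dynamic() const { return "Derived::dynamic"; }
};

// Both helpers see only a Base reference, whatever object is passed in.
std::string call_plain(const Base &b)   { return b.plain(); }
std::string call_dynamic(const Base &b) { return b.dynamic(); }
```

Passing a Derived object to call_plain still runs the Base version, because the compiler resolves that call from the parameter's declared type; call_dynamic runs the Derived version, because the choice is deferred until the actual object is known.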
Objects can be allocated on the heap using the new operator, and the lack of garbage collection necessitates the use of the delete operator for deallocation of heap-allocated objects.
    class List {
       public:
          List();
          ~List();
          ...
       private:
          Link *head;   // Link is class for list elements
          int length;
    };

    List::List()
    {
       head = NULL;
       length = 0;
    }

    List::~List()
    {
       // delete every element, saving each next pointer before its element is freed
       Link *p = head;
       while (p != NULL) {
          Link *pnext = p->next;
          delete p;
          p = pnext;
       }
    }

    // where destructor might get called
    ...
    List *l;
    ...
    l = new List;
    ...
    delete l;   // the destructor runs here
    ...
    class SubClassName: public ParentClassName {
       // derived class body
    };

    class SubClassName: private ParentClassName {
       // derived class body
    };

Private vs public derivations determine whether the public/protected methods of the parent class will be passed on to any classes which are subsequently derived from the new subclass.
For instance, if C is a subclass of B which is a subclass of A, and we use
class B: private A { };
then although B has access to the public/protected methods of A, C will not have such access.
(Note: methods that were private in A are inherited by B, but not visible in B!)
    class A {
       private:   int x;
       protected: int y;
       public:    int z;
    };

    class B: private A {
       // x is not visible
       // can access y using A::y
       // can access z using A::z
    };

    class C: private B {
       // cannot access A::x, A::y, or A::z
    };
Name conflicts are resolved by (the programmer) specifying the name of the parent, e.g. if both X and Y contain a foo function, we refer to them via X::foo() and Y::foo()
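A minimal sketch of this resolution follows. The class Z, which inherits from both X and Y, and the functions sum and conflict_demo are hypothetical additions for the illustration.

```cpp
class X {
public:
    int foo() const { return 1; }
};

class Y {
public:
    int foo() const { return 2; }
};

class Z : public X, public Y {
public:
    // A bare call to foo() here would be ambiguous; the parent must be named.
    int sum() const { return X::foo() + Y::foo(); }
};

int conflict_demo() {
    Z z;
    // Clients qualify the call in the same way.
    return z.X::foo() * 100 + z.Y::foo() * 10 + z.sum();
}
```

Writing z.foo() without a qualifier is a compile-time error, so the conflict is never resolved silently in favor of one parent.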
A pure virtual function is defined by the "= 0;" syntax shown below.
Such functions are declared without a body, and must be redefined in the derived classes, as shown below.
Any class that contains a pure virtual function (such as the shape class below) is said to be an abstract class, and no object of such a class can be created.
(Note that in the example below a reference to a shape is declared, but the object it eventually references is created from one of the non-abstract classes)
    class shape {
       public:
          virtual void draw() = 0;   // generic draw function
          ...
    };

    class circle: public shape {
       public:
          virtual void draw() { ...draw a circle... }
          ...
    };

    class rectangle: public shape {
       public:
          virtual void draw() { ...draw a rectangle... }
          ...
    };

    class square: public rectangle {
       public:
          virtual void draw() { ...draw a square... }
          ...
    };

    square s;               // a square shape
    rectangle r;            // a rectangle shape
    shape &ref_shape = s;   // a reference to shape s
    ref_shape.draw();       // takes ref_shape, a reference to a "general" shape,
                            //    and dynamically binds to the draw method for
                            //    a square
    r.draw();               // can statically bind this call, since r is known
                            //    to be a rectangle at compile time
Note that this format allows us to easily extend our collection of shapes.
Suppose we wish to add a triangle shape, and all the functions within the base shape class are declared as virtuals.
Then we need only do the following:
derive a new class, triangle, from shape
write the triangle versions of the virtual functions (draw, etc) and the code necessary to construct triangle objects
NONE OF THE PREVIOUSLY EXISTING CODE NEEDS TO BE ALTERED
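The steps above can be sketched as follows. To keep the example checkable, this version assumes draw returns a description string rather than producing graphics, and render is a hypothetical piece of pre-existing client code; neither detail comes from the shape example itself.

```cpp
#include <string>

// Existing code: a trimmed stand-in for the shape hierarchy.
class shape {
public:
    virtual ~shape() { }
    virtual std::string draw() const = 0;   // pure virtual: shape is abstract
};

class circle : public shape {
public:
    virtual std::string draw() const { return "circle"; }
};

// Existing client code, written before triangle existed.
std::string render(const shape &s) { return "drawing a " + s.draw(); }

// New code: the only addition needed to support triangles.
class triangle : public shape {
public:
    virtual std::string draw() const { return "triangle"; }
};
```

The render function is unchanged, yet it handles triangle objects correctly, because the call to draw is bound dynamically to whichever class the object actually belongs to.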
Once a function is declared as virtual, it is treated as virtual in all the subsequently derived classes
                Person
               /      \
         Student      Employee
               \      /
          Teaching assistant

    class Person { ... };
    class Student: virtual public Person { ... };
    class Employee: virtual public Person { ... };
    class TeachingAssistant: public Student, public Employee { ... };
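The effect of the virtual derivations in this hierarchy can be sketched with minimal class bodies. The id field and diamond_demo are hypothetical additions for the illustration; the class bodies elided above are not reproduced here.

```cpp
class Person {
public:
    Person() : id(0) { }
    int id;   // hypothetical data member, used to observe sharing
};

class Student : virtual public Person { };
class Employee : virtual public Person { };
class TeachingAssistant : public Student, public Employee { };

int diamond_demo() {
    TeachingAssistant ta;
    ta.id = 42;   // unambiguous: only one Person subobject exists

    Student &as_student = ta;
    Employee &as_employee = ta;

    // Both views of the object reach the same Person data.
    return (as_student.id == 42 && as_employee.id == 42) ? 1 : 0;
}
```

If the two derivations were not virtual, TeachingAssistant would contain two separate Person parts, and the assignment to ta.id would be rejected as ambiguous.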
The only difference between a C++ struct and a C++ class lies in the defaults: the members of a struct (and its derivations from base classes) are public by default, whereas those of a class are private by default.
Again, Java and C++ are very similar in terms of support for OO, but there are some differences:
(As a note, since some classes will only operate on objects, it is occasionally necessary to create a wrapper class for the simple types!)
Java has a single root class, Object, and ALL other classes must be a derivative of Object or one of its subclasses (i.e. no stand-alone classes other than Object)
Java provides interface definitions: specifying the named constants and method declarations for a class, but nothing else.
This allows for a sort of virtual class
Java methods are dynamically bound unless they are declared final, in which case they cannot be overridden and are statically bound.
(Essentially the reverse of the C++ case, where methods are statically bound unless they are virtual functions.)
Java provides packages, which can encapsulate a number of related classes and allow access to protected data between these classes.
This achieves many of the goals of the C++ friend functions, while providing a clearer ADT-based relationship for the sharing of such data.
    class OutOfBounds { };   // exception type thrown below; a minimal stand-in,
                             //    since its definition is not shown here

    template <class Type> class Stack {
       public:
          Stack(int MaxSize = 100);                  // create stack with size limit
          ~Stack() { delete [] stackptr; }           // delete stack space
          void push(Type& data);                     // push data element on stack
          void pop() { if (size > 0) size--; }       // delete top element
          Type top();                                // return copy of top stack element
          bool isempty() { return(size == 0); }      // is stack empty?
          bool isfull() { return(size == maxsize); } // is stack full?
          int getsize() { return(size); }            // get current stack size
          int getmaxsize() { return(maxsize); }      // get maximum stack size
          Type peek(int depth);                      // peek into stack at depth from top
       private:
          int size;         // current stack size
          int maxsize;      // maximum stack size
          Type *stackptr;   // ptr for array of elements
    };

    template <class Type> Stack<Type>::Stack(int MaxSize)
    {
       maxsize = MaxSize;
       stackptr = new Type[MaxSize];
       size = 0;
    }

    template <class Type> void Stack<Type>::push(Type& data)
    {
       if (size < maxsize) stackptr[size++] = data;
    }

    template <class Type> Type Stack<Type>::peek(int depth)
    {
       if ((size > depth) && (depth >= 0)) return(stackptr[size-(depth +1)]);
       else throw OutOfBounds();   // throw exception
    }

    template <class Type> Type Stack<Type>::top()
    {
       if (size > 0) return(stackptr[size-1]);
       else throw OutOfBounds();   // throw exception
    }
Operators are overloaded by defining functions named with the operator keyword:
<return type> operator <symbol> (<parameter_list>);
The C++ restrictions on operator overloading are as follows:
Furthermore, the following operators cannot be overloaded
Some operators can only be overloaded as operations (member functions of a class), not as stand-alone functions. These operators are = [] -> ()
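A short sketch of both styles of overloading follows. Vec2 is a hypothetical two-component vector class introduced only for this illustration.

```cpp
class Vec2 {
public:
    Vec2(int x0, int y0) { v[0] = x0; v[1] = y0; }

    // [] can only be overloaded as a member function.
    int &operator[](int i) { return v[i]; }        // read/write access
    int operator[](int i) const { return v[i]; }   // read-only access

    // + overloaded as a member: the left operand is the object itself.
    Vec2 operator+(const Vec2 &rhs) const {
        return Vec2(v[0] + rhs.v[0], v[1] + rhs.v[1]);
    }

private:
    int v[2];   // the two components
};

// == overloaded as a stand-alone function taking both operands as parameters.
bool operator==(const Vec2 &a, const Vec2 &b) {
    return a[0] == b[0] && a[1] == b[1];
}
```

With these definitions a client can write a + b, c[0], and c == d for Vec2 values using the same syntax as for the built-in types; the stand-alone == needs no friend declaration because it works entirely through the public [] operator.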