CSCI 160: Fall 2000, Sections F0001-F0004, Pointers

CSCI 160: Fall 2000: Pointers

A number of times over the semester we've made reference to the location of variables in memory during the execution of a program.

C++ includes a powerful mechanism for finding out where in memory a variable is stored, and thereafter accessing the memory location without further reference to the variable name.

This section deals with the use of pointers and addresses for variables.

Every variable has associated with it a number of key features:

the variable's data type,
the variable's name,
the location (or address) in memory where the variable is stored,
and the variable's current value

The first type and name are specified by the developer when the variable is declared, the address is determined "behind the scenes" when the program is compiled and loaded, and the current value of course changes during program execution.

We can find out where in memory a variable is stored using the address operator, &

#include <cstdio>
using namespace std;

int main()
{
   int i = 3; 
   float f = 0.0;

   printf("Variable i has value %d,\n", i);
   printf("   takes up %d bytes of memory\n", sizeof(i));
   printf("   and is stored at address %p\n", &i);

   printf("Variable f has value %g,\n", f);
   printf("   takes up %d bytes of memory,\n", sizeof(f));
   printf("   and is stored at address %p\n", &f);
}

On a sample run, this produced output

Variable i has value 3,
   takes up 4 bytes of memory,
   and is stored at address 0x11fffbff0
Variable f has value 0,
   takes up 4 bytes of memory,
   and is stored at address 0x11fffbff4

The prefix 0x indicates the number is being displayed in hexadecimal notation rather than decimal.

In hexadecimal (or just hex) 0..9 represent the normal digits and a through f represent values 10 through 15:

 hex   decimal     hex    decimal        hex   decimal
   0 == 0           10      16           a0     160
   ...              11      17           a1     161
   9 == 9            ...
   a == 10          1a      26
   b == 11          1b      27
   c == 12          1c      28          etc
   d == 13          1d      29
   e == 14          1e      30
   f == 15          1f      31           ff    255

Pointers: a way of storing addresses

We will find ways to use the memory addresses of variables, but to do so we need to be able to remember the address.

A pointer variable is a variable that can be used to store the address of another variable.

To create a pointer variable, we specify a data type, the variable name, and the * symbol:

int x;     // x is a regular integer variable

int *ptr;  // ptr can store the pointer to an integer variable,
           // i.e. in ptr we can store the addresses of integer variables

ptr = &x;  // store the address of variable x in the ptr variable

Note that the pointer must "point at" the right kind of data - i.e. we can set up variables to point at integers, but these are different than variables that will point at floats, or at chars;

int i;
float f;

int   *i_ptr;  // i_ptr can point at integers
float *f_ptr;  // f_ptr can point at floats

i_ptr = &i;    // now i_ptr holds the memory address of variable i
f_ptr = &f;    // now f_ptr holds the memory address of variable f

Using pointers to access and change variable contents

If we have a pointer to a variable, it means we know the location of the variable in memory.

If we know the location of a variable in memory, then we can "go to" that location and change its contents -- thus also changing the variable value.

If ptr contains a pointer to another variable (say variable x), then *ptr describes the variable contents

int x;               // integer variable x
int *ptr;            // ptr can hold pointers to integer variables

ptr = &x;            // ptr now points to x
*ptr = 3;            // store 3 in the variable ptr points at,
                     //  in this case is equivalent to x = 3

printf("%d\n", x);   // prints out the value of x, i.e. 3

Changing the pointer variable changes what it points at, e.g.

int x, y, z;
int *ptr;

ptr = &x;  // ptr points to x
*ptr = 1;  // x = 1

ptr = &y;  // ptr points to y
*ptr = 2;  // y = 2

ptr = &z;  // ptr points to z
*ptr = 3;  // z = 3

Array names as pointers

"Behind the scenes" an array name can be treated as a pointer to the first element in the array:

int numbers[5];  // create an array of 5 integers
numbers[0] = 33;

int *ptr1, *ptr2;  // create two pointers to integers

// set both ptr1 and ptr2 to point to the "first" integer in the array
ptr1 = numbers;   
ptr2 = &numbers[0];

// print out the first integer in the array, 
//   in a variety of different ways
// (each prints out 33)
printf("%d\n", *ptr1);
printf("%d\n", ptr1[0]);
printf("%d\n", *ptr2);
printf("%d\n", ptr2[0]);
printf("%d\n", *numbers);
printf("%d\n", numbers[0]);

In fact, with arrays, we can access other array elements by using offsets from the pointers, e.g. *(ptr+0) is the first array element, *(ptr+1) is the next array element, *(ptr+2) is the next, etc etc:

#include <cstdio>
using namespace std;

int main()
{
   int numbers[5] = { 0, 1, 2, 3, 4 };
   int *ptr;

   ptr = numbers;

   for (int i = 0; i < 5; i++) {
       printf("Element %d", i);
       printf(" has address %p", &numbers[i]);
       printf(" and contents %d\n", *(ptr+i));
   }
}

This produces output

Element 0 has address 0x11fffbfd0 and contents 0
Element 1 has address 0x11fffbfd4 and contents 1
Element 2 has address 0x11fffbfd8 and contents 2
Element 3 has address 0x11fffbfdc and contents 3
Element 4 has address 0x11fffbfe0 and contents 4

This also works with other types of arrays, and is very commonly applied when working with strings (character arrays).

Suppose we have two strings, stringone and stringtwo, and we want to skip the first 10 characters of stringone and copy the rest into stringtwo:

strcpy(stringtwo, (stringone+10) );

stringone is essentially a pointer to the first character in the string, so stringone+10 is like a pointer to the position 10 characters into the string.

#include <cstdio>
using namespace std;

const int STRSIZE = 20;
typedef char string[STRSIZE];

int main()
{
   string mystr;

   strcpy(mystr, "This is the string");
   for (int i = 0; i < strlen(mystr); i++) {
       printf("%s\n", (mystr+i));
   }
}

Produces the following results:

This is the string
his is the string
is is the string
s is the string
 is the string
is the string
s the string
 the string
the string
he string
e string
 string
string
tring
ring
ing
ng
g

Differences between array names and pointers

OK, above we've shown how the name of an array and a pointer variable can be treated as almost the same thing in many circumstances, but there are some important differences.

Consider the following code segment

int main() {

   int x;      // x is an int
   int *x1;    // x1 is a ptr to an int
   int x2[5];  // x2 is an array of 5 ints
   int **x3;   // x3 is a ptr to a ptr to an int

   x1 = new int[5]; // legal, x1 pts to the first of five ints

   x1 = x2;         // legal, x1 pts to the first int of x2
   x2 = x1;         // not legal, x2 acts like a constant pointer,
                    //            we cannot change x2 to point at x1

   x1 = &(x2[0]);   // legal, x1 pts to the first int of x2
   x1 = &x2;        // not legal, x1 cannot be a ptr to an array

   x1 = &x;         // legal, x1 pts to x
   x3 = &x1;        // legal, x3 pts to x1
   x3 = &x2;        // not legal, x3 has type int**
                    //        but x2 has type int (*)[5]
}

Passing addresses to functions

We can achieve the same effect as reference parameters by using pointers:

we take the address of a variable,
we pass the address as the parameter to a function,
the function uses the address as a pointer through which it can alter the variable contents

#include <cstdio>
using namespace std;

void swap(int *x, int *y);

int main()
{
   int var1 = 3;
   int var2 = 17;
   swap(&var1, &var2);
   printf("%d\n", var1); // var1 is now 17
   printf("%d\n", var2); // var2 is now 3
}

void swap(int *x, int *y)
{
   int tmp;
   tmp = *x;  // take the value x points to, 
              //    and store it in tmp
   *x = *y;   // take the value y points to, 
              //    and store it wherever x points to
   *y = tmp;  // take the value in tmp,
              //    and store it wherever y points to
}

In C (the predecessor to C++) there were no reference parameters, so programmers in C had to use pointers in this fashion if they wanted a function to change a variable's value.

Dynamically creating arrays using pointers

One of the concerns we had with arrays earlier was that we have to know the size of the array at compile time - our array declarations had to be of the form

DataType arrayname[SIZE];

where SIZE was some sort of constant expression.

Now that we have pointers, we will be able to use a technique to dynamically create arrays: i.e. create arrays of specific sizes we determine during program execution.

The technique is to use the new operator, and store a pointer to the resulting array:

#include <cstdio>
using namespace std;

int main()
{
   int arrsize;   // variable to hold the user-chosen array size
   float *arr;    // pointer that will ultimately point to the
                  //   beginning of the array

   // get the array size as input from the user
   do {
      printf("Please enter a positive array size\n");
      printf("%d", &arrsize);
   } while (arrsize < 1);

   // now dynamically create an array of the right size
   arr = new float[arrsize];

   // play with the array contents (initialize them, then print them out)

   for (int i = 0; i < arrsize; i++) {
       arr[i] = i*0.5;
   }

   for (int i = 0; i < arrsize; i++) {
       printf("%g\n", arr[i]);
   }
}

If we entered an array size of 6, the result would be:

Please enter a positive array size
0
0.5
1
1.5
2
2.5

We can use a similar technique to dynamically create structs, use the new operator and store a pointer to the dynamically created struct:

#include <cstdio>
using namespace std;

struct DataItem {
    int number1;
    float number2;
} ;

int main()
{
   DataItem *dptr;  // dptr is a pointer to a DataItem struct

   dptr = new DataItem; // dynamically create a new DataItem,
                        //   and store the pointer to it in dptr

   // access to the struct fields can now be achieved through
   //    the -> operator, instead of the usual . operator
   dptr->number1 = 3;
   dptr->number2 = 26.7;

   printf("%d\n", dptr->number1);
   printf("%g\n", dptr->number2);
}

This should output

3
26.7

Note that with pointers to structs we use the -> operator to access the struct fields instead of the . operator, but otherwise access is the same as before.

NOTE ON MEMORY USE:

When your program explicitly uses new to dynamically create new variables, the storage space used does not disappear when the particular function ends ... the storage space isn't released until either

your program ends, or
you explicitly deallocate the storage using delete
```
delete dptr;
```

Dynamic data structures - data structures that grow and shrink over time

Even though we can now dynamically decide how big an array should be when we create it, what happens if we later change our mind?

For example, we decide to add more items later, or delete some items later ... we don't have the ability to change the array size once it exists.

A possible solution is to create a new array of the desired new size, and copy the needed data from the old version to the new one - but if we frequently change array sizes this would be slow and inefficient.

Solution: Dynamic Data Structures

What we need are data structures that we can add-to and delete-from as time goes on, possibly adding or removing a single piece of data at a time as needed.

These are extremely common structures in practical programming - we'll deal with them extensively in CSCI 161 and later courses.

Here we'll give a short introduction and example (a variation of this is the problem for assignment 6...)

The data structure we will consider is a linked list - we will store our collection of data as a list of single data items: each element of the list will have some data associated with it, and will also have a pointer to the next data item in the list.

When we want to add new items to the list, we:

decide where in the list the new item should go
dynamically allocate the new list item
adjust the pointers within the old list so that the new item is included

            Original pair of items in a list
+------------------+      +------------------+
| data for item 1  |      | data for item 2  |
+------------------+      +------------------+
| ptr to next item ======>| ptr to next item =====> (currently no more items)
+------------------+      +------------------+

            Revised list with new item added at the end
+------------------+  +------------------+  +------------------+
| data for item 1  |  | data for item 2  |  | data for item 3  |
+------------------+  +------------------+  +------------------+
| ptr to next item ==>| ptr to next item ==>| ptr to next item ==> (no more)
+------------------+  +------------------+  +------------------+

            Alternatively, revised list with new item added in middle 
+------------------+  +------------------+  +------------------+
| data for item 1  |  | data for item 3  |  | data for item 2  |
+------------------+  +------------------+  +------------------+
| ptr to next item ==>| ptr to next item ==>| ptr to next item ==> (no more)
+------------------+  +------------------+  +------------------+

Common linked-list structure

Linked lists are an extremely common data type, so here we'll consider a typical layout.

First, we need a struct data type that describes the individual list items - this will have to include a field to store the data and a field to store the pointer to the next data item:

struct ListItem {
   int data;
   ListItem *nextptr;
} ;

We might also create a separate data structure to represent the list as a whole - maybe something that keeps track of where the list begins, and how big the list is:

struct LinkedList {
   int listsize;
   ListItem *head;
} ;

By convention, the first element in a linked list is usually referred to as the head of the list, and the last element in a linked list is usually referred to as the tail of the list.

Initializing our list

In the beginning, we need a linked list variable, and the list starts out empty:

LinkedList mylist;
mylist.head = NULL;  // NULL means head doesn't point at anything yet
mylist.listsize = 0; // the list is currently empty

Creating new elements to add to our list

Before we can add any data to our list, we need to dynamically allocate a new data item, and store the data in that:

ListItem *ptr;  // temporary pointer, used over and over for
                // creating new data items

ptr = new ListItem; // creates new ListItem, and stores its address in ptr

ptr->data = 3;  // whatever the new data is, store it in the list item
                // note the use of -> instead of .
                //     this is done because ptr is a pointer to a struct,
                //     not a struct itself

Inserting the first element of our list

To insert the element in our list, we first need to decide where in the list we will insert it: at the head? at the tail? somewhere in the middle?

Let's assume for now that we'll always insert the element at the head of the list;

We need to set the head of the list to point to the new element, and also to make sure that the "next" field of the new element points to the old head of the list (to ensure we don't lose the rest of our list)

The order we do this in is important:

Set the "next" field of our new element to point to the old head of the list
Set the head of the list to now point at our new element

We should also remember to update the size of our list

ptr->nextptr = mylist.head;
mylist.head = ptr;
mylist.listsize++;

Deleting list elements

If we chose to delete an element from the list and free up its memory space, the order is again important.

Suppose we want to delete the front element of a list:

ptr = mylist.head;            // get a temporary ptr to the current head
mylist.head = ptr->nextptr;   // advance the head of the list to 
                              //    the next item in line
ptr->nextptr = NULL;          // eliminate dangling pointers
delete ptr;                   // free memory space

A simple linked-list example

#include <cstdio>
using namespace std;

// WordStrings will be 20-character strings
const int WordSize = 20;
typedef char WordString[WordSize];

// A linked list will consist of a series of ListItems,
//   each item needs to have stored data (a word)
//   and a pointer to the next item in the list
struct ListItem {
   WordString word;
   ListItem *next;
} ;

// A list will have a pointer to the first item in the list (head)
//   a pointer to the last item in the list (tail),
//   and a counter recording how many items are currently in the list (size)
struct LinkedList {
   ListItem *head;
   ListItem *tail;
   int       size;
} ;

void Getword(WordString w);
// gets a non-null string from the user

void InsertAtHead(LinkedList& L, WordString w);
// creates a new list item, containing the string w,
// and inserts it at the head of linked list L

void PrintList(LinkedList L);
// print the contents of the list, in order from head to tail

int main()
{
   LinkedList list;  // a linked list of words

   WordString newword; // temporary storage for newly-entered words

   char cmd;  // temporary storage for user-entered command selections

   bool quit = false; // flag to indicate exit condition

   // initialize the linked list to empty
   list.head = NULL;
   list.tail = NULL;
   list.size = 0;

   printf("Welcome!\n");
   do {
      // determine if the user wants to quit or 
      //    add new items to the linked list

      printf("Enter A to add a word or Q to quit\n");
      printf("%c", &cmd);

      if (cmd == 'A') {
         // if the user chooses to add new items to the list,
         //    get the word to add,
         //    insert it at the head of the list,
         //    and print out the revised list contents
         Getword(newword);
         InsertAtHead(list, newword);
         PrintList(list);

      } else if (cmd == 'Q') {
         quit = true;

      } else {
         printf("Invalid command, %c\n", cmd);
      }

   } while (!quit);
   printf("Bye!\n");
}

void Getword(WordString w)
// gets a non-null string from the user
{
   do {
      printf("Please enter a word\n");
      scanf("%20s", w);
   } while (strlen(w) == 0);
}

void InsertAtHead(LinkedList& L, WordString w)
// creates a new list item, containing the string w,
// and inserts it at the head of linked list L
{
   ListItem *itemptr;

   // create a new list item
   itemptr = new ListItem;
   // store the data word in the new list item
   strcpy(itemptr->word, w);

   // we want the new item inserted at the head of the list,
   // so the "next" field in the new item must point to the
   // "old" head of the list
   itemptr->next = L.head;
   // then update the overall head of the list to point to the
   // new list item
   L.head = itemptr;

   // don't forget to update the size of the list (which has increased by 1)
   L.size++;

   // if this is the first item added to the list
   //    then it is also the "tail" of the list, 
   //    so update the list's tail pointer
   if (L.size == 1) L.tail = itemptr;
}

void PrintList(LinkedList L)
// print the contents of the list, in order from head to tail
{
   ListItem *itemptr;
   itemptr = L.head;

   printf("The words in the list are:");

   // when the next list item is null it's time to quit,
   //   otherwise print the next list item,
   //   then set pointer to refer to the following list item
   while (itemptr != NULL) {
      printf(" %s", itemptr->word);
      itemptr = itemptr->next;
   }
   printf("\n");
}