Basic Computing Concepts and Programming in C: Unit 8

8 Pointers

In most of the previous program examples, arrays are declared with fixed lengths. However, in many applications, the size of an array may not be known during program writing time. A simple solution to this problem is to set an array large enough for most cases. For example, in program stack_array.h, the stack is set to an array of 100 elements. It is large enough for the practice purpose. However, in a real application, the stack size may be too small for practical problem. Another solution is to set the stack as dynamic data so that the size of stack can be adjusted as needed. In this chapter, we will study dynamic data allocation and its language support in C programming language.

8.1 Dynamic Memory Schemes

In most programming languages, program data are stored in computer memory in three methods: static allocation, stack dynamic allocation, and dynamic allocation. Memory allocation of a variable data is determined by its declaration. We explain how these three memory allocation schemes are supported in C.

Static allocation: if a variable is declared as static or as a global variable, its memory space is allocated at the beginning of program execution and stays statically through the entire program execution time.
Stack dynamic allocation: if a variable is declared as a non-static local variable of a function, its memory space is allocated when the function is called and deallocated when the execution of the function is completed.
Dynamic allocation: if a variable is declared as a pointer variable, the memory space of the variable data is allocated and deallocated explicitly during program execution.

The memory space of the three memory allocation schemes are stored in different location. For static allocation, the memory space is a chunk of memory called static memory. For stack dynamic allocation, the memory space is in the run-time stack. For dynamic allocation, the memory space is a chuck of memory called heap memory.

8.2 Pointers in C

A pointer is a memory location whose content is another memory location. In C programming language, a pointer is declared using "*" symbol. For example, the following program lines declare variable a, b, and c, as character pointer, integer pointer, and floating-number pointer, respectively.

char *a;

int *b;

float *c;

The contents of memory locations bound to variables a, b, and c do not contain data values, but they are address of another memory location. To allocate memory space for variable values, library stdlib.h of C programming language provides memory functions to do the job. Program malloc_int.c shows function call malloc(sizeof(int)) that allocates the memory space of an integer and returns its location to be the content of pointer variable a. The actual parameter of malloc (Line 7) is sizeof(int) which requests four bytes of memory space.

#include <stdio.h>

#include <stdlib.h>

int main(void) {
int *a;

a = (int *) malloc(sizeof(int));

*a = 10;

printf("Address of a: 0X%X\n", &a);

printf("Content of a: 0X%X\n", a);

printf("Value of *a: %d\n", *a);

return 0;
}

Lines 10 to 12 print the address of a, the content of a, and the value of *a. The address of a and the content of a have locations that are not at contiguous memory space because the first one is stack dynamic and the second one is in the heap memory.

Address of a: 0X22FF7C
Content of a: 0X3D3800
Value of *a: 10

A pointer variable is not only pointing to the location of a data element. It is used to point to (the first data element of) an array. Program malloc_array1.c is an example of memory allocation of an array of three elements. Function call malloc(sizeof(int) * 3) allocates 12 bytes of memory space and returns the location of the first bytes.

#include <stdio.h>

#include <stdlib.h>

int main(void) {
int *a, i;

a = (int *) malloc(sizeof(int) * 3);

for (i=0; i<3; i++) a[i] = 10+i;

printf("Address of a: 0X%X\n", &a);

printf("Content of a: 0X%X\n", a);

printf("Address of &a[0]: 0X%X\n", &a[0]);

printf("Address of &a[1]: 0X%X\n", &a[1]);

printf("Address of &a[2]: 0X%X\n", &a[2]);

printf("Value of *a: %d\n", *a);

printf("Value of a[0]: %d\n", a[0]);

printf("Value of a[1]: %d\n", a[1]);

printf("Value of a[2]: %d\n", a[2]);

return 0;
}

Lines 12 to 14 shows the addresses of the three array elements and each element occupies four bytes. Lines 16 to 18 shows the program can access the memory location using an array.

Address of a: 0X22FF7C
Content of a: 0X3D3828
Address of &a[0]: 0X3D3828
Address of &a[1]: 0X3D382C
Address of &a[2]: 0X3D3830
Value of *a: 10
Value of a[0]: 10
Value of a[1]: 11
Value of a[2]: 12

Similarly, pointer variables can be used to represent multi-dimensional arrays of dynamic size. Program malloc_array2.c declares variable a as a pointer of pointer of an integer which corresponds a two dimensional array, int **a. In Line 7, function call malloc(sizeof(int*) * 3) allocates three integer pointers and returns the location of the first pointer to variable a. In Line 8, each iteration of the for-loop allocates an array of four elements, i.e., 12 bytes for each dimension of array a.

#include <stdio.h>

#include <stdlib.h>

int main(void) {
int **a, i, j;

a = (int **) malloc(sizeof(int*) * 3);

for (i=0; i<3; i++) a[i] = (int *) malloc(sizeof(int) * 4);

for (i=0; i<3; i++)

for (j=0; j<4; j++) a[i][j] = 100+10*i+j;

printf("Address of a: 0X%X\n", &a);

printf("Content of a: 0X%X\n", a);

for (i=0; i<3; i++)

printf("Address of &a[%d]: 0X%X\n", i, &a[i]);

for (i=0; i<3; i++)

for (j=0; j<4; j++)

      printf("Address of &a[%d][%d]: 0X%X\n", i, j, &a[i][j]);

for (i=0; i<3; i++)
    for (j=0; j<4; j++)
      printf("Value of a[%d][%d]: %d\n", i, j, a[i][j]);

return 0;
}

The addresses of the three pointers of the first dimension of array a are allocated at contiguous memory space, 0X3D24D0, 0X3D24D4, and 0X3D24D8. The data elements of array a[0] occupies 16 contiguous bytes, so do the data elements of arrays a[1] and a[2]. However, the first starting address of array a[1] does not immediately follow the last data element array a[0].

Address of a: 0X22FF74
Content of a: 0X3D24D0
Address of &a[0]: 0X3D24D0
Address of &a[1]: 0X3D24D4
Address of &a[2]: 0X3D24D8
Address of &a[0][0]: 0X3D24E8
Address of &a[0][1]: 0X3D24EC
Address of &a[0][2]: 0X3D24F0
Address of &a[0][3]: 0X3D24F4
Address of &a[1][0]: 0X3D2500
Address of &a[1][1]: 0X3D2504
Address of &a[1][2]: 0X3D2508
Address of &a[1][3]: 0X3D250C
Address of &a[2][0]: 0X3D2518
Address of &a[2][1]: 0X3D251C
Address of &a[2][2]: 0X3D2520
Address of &a[2][3]: 0X3D2524
Value of a[0][0]: 100
Value of a[0][1]: 101
Value of a[0][2]: 102
Value of a[0][3]: 103
Value of a[1][0]: 110
Value of a[1][1]: 111
Value of a[1][2]: 112
Value of a[1][3]: 113
Value of a[2][0]: 120
Value of a[2][1]: 121
Value of a[2][2]: 122
Value of a[2][3]: 123

Note that the address of a one-dimensional integer array can be declared as an integer int *a and int a[]. Both declarations are equivalent.

8.3 String Operations

Recall that a string is an array of characters. If the maximum length of character strings in an application is known, the programmer may declared character arrays of maximum length to represent strings in the application program. However, a program may cause errors when the length of a string may grow arbitrarily. The following program example string_overflow.c will cause incorrect content of string A if the length of string B is greater than or equal to 8.

#include <stdio.h>

#include <stdlib.h>

int main(void) {

char A[8], B[8];

printf("Enter string A: ");
scanf("%s", A);
printf("Enter string B: ");
scanf("%s", B);

printf("\nAddress of A: %X\n", &A);
printf("Address of B: %X\n\n", &B);
printf("String A: %s\n", A);
printf("String B: %s\n", B);

return 0;
}

Strings A and B are declared as arrays of eight characters. The location of string A is located right after the location of string of B. If the input data of B is more than seven (8 - 1) characters, it will occupy the memory space of string A and destroys the content of string A.

Enter string A: abcde
Enter string B: 123456789012345

Address of A: 22FF78
Address of B: 22FF70

String A: 9012345
String B: 123456789012345

In fact, the string problem of overflow is not avoidable if function scanf() is uses. To solve this problem, function getchar() must be used to read characters of a string one by one and dynamically allocate the memory space of a string variable.

C program language provides a set of string operation library routines, <string.h>. Some frequently used string operations are listed below:

int strlen(const char *str);	Returns the length of string str.
int strcmp(const char str1`, const char `str2`);`	Compares two strings str1 and str2. Returns 0 if str1 and str2 are same strings; returns a negative number, if str1 is lexical graphically less than str2; returns a positive number, if str1 is lexical graphically greater than str2.
char strcpy(char* str1, const* char *str2);	Copies string str2 to str1.
char strcat(char* str1, const* char *str2);	Appends the string pointed to by str2 to the end of the string pointed to by str1. The terminating null character of str1 is overwritten. Copying stops once the terminating null character of str2 is copied. If overlapping occurs, the result is undefined.
char strpbrk(const* char str1, const* char *str2);	Finds the first character in string str1 that matches any character specified in str2. A pointer to the location of this character is returned. A NULL pointer is returned if no character in str2 exists in str1.
int strspn(const char str1, const* char *str2);	Finds the first sequence of characters in string str1 that contains any character specified in str2. Returns the length of this first sequence of characters found that match with str2.
char strstr(const* char str1, const* char *str2);	Finds the first occurrence of the entire string str2 (not including the terminating null character) which appears in the string str1. Returns a pointer to the first occurrence of str2 in str1. If no match was found, then a null pointer is returned. If str2 points to a string of zero length, then the argument str1 is returned.

The other strings functions are referred to <string.h>.

The lexical graphical order in string comparison operation, strcmp(), is like the order of vocabularies in a dictionary. For example, in program string_compare.c, string "book" is listed before string "desk" in a dictionary. That is, "book" is lexically graphic less than "desk". Hence, strcmp("book", "desk") returns value -1. In Line 8, "abc" and "abc" are identical strings such that strcmp("abc", "abc") returns 0.

#include <stdio.h>
#include <string.h>

int main(void) {

printf("strcmp(\"book\", \"desk\") returns %d\n", strcmp("book", "desk"));
printf("strcmp(\"abc\", \"abc\") returns %d\n", strcmp("abc", "abc"));
printf("strcmp(\"xyz\", \"XYZ\") returns %d\n", strcmp("xyz", "XYZ"));
printf("strcmp(\"abc\", \"XYZ\") returns %d\n", strcmp("abc", "XYZ"));

return 0;
}

Output of program string_compare.c is as the following.

strcmp("book", "desk") returns -1
strcmp("abc", "abc") returns 0
strcmp("xyz", "XYZ") returns 1

strcmp("abc", "XYZ") returns 1

8.4 Linked Lists

A linked list is a collection of nodes that are connected linearly. A node consists of some data fields and one or more links connected to other nodes. If the nodes of a linked list has only one link, the link list is called single-linked list. The following figure shows a single-linked list with the entry point first.

In programming languages, a linked list is defined as a recursive data type, also called self-referential data type. That is, a nodes of a linked list is declared as a structure of data fields and a pointer to a node. The declaration of a linked list of C programming language is shown in stack_list.h. The linked list is used in this example to implement stack operations. In Lines 1 to 4, a node is declared as struct Node with two fields: an integer data field, int elem, and a self-referential pointer struct Node* next. Data type nodePtr is decalred as a pointer of a node (Line 6) and so is stack (Line 7). Lines 9 to 15 are the function heads of stack operations.

struct Node {
int elem;
struct Node* next;
};

typedef struct Node* nodePtr;
typedef nodePtr stack;

void createStack(stack*);
int isEmpty(stack*);
void push(stack*, int);
int pop(stack*);
int top(stack*);
void printStack(stack*);

The implementation of the stack data type with a linked list, stack_list.c uses dynamic memory allocation to allocate and deallocate nodes. All functions accept a pointer to stack as (one of) their parameter(s). A variable of stack type always points to the node denoting the top of a stack. Fuction createStack() simply set the node pointer of stack to the NULL pointer, i.e., set the stack to be empty. Function isEmpty() tests if the stack is empty or not. For a linked list data structure, manipulation of node pointers must be handled carefully, according to the way an abstract data type is implemented. Function push() first allocates the memory of a node (Line 16). This node will be the top of stack s after the push operation (Line 19). The data pushed to stack is stored in the elem filed of the top node (Line 17) and the original top node is stored in the next field of the new top node (Line 18). Function pop() will removes the top node it stack s is not empty. It first stores the current top node to a temporary node pointer. The elem field of the top node is stored (Line 28) and returned (Line 31). The next filed becomes the new top node of stack s (Line 29). When a node is popped from, its memory space is released so it can be reused again (Line 30). Function top() simple returns the elem field, if stack s is not empty (Lines 39 to 45). Function printStack() prints all data elements from the top node to the bottom node.

#include <stdio.h>

#include <stdlib.h>

#include "stack_list.h"

void createStack(stack *s) {

*s = NULL;

}

int isEmpty(stack s) {

return (s == NULL);

}

void push(stack *s, int e)

nodePtr temp;

temp = (nodePtr) malloc(sizeof(struct Node));

temp->elem = e;

temp->next = *s;

*s = temp;

}

int pop(stack *s) {

nodePtr temp;

int e;

if (*s != NULL) {

temp = *s;

e = temp->elem;

*s = temp->next;

free(temp);

return e;

}

else {

printf("Stack underflow.\n");

return -9999;

}

int top (stack s) {

if (s != NULL) return s->elem;

else {

printf("Stack empty.\n");

return -9999;

}

void printStack(stack s) {

nodePtr temp = s;

while (temp != NULL) {

printf("%4d", temp->elem);

temp = temp->next;

}

The stack abstract data type is used in the following application program stack_list_reverse.c. The program uses two stack variables s and t which are implemented as singly linked lists. Lines 9 and 10 initialize the two stacks to the empty stack. The number of elements count is input by the user. Then, a number of integers, between 0 and 99 are generated randomly and pushed to stack s (Line 14). The elements of s are then output in Line 16. Line 19 fills the elements of stack t which is in the reversed order of s. However, during the operation operation, all elements of stack s are removed from s. The elements of stack t are output in Line 21.

#include <stdio.h>
#include <stdlib.h>
#include "stack_list.h"

int main(void) {
stack s, t;
int count, i;

createStack(&s);
createStack(&t);

printf("Enter the number of elements: ");
scanf("%d", &count);
for (i = 0; i < count; i++) push(&s, rand() % 100);
printf("\nPrint stack s (from top to bottom):\n");
printStack(&s);
printf("\n");

while (!isEmpty(&s)) push(&t, pop(&s));
printf("\nPrint stack t (from top to bottom):\n");
printStack(&t);
printf("\n");
　

return 0;
}

The following is an example run of stack_list_reverse.c:

Enter the number of elements: 25

Print stack s (from top to bottom):
92 53   2   4 91 36 27 42 95 91 61   2   7 81 45   5 64 62 58 78 24
69   0 34 67 41

Print stack t (from top to bottom):
41 67 34   0 69 24 78 58 62 64   5 45 81 27 61 91 95 42 27 36 91

4 2 53 92

Suppose a sorted list is declared as below:

struct Node {
int elem;
struct Node* next;
};

typedef struct Node* nodePtr;
typedef nodePtr list;

In many other applications of sorted single-linked list, three node operations search, insertion, and deletion are shown as below:

To search a node of a given value key to list L. The while loop continues when element key is not found and it is not that case that a node with data value greater than key or the end of list L is not reached. In the body of the loop, if the data value elem equals to key, then key is found (Line 6); if the data value of node current is greater than key, set current to NULL (Line 7); otherwise, move the next node (Line 8). Finally, the location of current or NULL is returned (Line 10)

nodePtr search_list(list L, int key) {
nodePtr current=L;
int found=0;

while (!found && current!=NULL) {
    if (current->elem==key) found = 1;
    else if (current->elem>key) current = NULL;
    else current = current->next;
}
return current;
}

To insert a node of a given value key to list L. To insert a node, we first check whether the inserted node becomes the first node of list or not, i.e., in the case the current list is empty or the first node has data value greater than or equal to key (Line 5). If the inserted node becomes the first node list, then create a newnode, store value key to the elem field of newnode and location of the current node to the next filed of newnode, and finally set newnode as the first node the list (Lines 5 to 8). Otherwise, keep search for the proper node where the newnode of value key will be placed (Lines 11 to 12). When the while loop exits, node newnode is created and is inserted to the the node following node current (Lines 14 to 17). The insertion operation is to update two pointers as in Lines 16 and 17.

void insert_list(list* LPtr, int key) {
nodePtr current=*LPtr, newnode;

if (current==NULL || current->elem>=key) {
    newnode = (nodePtr) malloc(sizeof(struct Node));
    newnode->elem = key;
    newnode->next = current;
    *LPtr = newnode;
}
else {
    while (current->next!=NULL && current->next->elem<key) {
      current = current->next;
    }
    newnode = (nodePtr) malloc(sizeof(struct Node));
    newnode->elem = key;
    newnode->next = current->next;
    current->next = newnode;
}
}

To delete a node of a given value key from list L. The deletion operation fails and returns false, i.e., 0, if the list is empty or the data value of the first node is greater than key (Lines 5 to 7). If the data value of the first node equals to key, the first node is deleted and the second node becomes the first node list (Lines 8 to 12). Otherwise, we search for a node of data value key for the deletion operation. In the search operation node current and its previous node are traced (Lines 14 and 15). During the search process, if the data value of node current is greater than key or it reaches the end of list, the deletion operation fails and returns false (Line 17). If the data value of current equals to key, node current is deleted. The deletion is simply store the location of the next field of node current to the next field of its previous node and the memory space of node current is released (Lines 18 to 22). In Lines 23 and 24, nodes current and previous are updated for the next search step.

int delete_list(list* LPtr, int key) {
nodePtr current=*LPtr, previous;
int cont = 1;

if (current==NULL || current->elem>key) {
return 0;
}
else if (current->elem==key) {
*LPtr = current->next;

    free(current);
    return 1;
}
else {
    previous = current;
    current = current->next;
    while (1) {
      if (current==NULL || current->elem>key) return 0;
      else if (current->elem==key) {
        previous->next = current->next;
        free(current);
        return 1;
      }
      else {
        previous = current;
        current = current->next;
      }
    }
}
}

A double-linked list has two links of each node. A node consists of some data fields and two links prev pointing to the node prior to it and next pointing to node following it. Usually, a doubly-linked list has two entry pointers head and tail. The head node points to the first node of the linked list, i.e., the node with no next node. The tail node points to the last node of the linked list, i.e., the node with no previous node.

The following program segment is the header of a double-linked list. The nodes of a doubly-linked list can be declared as struct Node which has an integer data field elem and two self-referential pointers prev and next. Data type nodePtr is a node pointer. Suppose a double-linked list is used to implement a queue. We can define a structure data type Queue with two fields head and tail. When a queue is created both head and tail are set to NULL. When the first node is enqueued to a queue, both head and tail must be set to the location of that node. When the last node is dequeued from a queue, both head and tail must be set to NULL.

struct Node {
int elem;
struct Node* prev;
struct Node* next;
};

typedef struct Node* nodePtr;
typedef struct queue {
nodePtr head;
nodePtr tail;
} Queue;

Similarly, a circularly double-linked list has two links of each node. A node consists of some data fields and two links prev pointing to the node prior to it and next pointing to node following it. Usually, a doubly linked list has one entry pointers frist denoting the first node of the list. The prev pointer will point to the "last" node of the list.

The following program segment is the header of a circularly linked list. It almost identical to doubly linked list. Only one entry pointer first is declared and set to NULL.

struct Node {
int elem;
struct Node* prev;
struct Node* next;
};

typedef struct Node* nodePtr;

nodePtr first=NULL;

Table of Contents

Previous: 7. Subprograms

Next: 9. Input/Output with Files