Table of Contents

Previous: 8. Pointers

Next: A. Program Links


9  Input/Output with Files

In all of the previous programming examples, data are input from a keyboard and output to a display monitor. In most computer applications, however, data are also read from and written to a variety of other devices. Among the most frequently used are secondary storage devices such as magnetic tapes, floppy disks, hard disks, and flash memory, where data are stored as files. This chapter discusses external memory and file organization and then explains the file operations of the C programming language.

9.1  External Memory Storage

The most frequently used secondary storage device is the hard disk. A hard disk consists of a number of platters; data are stored on the platters as 0's and 1's and accessed through read/write heads.

A platter is divided into a number of concentric rings, called tracks. The data are stored in successive tracks on the surface of the disk. Each track is divided into a number of sectors, which are the smallest units of a disk. When a disk operation is performed to access a byte, all bytes of the sector containing that byte are read.

The tracks at the same position on all platters form a cylinder, so the tracks of all platters form a number of cylinders. Data on a single cylinder can be accessed without moving the read/write heads. Moving the read/write heads is the slowest motion in a disk operation.

A file is stored on a disk as a number of records, each of which is a 256-byte block. Suppose a disk is formatted with two records per sector, 40 sectors per track, 11 tracks per cylinder, and 1331 cylinders. One cylinder then holds 2×40×11 = 880 records, and the disk has a capacity of 256×880×1331 = 299,847,680 bytes, approximately 300 megabytes. If a file has 20,000 records, it occupies 20,000/(2×40×11) ≈ 22.7 cylinders. If the disk does not have that many contiguous cylinders available, the file may be spread out over a number, even hundreds, of cylinders.
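The arithmetic in this example can be checked with a short C sketch. The function names (records_per_cylinder, disk_capacity_bytes, cylinders_needed) are illustrative and not part of the original text; the geometry constants are the ones given in the example.

```c
#include <stdio.h>

/* Disk geometry taken from the example in the text. */
enum {
    RECORD_BYTES        = 256,
    RECORDS_PER_SECTOR  = 2,
    SECTORS_PER_TRACK   = 40,
    TRACKS_PER_CYLINDER = 11,
    CYLINDERS           = 1331
};

/* Number of records that fit on one cylinder. */
long records_per_cylinder(void) {
    return (long) RECORDS_PER_SECTOR * SECTORS_PER_TRACK * TRACKS_PER_CYLINDER;
}

/* Total capacity of the disk in bytes. */
long disk_capacity_bytes(void) {
    return records_per_cylinder() * RECORD_BYTES * CYLINDERS;
}

/* Number of cylinders (possibly fractional) occupied by a file of the given size. */
double cylinders_needed(long records) {
    return (double) records / records_per_cylinder();
}

/* Prints a summary of the example disk and of a file with the given record count. */
void print_disk_summary(long records) {
    printf("records per cylinder: %ld\n", records_per_cylinder());
    printf("disk capacity: %ld bytes\n", disk_capacity_bytes());
    printf("cylinders needed: %.1f\n", cylinders_needed(records));
}
```

With these values, records_per_cylinder() gives 880, disk_capacity_bytes() gives 299,847,680, and cylinders_needed(20000) is about 22.7.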

9.2 File Organization

The file system developed for MS-DOS and used in Microsoft Windows up to Windows Me is called File Allocation Table (FAT). The file system used in Windows NT, Windows 7, and Windows 8 is called New Technology File System (NTFS). Files are organized on a disk as a tree structure, called a directory tree. All the internal nodes are directories and all the leaf nodes are files. The following figure is an example of Microsoft Windows File Manager which shows part of a directory tree. The left-hand-side frame is a tree structure of directories and the right-hand-side frame contains the leaf nodes of path "E:\課程\計算機概論\Book\".

A file stored on a disk can be accessed randomly. The major file operations are opening a file, closing a file, seeking to a file location, reading data, writing data, and checking for end-of-file. Usually, a pointer is used to mark the location in a file that will be processed next. We will discuss how these file operations are supported in the C programming language.

9.3 File Operations in C

The communication of data between a program and an input/output device is abstracted as a stream, so the same file operations can be used to send data to and receive data from various devices. There are two types of data streams: text and binary. A file consisting entirely of characters, e.g., a C source program, is a text stream, and a file containing an arbitrary sequence of bits, e.g., the object code of a C program, is a binary stream. When a C program starts, three data streams already exist: standard input, standard output, and standard error. These data streams are handled by functions such as scanf(), printf(), and perror().

For data streams, the C programming language uses the data type FILE to represent the stream of the file being processed and provides the following functions, declared in <stdio.h>, to perform various operations:

FILE *fopen(const char *fname, const char *mode);

Opens the file named fname with the specified mode. The modes are:

r: read text mode,

w: write text mode (existing data of the file will be erased; a new file will be created if the named file does not exist),

a: append text mode for writing (appending text data to the end of the file; a new file will be created if the named file does not exist),

rb: read binary mode,

wb: write binary mode (existing data of the file will be erased; a new file will be created if the named file does not exist),

ab: append binary mode for writing (appending binary data to the end of the file; a new file will be created if the named file does not exist),

r+: read and write text mode,

w+: read and write text mode (existing data of the file will be erased; a new file will be created if the named file does not exist),

a+: read and write text mode (appending text data to the end of the file; a new file will be created if the named file does not exist),

 

rb+: read and write binary mode,

wb+: read and write binary mode (existing data of the file will be erased; a new file will be created if the named file does not exist),

ab+: read and write binary mode (appending binary data to the end of the file; a new file will be created if the named file does not exist).

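If a file is opened in a read and write mode (+), input and output must not directly follow each other. Between an output operation and a following input operation there must be a call to fflush, fseek, fsetpos, or rewind; between an input operation and a following output operation there must be a call to fseek, fsetpos, or rewind, unless the input operation reached the end of the file. If a file is opened successfully, a file pointer is returned; otherwise, a NULL pointer is returned.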
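As a minimal sketch of the read and write rule, the following function opens a file with mode "w+", writes a string, and reads it back through the same stream, calling rewind() between the output and the input. The function name write_then_read and its parameters are hypothetical and chosen only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Writes text to the file at path using mode "w+" and reads it back
   through the same stream into out (at most outSize - 1 characters).
   Returns the number of characters read back, or -1 on failure. */
long write_then_read(const char *path, const char *text,
                     char *out, size_t outSize) {
    FILE *fp;
    size_t len = strlen(text), n;

    if (outSize == 0) return -1;
    fp = fopen(path, "w+");          /* existing contents are erased */
    if (fp == NULL) return -1;

    if (fwrite(text, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }

    rewind(fp);                      /* needed between output and input */

    n = fread(out, 1, outSize - 1, fp);
    out[n] = '\0';
    fclose(fp);
    return (long) n;
}
```

For example, calling write_then_read("demo.txt", "hello", buf, sizeof buf) with a sufficiently large buf should return 5 and leave the string "hello" in buf.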

int fclose(FILE *fptr);

Closes the stream pointed to by fptr. Returns 0 if the stream is closed successfully; otherwise, returns EOF.

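size_t fread(void *sptr, size_t size, size_t nmemb, FILE *fptr);

Reads nmemb items of size bytes each (size×nmemb bytes in total) from stream fptr and stores them in the buffer pointed to by sptr. Returns the number of items read, which may be less than nmemb if the end of file is reached or an error occurs.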

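size_t fwrite(const void *sptr, size_t size, size_t nmemb, FILE *fptr);

Writes nmemb items of size bytes each (size×nmemb bytes in total) from the buffer pointed to by sptr to stream fptr. Returns the number of items written, which is less than nmemb only if an error occurs.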
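The following sketch shows fwrite() and fread() moving an array of integers to and from a binary file. The helper names save_ints and load_ints are assumptions made for this example; only the standard library calls come from the text above.

```c
#include <stdio.h>

/* Writes count integers from values to a binary file.
   Returns the number of integers written (0 if the file cannot be opened). */
size_t save_ints(const char *path, const int *values, size_t count) {
    FILE *fp = fopen(path, "wb");
    size_t written;

    if (fp == NULL) return 0;
    written = fwrite(values, sizeof values[0], count, fp);
    fclose(fp);
    return written;
}

/* Reads at most maxCount integers from a binary file into values.
   Returns the number of integers actually read. */
size_t load_ints(const char *path, int *values, size_t maxCount) {
    FILE *fp = fopen(path, "rb");
    size_t got;

    if (fp == NULL) return 0;
    got = fread(values, sizeof values[0], maxCount, fp);
    fclose(fp);
    return got;
}
```

Because the return values count items rather than bytes, a caller can compare them directly with the number of elements it expected. A file written this way holds the machine's own integer representation, so it is in general not portable between different kinds of computers.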

int feof(FILE *fptr);

Tests the end-of-file indicator of stream fptr. Returns a non-zero value if a previous read operation on fptr has reached the end of the file; otherwise, returns 0.

int fgetc(FILE *fptr);

Gets the next character from stream fptr and moves the stream pointer forward one position. If fgetc succeeds, the character is returned as an int. If the end of file is reached, EOF is returned. If an error occurs, the error indicator for the file stream is set and EOF is returned.

int fputc(int c, FILE *fptr);

Writes the character c to the file stream fptr and moves the stream pointer forward one position. If fputc succeeds, the character written is returned. If an error occurs, the error indicator for the file stream is set and EOF is returned.
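A common way to combine fgetc() and fputc() is to loop until fgetc() returns EOF, as in the sketch below. The function name copy_chars is hypothetical. The files are opened in binary mode here so that the returned count equals the number of bytes in the source file.

```c
#include <stdio.h>

/* Copies the file srcPath to dstPath one character at a time.
   Returns the number of characters copied, or -1 on failure. */
long copy_chars(const char *srcPath, const char *dstPath) {
    FILE *in, *out;
    int c;               /* int rather than char, so EOF can be told apart */
    long count = 0;

    in = fopen(srcPath, "rb");
    if (in == NULL) return -1;
    out = fopen(dstPath, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* Test the value returned by fgetc() itself: feof() only becomes
       non-zero after a read has already reached the end of the file. */
    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF) {
            count = -1;
            break;
        }
        count++;
    }

    fclose(in);
    if (fclose(out) == EOF) count = -1;
    return count;
}
```

Storing the result of fgetc() in an int matters: if it were stored in a char, a valid character could be mistaken for EOF, or EOF might never be detected, depending on the platform.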

int fseek(FILE *fptr, long int offset, int whence);

Sets the file position of stream fptr and returns 0 on success or a non-zero value on failure. The parameter offset is the number of bytes to move from the position given by whence. The parameter whence can be:

SEEK_SET: seeks from the beginning of the file,

SEEK_CUR: seeks from the current position of the file,

SEEK_END: seeks from the end of the file.

void rewind(FILE *fptr);

Sets the file pointer to the beginning of stream fptr.

long int ftell(FILE *fptr);

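Returns the current position of stream fptr. For a binary stream, the position is expressed in terms of the number of bytes from the beginning of the file; for a text stream, it is a value that can be passed to fseek to return to the same position. If an error occurs, -1L is returned.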
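As an illustration of fseek() and ftell(), the sketch below measures the size of a file and reads a single byte at a given offset. The names file_size and byte_at are assumptions for this example. The files are opened in binary mode so that positions are plain byte counts; seeking to SEEK_END works on common systems, although the C standard does not require binary streams to support it.

```c
#include <stdio.h>

/* Returns the size of the file in bytes, or -1 on failure. */
long file_size(const char *path) {
    FILE *fp = fopen(path, "rb");
    long size;

    if (fp == NULL) return -1;
    if (fseek(fp, 0L, SEEK_END) != 0) {   /* move to the end of the file */
        fclose(fp);
        return -1;
    }
    size = ftell(fp);                     /* end position = length in bytes */
    fclose(fp);
    return size;
}

/* Returns the byte stored at the given offset from the beginning of
   the file, or EOF if the file cannot be read at that position. */
int byte_at(const char *path, long offset) {
    FILE *fp = fopen(path, "rb");
    int c;

    if (fp == NULL) return EOF;
    c = (fseek(fp, offset, SEEK_SET) == 0) ? fgetc(fp) : EOF;
    fclose(fp);
    return c;
}
```

For a file containing the ten characters "0123456789", file_size() returns 10, byte_at(path, 3) returns '3', and byte_at(path, 10) returns EOF because offset 10 is just past the last byte.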

int fflush(FILE *fptr);

Flushes the output buffer of stream fptr. If fptr is NULL, all output buffers are flushed. Returns 0 on success; otherwise, returns EOF.

int rename(const char *old_fname, const char *new_fname);

Changes the name of a file from the old file name old_fname to the new file name new_fname. If rename succeeds, zero is returned; otherwise, a non-zero value is returned.

int remove(const char *fname);

Deletes the file named fname. If remove succeeds, zero is returned; otherwise, a non-zero value is returned.
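The sketch below exercises rename() and remove() on a scratch file. The helper names file_exists and rename_then_remove are hypothetical. Note that the behavior of rename() when a file with the new name already exists depends on the operating system, so the example assumes that no such file is present.

```c
#include <stdio.h>

/* Returns 1 if the file can be opened for reading, 0 otherwise. */
int file_exists(const char *path) {
    FILE *fp = fopen(path, "r");

    if (fp == NULL) return 0;
    fclose(fp);
    return 1;
}

/* Creates an empty file named oldName, renames it to newName,
   and then deletes it. Returns 0 on success, -1 on failure. */
int rename_then_remove(const char *oldName, const char *newName) {
    FILE *fp = fopen(oldName, "w");

    if (fp == NULL) return -1;
    fclose(fp);                                 /* close before renaming */

    if (rename(oldName, newName) != 0) return -1;
    if (remove(newName) != 0) return -1;
    return 0;
}
```

After a successful call, neither file name exists any more, which can be confirmed with file_exists().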

For other file operations, refer to http://www.acm.uiuc.edu/webmonkeys/book/c_guide/2.12.html#streams. Recall that function fopen() must be called first when a program is going to process a file and function fclose() must be called when the file is no longer used in the program. Program file_copy1.c shows the use of file operations to copy file source.txt to result.txt:

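#include <stdio.h>
#include <stdlib.h>

int main(void) {
  char *buffer;                  /* holds the contents of the input file */
  FILE *dataIn, *dataOut;        /* input and output streams */
  int fLeng;                     /* length of the input file */

  dataIn = fopen("source.txt", "r");
  if (dataIn==NULL) {
    printf("\nText file \"source.txt\" does not exist.\n\n");
    return -1;
  }

  fseek(dataIn, 0, SEEK_END);    /* move to the end of the file */
  fLeng = (int) ftell(dataIn);   /* the end position is the file length */
  fseek(dataIn, 0, SEEK_SET);    /* move back to the beginning */
  if ((buffer = (char *) malloc((fLeng + 1) * sizeof(char))) == NULL) return -1;
  fLeng = (int) fread(buffer, 1, fLeng, dataIn);  /* keep the count actually read */
  fclose(dataIn);
  buffer[fLeng] = '\0';

  printf("%s", buffer);

  if ((dataOut = fopen("result.txt", "w")) == NULL) return -1;
  fwrite(buffer, 1, fLeng, dataOut);
  fclose(dataOut);
  free(buffer);                  /* release the buffer */
  return 0;
}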

Variable buffer is a pointer to a character array which is used to store the data read from the input file source.txt (Line 5). Two FILE pointers are declared for the input file and the output file (Line 6). Variable fLeng is the length of the input file (Line 7). In Line 9, source.txt is opened as an input file with mode "r". Lines 10 to 13 check for failure of the open operation. If source.txt does not exist, the open operation fails and returns the NULL pointer. If the file is opened successfully, the file pointer is returned and assigned to variable dataIn. To read the input file, we need to determine the length of the file. In Line 15, the file position is moved to the end of the file by calling fseek() with parameter SEEK_END. Then ftell() is called to obtain the position of the end of the file (Line 16). This position is exactly the length of the file. Line 17 moves the file position back to the beginning of the file. Note that the file format may vary depending on the operating system. In a Microsoft Windows text file, a new line is stored as two characters: a carriage return '\x0D' and a line feed '\x0A'. In a Unix text file, a new line is stored as a single line feed character. The input file source.txt has been converted to a Unix text file. An editor such as Ultra Editor can be used to convert a text file from the Microsoft Windows format to the Unix format and vice versa. In Line 18, the memory space of a character array of size fLeng+1 is allocated and assigned to buffer. We allocate one byte more than the file length for the end-of-string character '\x00'. In Line 19, the fLeng bytes of data in the file pointed to by dataIn are read and stored in array buffer. After the data have been read, the file is closed in Line 20. The end-of-string character is inserted in Line 21. Line 23 simply prints the entire text string on the standard output device. In Line 25, the output file result.txt is opened with mode "w". The write operation fwrite() outputs the fLeng characters in buffer to the output file pointed to by dataOut (Line 26). Finally, the output file is closed (Line 27). The output of file_copy1.c is shown below:

Computing Concepts and Programming in C/C++
Department of Information Engineering and Computer Science
Feng Chia University
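The difference between the Microsoft Windows and Unix text formats mentioned in the description of file_copy1.c can be inspected directly. The sketch below, with the hypothetical function name count_crlf, opens a file in binary mode so that no new-line translation takes place and counts how many carriage return and line feed pairs it contains.

```c
#include <stdio.h>

/* Counts the "\r\n" pairs in a file read in binary mode.
   Returns the number of pairs, or -1 if the file cannot be opened. */
long count_crlf(const char *path) {
    FILE *fp = fopen(path, "rb");
    int c, prev = 0;
    long pairs = 0;

    if (fp == NULL) return -1;
    while ((c = fgetc(fp)) != EOF) {
        if (prev == '\r' && c == '\n') pairs++;   /* one Windows-style line end */
        prev = c;
    }
    fclose(fp);
    return pairs;
}
```

A result of zero suggests that the file uses Unix-style line ends, whereas a file in the Microsoft Windows format typically yields one pair per line.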

Another version of file copy is shown in program file_copy2.c. The difference between file_copy1.c and file_copy2.c is the way data are input and output. In file_copy2.c, while loops read and write one character per iteration using fgetc() and fputc() (Lines 20 and 28):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
  char *buffer;
  FILE *dataIn, *dataOut;
  int fLeng, i, c;               /* c is an int so that EOF can be detected */

  dataIn = fopen("source.txt", "r");
  if (dataIn==NULL) {
    printf("\nText file \"source.txt\" does not exist.\n\n");
    return -1;
  }

  fseek(dataIn, 0, SEEK_END);
  fLeng = (int) ftell(dataIn);
  fseek(dataIn, 0, SEEK_SET);
  if ((buffer = (char *) malloc((fLeng + 1) * sizeof(char))) == NULL) return -1;
  i = 0;
  while (i < fLeng && (c = fgetc(dataIn)) != EOF) buffer[i++] = (char) c;
  fclose(dataIn);
  buffer[i] = '\0';              /* terminate after the characters actually read */

  printf("%s", buffer);

  if ((dataOut = fopen("result.txt", "w")) == NULL) return -1;
  i = 0;
  while (buffer[i]!='\0') fputc(buffer[i++], dataOut);
  fclose(dataOut);
  free(buffer);
  return 0;
}

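The C programming language also provides formatted input and output functions for file operations.

int fprintf(FILE *fptr, const char *format, ...);

Writes the data specified by the format parameters to the file stream pointed to by fptr. Returns the number of characters written, or a negative value if an error occurs.

int fscanf(FILE *fptr, const char *format, ...);

Reads the data specified by the format parameters from the file stream pointed to by fptr. Returns the number of input items successfully assigned, or EOF if the end of file is reached before any item is read.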
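A minimal sketch of a formatted round trip follows: one function writes a name and a score with fprintf(), and another reads them back with fscanf(). The function names save_score and load_score and the one-line file layout are assumptions made for this example.

```c
#include <stdio.h>

/* Writes one "name score" line to the file. Returns 0 on success, -1 on failure. */
int save_score(const char *path, const char *name, int score) {
    FILE *fp = fopen(path, "w");

    if (fp == NULL) return -1;
    if (fprintf(fp, "%s %d\n", name, score) < 0) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Reads one "name score" line from the file. The array name must have
   room for at least 20 characters. Returns 0 on success, -1 on failure. */
int load_score(const char *path, char *name, int *score) {
    FILE *fp = fopen(path, "r");
    int fields;

    if (fp == NULL) return -1;
    /* %19s limits the name to 19 characters plus the terminating '\0'. */
    fields = fscanf(fp, "%19s %d", name, score);
    fclose(fp);
    return fields == 2 ? 0 : -1;   /* both items must have been assigned */
}
```

Checking the value returned by fscanf() against the number of expected items is a simple way to detect a missing or malformed record. Because %s stops at white space, this layout only works for names without spaces.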

Functions fprintf() and fscanf() are similar to the standard input/output functions printf() and scanf(), except that they require an additional parameter, the file stream pointer fptr. The format specifiers are the same as those in printf() and scanf(). Finally, we show program average_class_sort.c, which reads students' scores from file student_score.txt, sorts the students by a key chosen by the user, and prints the scores together with their averages:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

typedef struct {
  char  id[10];
  char  name[20];
  int   score[6];
  float average;
} student;

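/* Reads student records from student_score.txt and returns how many were read. */
int readStudentScore(student *students) {
  FILE *fptr;
  int count = 0, i, complete;

  fptr = fopen("student_score.txt", "r");
  if (fptr == NULL) {
    perror("student_score.txt");
    exit(EXIT_FAILURE);
  }
  /* Loop on the result of fscanf() rather than on feof(): feof() only
     becomes non-zero after a read has already failed. The limit of 20
     matches the size of the array declared in main(). */
  while (count < 20 &&
         fscanf(fptr, "%9s %19s", students[count].id, students[count].name) == 2) {
    complete = 1;
    for (i = 0; i < 6; i++) {
      if (fscanf(fptr, "%d", &students[count].score[i]) != 1) complete = 0;
    }
    if (!complete) break;        /* incomplete record: stop reading */
    count++;
  }
  fclose(fptr);
  return count;
}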

void printHead (char **course) {
  int i;

  printf("\nID Name ");
  for (i = 0; i < 6; i++) {
    printf("%s ", course[i]);
  }
  printf(" Average\n\n");
}

void printStudent(student *stu, char **course, int *creditHour) {
  int i, totalScore = 0, totalCreditHour = 0;

  printf("%s ", stu->id);
  printf("%s", stu->name);
  if (strlen(stu->name) == 4) printf("     ");
  else if (strlen(stu->name) == 6) printf("   ");
  else if (strlen(stu->name) == 8) printf(" ");
  for (i = 0; i < 6; i++) {
    printf("%3d ", stu->score[i]);
    totalScore += stu->score[i] * creditHour[i];
    totalCreditHour += creditHour[i];
  }
  stu->average = (float) totalScore / totalCreditHour;

  printf(" %5.2f\n", stu->average);
}

void printAverage(student *students, int count) {
  int totalScore, i, j;
  float totalAverage;

  printf("\n Average ");
  for (i = 0; i < 6; i++) {
    totalScore = 0;
    for (j = 0; j < count; j++) {
      totalScore += students[j].score[i];
    }
    printf("%5.2f ", (float) totalScore / count);
  }
  totalAverage = 0;
  for (j = 0; j < count; j++) {
    totalAverage += students[j].average;
  }
  printf("%5.2f\n\n", (float) totalAverage / count);
}

void copyStudent(student *stu1, student *stu2) {
  int i;

  strcpy(stu1->id, stu2->id);
  strcpy(stu1->name, stu2->name);
  for (i = 0; i < 6; i++) stu1->score[i] = stu2->score[i];
}

void cs(student *stu1, student *stu2, int key) {
  student temp;
  int swap = 0, i;
  int total1 = 0, total2 = 0;

  if (key == 0) {
    if (strcmp(stu1->id, stu2->id) > 0) swap = 1;
  }
  else if (key > 0 && key < 7) {
    if (stu1->score[key - 1] > stu2->score[key - 1]) swap = 1;
  }
  else {
    for (i = 0; i < 6; i++) total1 += stu1->score[i];
    for (i = 0; i < 6; i++) total2 += stu2->score[i];
    if (total1 > total2) swap = 1;
  }

  if (swap) {
    copyStudent(&temp, stu1);
    copyStudent(stu1, stu2);
    copyStudent(stu2, &temp);
  }
}

void sortStudents(student *stu, int cnt, int key) {
  int i, j;

  for (i = cnt; i > 0; i--)
    for (j = 1; j < i; j++) cs(&stu[j-1], &stu[j], key);
}

int main(void) {

  student students[20];
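  char *course[6] = {"Intro CS",   /* Introduction to Computer Science */
                     "CS Lab  ",   /* Introduction to Computer Science Lab */
                     "Calculus",
                     "Lin Alg ",   /* Linear Algebra */
                     "Physics ",   /* General Physics */
                     "Phys Lab"};  /* General Physics Lab */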
  int creditHour[6] = {3, 1, 4, 3, 3, 1};
  int totalStudent, i, key;

  totalStudent = readStudentScore(students);

  do {
    printf("\n 0: ID\n");
    for (i = 1; i <= 6; i++) printf(" %d: %s\n", i, course[i - 1]);
    printf(" 7: Average\n\n");
    printf("Enter a sorting key (0 to 7): ");
    scanf("%d", &key);
  } while (key < 0 || key > 7);

  sortStudents(students, totalStudent, key);

  printHead(course);
  for (i = 0; i < totalStudent; i++) {
    printStudent(&students[i], course, creditHour);
  }
  printAverage(students, totalStudent);

  return 0;
}

