Sunday, May 12

File Handling in C Language

In this section, we will discuss about files which are very important for storing information permanently. We store information in files for many purposes, like data processing by our programs.

What is a File?

Abstractly, a file is a collection of bytes stored on a secondary storage device, which is generally a disk of some kind. The collection of bytes may be interpreted, for example, as characters, words, lines, paragraphs and pages from a textual document; fields and records belonging to a database; or pixels from a graphical image. The meaning attached to a particular file is determined entirely by the data structures and operations used by a program to process the file. It is conceivable (and it sometimes happens) that a graphics file will be read and displayed by a program designed to process textual data. The result is that no meaningful output occurs (probably) and this is to be expected. A file is simply a machine decipherable storage media where programs and data are stored for machine usage.
Essentially there are two kinds of files that programmers deal with text files and binary files. These two classes of files will be discussed in the following sections.

ASCII Text files

A text file can be a stream of characters that a computer can process sequentially. It is not only processed sequentially but only in forward direction. For this reason a text file is usually opened for only one kind of operation (reading, writing, or appending) at any given time.
Similarly, since text files only process characters, they can only read or write data one character at a time. (In C Programming Language, Functions are provided that deal with lines of text, but these still essentially process data one character at a time.) A text stream in C is a special kind of file. Depending on the requirements of the operating system, newline characters may be converted to or from carriage-return/linefeed combinations depending on whether data is being written to, or read from, the file. Other character conversions may also occur to satisfy the storage requirements of the operating system. These translations occur transparently and they occur because the programmer has signalled the intention to process a text file.

Binary files

A binary file is no different to a text file. It is a collection of bytes. In C Programming Language a byte and a character are equivalent. Hence a binary file is also referred to as a character stream, but there are two essential differences.
  1. No special processing of the data occurs and each byte of data is transferred to or from the disk unprocessed.
  2. C Programming Language places no constructs on the file, and it may be read from, or written to, in any manner chosen by the programmer.
Binary files can be either processed sequentially or, depending on the needs of the application, they can be processed using random access techniques. In C Programming Language, processing a file using random access techniques involves moving the current file position to an appropriate place in the file before reading or writing data. This indicates a second characteristic of binary files.
They a generally processed using read and write operations simultaneously.
For example, a database file will be created and processed as a binary file. A record update operation will involve locating the appropriate record, reading the record into memory, modifying it in some way, and finally writing the record back to disk at its appropriate location in the file. These kinds of operations are common to many binary files, but are rarely found in applications that process text files.

Creating a file and output some data

In order to create files we have to learn about File I/O i.e. how to write data into a file and how to read data from a file. We will start this section with an example of writing data to a file. We begin as before with the include statement for stdio.h, then define some variables for use in the example including a rather strange looking new type.
/* Program to create a file and write some data the file */
#include <stdio.h>
#include <stdio.h>
main( )
{
     FILE *fp;
     char stuff[25];
     int index;
     fp = fopen("TENLINES.TXT","w"); /* open for writing */
     strcpy(stuff,"This is an example line.");
     for (index = 1; index <= 10; index++)
     	fprintf(fp,"%s Line number %d\n", stuff, index);
     fclose(fp); /* close the file before ending program */
}
The type FILE is used for a file variable and is defined in the stdio.h file. It is used to define a file pointer for use in file operations. Before we can write to a file, we must open it. What this really means is that we must tell the system that we want to write to a file and what the file name is. We do this with the fopen() function illustrated in the first line of the program. The file pointer, fp in our case, points to the file and two arguments are required in the parentheses, the file name first, followed by the file type.
The file name is any valid DOS file name, and can be expressed in upper or lower case letters, or even mixed if you so desire. It is enclosed in double quotes. For this example we have chosen the name TENLINES.TXT. This file should not exist on your disk at this time. If you have a file with this name, you should change its name or move it because when we execute this program, its contents will be erased. If you don’t have a file by this name, that is good because we will create one and put some data into it. You are permitted to include a directory with the file name.The directory must, of course, be a valid directory otherwise an error will occur. Also, because of the way C handles literal strings, the directory separation character ‘\’ must be written twice. For example, if the file is to be stored in the \PROJECTS sub directory then the file name should be entered as “\\PROJECTS\\TENLINES.TXT”. The second parameter is the file attribute and can be any of three letters, r, w, or a, and must be lower case.

Reading (r)

When an r is used, the file is opened for reading, a w is used to indicate a file to be used for writing, and an a indicates that you desire to append additional data to the data already in an existing file. Most C compilers have other file attributes available; check your Reference Manual for details. Using the r indicates that the file is assumed to be a text file. Opening a file for reading requires that the file already exist. If it does not exist, the file pointer will be set to NULL and can be checked by the program.
Here is a small program that reads a file and display its contents on screen.
/* Program to display the contents of a file on screen */
#include <stdio.h>
void main()
{
   FILE *fopen(), *fp;
   int c;
   fp = fopen("prog.c","r");
   c = getc(fp) ;
   while (c!= EOF)
   {
   		putchar(c);
		c = getc(fp);
   }
   fclose(fp);
}

Writing (w)

When a file is opened for writing, it will be created if it does not already exist and it will be reset if it does, resulting in the deletion of any data already there. Using the w indicates that the file is assumed to be a text file.
Here is the program to create a file and write some data into the file.
#include <stdio.h>
int main()
{
 FILE *fp;
 file = fopen("file.txt","w");
 /*Create a file and add text*/
 fprintf(fp,"%s","This is just an example :)"); /*writes data to the file*/
 fclose(fp); /*done!*/
 return 0;
}

Appending (a)

When a file is opened for appending, it will be created if it does not already exist and it will be initially empty. If it does exist, the data input point will be positioned at the end of the present data so that any new data will be added to any data that already exists in the file. Using the a indicates that the file is assumed to be a text file.
Here is a program that will add text to a file which already exists and there is some text in the file.
#include <stdio.h>
int main()
{
    FILE *fp
    file = fopen("file.txt","a");
    fprintf(fp,"%s","This is just an example :)"); /*append some text*/
    fclose(fp);
    return 0;
}

Outputting to the file

The job of actually outputting to the file is nearly identical to the outputting we have already done to the standard output device. The only real differences are the new function names and the addition of the file pointer as one of the function arguments. In the example program, fprintf replaces our familiar printf function name, and the file pointer defined earlier is the first argument within the parentheses. The remainder of the statement looks like, and in fact is identical to, the printf statement.

Closing a file

To close a file you simply use the function fclose with the file pointer in the parentheses. Actually, in this simple program, it is not necessary to close the file because the system will close all open files before returning to DOS, but it is good programming practice for you to close all files in spite of the fact that they will be closed automatically, because that would act as a reminder to you of what files are open at the end of each program.
You can open a file for writing, close it, and reopen it for reading, then close it, and open it again for appending, etc. Each time you open it, you could use the same file pointer, or you could use a different one. The file pointer is simply a tool that you use to point to a file and you decide what file it will point to. Compile and run this program. When you run it, you will not get any output to the monitor because it doesn’t generate any. After running it, look at your directory for a file named TENLINES.TXT and type it; that is where your output will be. Compare the output with that specified in the program; they should agree! Do not erase the file named TENLINES.TXT yet; we will use it in
some of the other examples in this section.
Reading from a text file
Now for our first program that reads from a file. This program begins with the familiar include, some data definitions, and the file opening statement which should require no explanation except for the fact that an r is used here because we want to read it.
#include <stdio.h>
   main( )
   {
     FILE *fp;
     char c;
     funny = fopen("TENLINES.TXT", "r");
     if (fp == NULL)
		printf("File doesn't exist\n");
     else {
      do {
       c = getc(fp); /* get one character from the file
       */
         putchar(c); /* display it on the monitor
       */
       } while (c != EOF); /* repeat until EOF (end of file)
     */
     }
    fclose(fp);
   }
In this program we check to see that the file exists, and if it does, we execute the main body of the program. If it doesn’t, we print a message and quit. If the file does not exist, the system will set the pointer equal to NULL which we can test. The main body of the program is one do while loop in which a single character is read from the file and output to the monitor until an EOF (end of file) is detected from the input file. The file is then closed and the program is terminated. At this point, we have the potential for one of the most common and most perplexing problems of programming in C. The variable returned from the getc function is a character, so we can use a char variable for this purpose. There is a problem that could develop here if we happened to use an unsigned char however, because C usually returns a minus one for an EOF – which an unsigned char type variable is not
capable of containing. An unsigned char type variable can only have the values of zero to 255, so it will return a 255 for a minus one in C. This is a very frustrating problem to try to find. The program can never find the EOF and will therefore never terminate the loop. This is easy to prevent: always have a char or int type variable for use in returning an EOF. There is another problem with this program but we will worry about it when we get to the next program and solve it with the one following that.
After you compile and run this program and are satisfied with the results, it would be a good exercise to change the name of TENLINES.TXT and run the program again to see that the NULL test actually works as stated. Be sure to change the name back because we are still not finished with TENLINES.TXT.

File Handling

In C++ we say data flows as streams into and out of programs. There are different kinds of streams of data flow for input and output. Each stream is associated with a class, which contains member functions and definitions for dealing with that particular kind of flow. For example, the if stream class represents the input disc files,. Thus each file in C++ is an object of a particular stream class.

The stream class hierarchy

The stream classes are arranged in a rather complex hierarchy. You do not need to understand this hierarchy in detail to program basic file I/O, but a brief overview may be helpful. We have already made extensive use of some of these classes. The extraction operator >> is a member of istream class and the insertion operator is a member of ostream class. Both of these classes are derived from the ios class. The cout object is a predefined object of the ostream with assign class. It is in turn derived from ostream class. The classes used for input and output to the video display and keyboard are declared in the header file IOSTREAM.H, which we have routinely included in all our programs.

Stream classes

The ios class is the base class for the entire I/O hierarchy. It contains many constants and member functions common to input and output operations of all kinds. The istream and ostream classes are derived from ios and are dedicated to input and output respectively Their member functions perform both formatted and unformatted operations. The iostream class is derived from both istream and ostream by multiple inheritance, so that other classes can inherit both of these classes from it. The classes in which we are most interested for file I/O are ifstream for input files ofsteam for output files and fstream for files that will be used for both input and output the ifstream and ofsteam classes are declared in the FSTREAM.H. file.
The isteam class contains input functions such as
  • getline( )
  • getine( )
  • read( )
and overloaded extraction operators.
The ostream class contains functions such as
  • Put( )
  • write( )
and overloaded insertor.

Writing strings into a file

Let us now consider a program which writes strings in a file.
//program for writing a string in a file
#include<fstream.h>
void main( )
{
ofstream outfile("fl.fil");//create a file for output
outfile<<"harmlessness, truthfulness, calm"<<endl;
outfile<<"renunciation, absence of wrath and fault-finding"<<endl;
outfile<<"compassion for all, non-covetousness, gentleness, modesty"<<endl;
outfile<<"stability. vigour, forgiveness, endurance, cleanliness"<<endl;
outfile<<"absence of malice and of excessive self-esteem"<<endl;
outfile<<"these are the qualities of godmen"<<endl;
}
In the above program, we create an object called outfile, which is a member of the output file stream class. We initialise it to the filename “fl.fil”. You can think of outfile as a user-chosen logical name which is associated with the real file on disc called “fl.fil”. When any automatic object (outfile is automatic) is defined in a function, it is created in the function and automatically destroyed when the function terminates. When our main ( ) function ends, outfile goes out of scope. This automatically calls the destructor, which closes the file. It may be noticed that we do not need to close the file explicitly by any close-file command. The insertion operator << is overloaded in ofsteam and works with objects defined from ofstream. Thus, we can use it to output txt to the file. The strings are written in the file “fl. fil? in the ASCII mode. One can see it from DOS by giving the type command. The file “fl. fil” looks as shown below
harmlessness, truthfulness, calm renunciation, absence of wrath and fault-finding compassion for all, non-covetousness, gentleness, modesty stability, vigour, forgiveness, endurance, cleanliness absence of malice and of excessive self-esteem these are the qualities of godmen.

Reading strings from file

The program below illustrates the creation of an object of ifstream class for reading purpose.
//program of reading strings
#include <fstream.h> //for file functions
void main( )
{
const int max = 80; //size of buffer
char buffer[max]; //character buffer
ifstream. infile("fl.fil")- //create file for input
while (infile) //until end-of-file
{
infile.getline(buffer,max); //read a line of text
cout<<buffer
}
}
We define infile as an ifstream object to input records from the file “fl.fil”. The insertion operator does not work here. Instead, we read the text from the file, one line at a time, using the getline( ) function. The getline ( ) function reads characters until it encounters the ? \n? character. It places the resulting string in the buffer supplied as an argument. The maximum size of the buffer is given as the second argument. The contents of each line are displayed after each line is input. Our ifstream object called infile has a value that can be tested for various error conditions -one is the end-of-file. The program checks for the EOF in the while loop so that it can stop reading after the last string.

What is a buffer?

A buffer is a temporary holding area in memory which acts as an intermediary between a program and a file or other I/0 device. Information can be transferred between a buffer and a file using large chunks of data of the size most efficiently handled by devices like disc drives. Typically, devices like discs transfer information in blocks of 512 bytes or more, while program often processes information one byte at a time. The buffer helps match these two desperate rates of information transfer. On output, a program first fills the buffer and then transfers the entire block of data to a hard disc, thus clearing the buffer for the next batch of output. C++ handles input by connecting a buffered stream to a program and to its source of input. similarly, C++ handles output by connecting a buffered stream to a program and to its output target.
Using put( ) and get( ) for writing and reading characters
The put ( ) and get( ) functions are also members of ostream and istream. These are used to output and input a single character at a time. The program shown below is intended to illustrate the use of writing one character at a time in a file.
//program for writing characters
#Include <fstream.h>
#include <string.h>
void main( ){
charstr[] = "do unto others as you would be done by ;
ofstream outfile("f2.fil");
for(int i =0; i<strlen(str); i++)
outfile put(str[i]);
}
In this program, the length of the string is found by the strlen( ) function, and the characters are output using put( function in a for loop. This file is also an ASCII file.

Reading Characters

The program shown below illustrates the reading of characters from a file.
//program for reading characters of a string
#Include <fstream.h>
void main( )
{char ch;
ifstream in file("f2.fil")?
while(infile)
infile.get(ch);
cout<<ch;
}
The program uses the get( ) and continues to read until eof is reached. Each character read from the file is displayed using cout. The contents of file f2.fil created in the last program will be displayed on the screen.

Writing an object in a files

Since C++ is an object-oriented language, it is reasonable to wonder how objects can be written to and read from the disc. The program given below is intended to write an object in a file.
//program for writing objects in files
#include<fstream.h>
class employees
{
protected:
int empno;
char name[10];
char dept[5];
char desig[5];
double basic;
double deds;
public:
void getdata(void)
{coul<<endl<<"enter empno";cin>>empno;"
cout<<endl<<"enter empname";cin>>name;
cout<<endl<<"enter department ";
cin>>dept; cout<<endl<<"enter designation ";
cin>>desig; cout<<endl<<"enter basic pay ";cin>>basic:
cout<<endl<<"enter deds ";cin>>deds;
}
void main(void){
employees emp;
emp.getdata( );
ofstream outfile("f3.fil");
outfile. write((char * )&emp,sizeof(emp));
}
This program uses a class by name employees and an object by name emp. Data can be written only by the function inside the object. This program creates a binary data file by name f3.fil.The write( ) function is used for writing. The write( ) function requires two arguments, the address of the object to be written, and the size of the object in bytes. We use the size of operator to find the length of the emp object. The address of the object must be cast to type pointer to char.

Binary vs. Character files

You might have noticed that write( ) was used to output binary values, not just characters. To clarify, let us examine this further. Our emp object contained one int data member, three string data members and two double data members. The total number of bytes occupied by the data members comes to 38. It is as if write( ) took a mirror image of the bytes of information in memory and copied them directly to disc, without bothering any intervening translation or formatting. By contrast, the character based functions take some liberties with the data. for example, they expand the? \n ?character into a carriage return and a line feed before storing it to disk.

Reading object from file

The program given below is intended to read the file created in the above program.
//program f or reading data files
#include < istream.h>
class employees
{
protected:
int empno;
char name[I0]
char dept[5];
char desig[5];
double basic;
double deds;
public:
void showdata(void)
{cout<<endl<<"employeenumber: "<<empno;
cout<<endl<<"empname "<<name;
cout<<endl<<"department "<<dept;
cout<<endl<<"designation "<<desig;
cout<<endl<<"basic pay "<<basic:
cout<<endl<<"deds "<<deds;}
void main(void){
employees empl;
ifstream infile("f3.fil");
infile.read((char*)&empl, sizeof(empl));
empl.showdata( );
}
It may be noticed that both read( )and write( functions have similar argument. we must specify the address in which the disc input will be placed. We also use size of to indicate the number of bytes to be read.
The sample output looks as shown below:
employeenumber; 123
empname venkatesa
department elec
designation prog
basic pay 567.89
deds 45.76