In this section we discuss files, which are essential for storing
information permanently. We store information in files for many
purposes, such as data to be processed by our programs.
What is a File?
Abstractly, a file is a collection of bytes stored on a secondary
storage device, which is generally a disk of some kind. The collection
of bytes may be interpreted, for example, as characters, words, lines,
paragraphs and pages from a textual document; fields and records
belonging to a database; or pixels from a graphical image. The meaning
attached to a particular file is determined entirely by the data
structures and operations used by a program to process the file. It is
conceivable (and it sometimes happens) that a graphics file will be read
and displayed by a program designed to process textual data. The result
will probably not be meaningful output, and this is to be expected. A
file is simply a machine-readable storage medium in which programs and
data are stored for use by the machine.
Essentially there are two kinds of files that programmers deal with:
text files and binary files. These two classes of files are discussed
in the following sections.
ASCII Text files
A text file is a stream of characters that a computer can process
sequentially. It is usually processed not only sequentially but also
only in the forward direction. For this reason a text file is usually
opened for only one kind of operation (reading, writing, or appending)
at any given time.
Similarly, since text files hold only characters, they are read or
written one character at a time. (C provides functions that deal with
whole lines of text, but these still essentially process the data one
character at a time.) A text stream in C is a special kind of file.
Depending on the requirements of the operating system, newline
characters may be converted to or from carriage-return/linefeed
combinations, depending on whether data is being written to, or read
from, the file. Other character conversions may also occur to satisfy
the storage requirements of the operating system. These translations
occur transparently, and they occur because the programmer has
signalled the intention to process a text file.
Binary files
At the lowest level a binary file is no different from a text file: it
is a collection of bytes. In C a byte and a character are equivalent,
so a binary file is also referred to as a character stream, but there
are two essential differences.
- No special processing of the data occurs; each byte of data is transferred to or from the disk unprocessed.
- C imposes no structure on the file, and it may be read from, or written to, in any manner chosen by the programmer.
Binary files can be processed sequentially or, depending on the needs
of the application, using random access techniques. In C, processing a
file using random access techniques involves moving the current file
position to an appropriate place in the file before reading or writing
data. This indicates a second characteristic of binary files:
they are generally processed using both read and write operations on the same open file.
For example, a database file will be created and processed as a
binary file. A record update operation will involve locating the
appropriate record, reading the record into memory, modifying it in some
way, and finally writing the record back to disk at its appropriate
location in the file. These kinds of operations are common to many
binary files, but are rarely found in applications that process text
files.
Creating a file and writing some data
To create files we have to learn about file I/O, that is, how to write
data into a file and how to read data from a file. We start this
section with an example of writing data to a file. We begin as before
with the include statement for stdio.h, then define some variables for
use in the example, including a rather strange-looking new type.
    /* Program to create a file and write some data to the file */
    #include <stdio.h>
    #include <string.h>   /* for strcpy() */

    int main(void)
    {
        FILE *fp;
        char stuff[25];
        int index;

        fp = fopen("TENLINES.TXT", "w");   /* open for writing */
        if (fp == NULL)
            return 1;                      /* the file could not be created */
        strcpy(stuff, "This is an example line.");
        for (index = 1; index <= 10; index++)
            fprintf(fp, "%s Line number %d\n", stuff, index);
        fclose(fp);   /* close the file before ending program */
        return 0;
    }
The type FILE is used for a file variable and is defined in the
stdio.h header. It is used to define a file pointer for use in file
operations. Before we can write to a file, we must open it. What this
really means is that we must tell the system that we want to write to a
file and what the file name is. We do this with the fopen() function,
illustrated in the first executable statement of the program. The file
pointer, fp in our case, receives the value returned by fopen() and
from then on refers to the open file. Two arguments are required in the
parentheses: the file name first, followed by the file attribute
(mode).
The file name is any file name valid on your system (a DOS file name
in these examples), and can be expressed in upper or lower case
letters, or even mixed if you so desire. It is enclosed in double
quotes. For this example we have chosen the name TENLINES.TXT. This
file should not exist on your disk at this time. If you have a file
with this name, you should rename or move it, because when we execute
this program its contents will be erased. If you don’t have a file by
this name, that is good because we will create one and put some data
into it. You are permitted to include a directory with the file name.
The directory must, of course, be a valid directory, otherwise an error
will occur. Also, because of the way C handles literal strings, the
directory separation character '\' must be written twice. For example,
if the file is to be stored in the \PROJECTS subdirectory, the file
name should be entered as "\\PROJECTS\\TENLINES.TXT". The second
parameter is the file attribute. It is a string that begins with one of
three letters, r, w, or a, which must be lower case.
Reading (r)
When an r is used, the file is opened for reading; a w indicates a
file to be used for writing; and an a indicates that you want to append
additional data to the data already in an existing file. Other
attributes are also available, such as a trailing b for binary files
and a trailing + for both reading and writing; check your reference
manual for details. Using r alone indicates that the file is assumed to
be a text file. Opening a file for reading requires that the file
already exist. If it does not exist, fopen() returns NULL, which the
program can check for.
Here is a small program that reads a file and displays its contents on the screen.
    /* Program to display the contents of a file on screen */
    #include <stdio.h>

    int main(void)
    {
        FILE *fp;
        int c;

        fp = fopen("prog.c", "r");
        if (fp == NULL)            /* the file does not exist */
            return 1;
        c = getc(fp);
        while (c != EOF) {
            putchar(c);
            c = getc(fp);
        }
        fclose(fp);
        return 0;
    }
Writing (w)
When a file is opened for writing, it will be created if it does not
already exist and it will be reset if it does, resulting in the deletion
of any data already there. Using the w indicates that the file is
assumed to be a text file.
Here is the program to create a file and write some data into the file.
    #include <stdio.h>

    int main(void)
    {
        FILE *fp;

        fp = fopen("file.txt", "w");   /* create a file and add text */
        if (fp == NULL)
            return 1;
        fprintf(fp, "%s", "This is just an example :)");   /* writes data to the file */
        fclose(fp);   /* done! */
        return 0;
    }
Appending (a)
When a file is opened for appending, it will be created if it does
not already exist and it will be initially empty. If it does exist, the
data input point will be positioned at the end of the present data so
that any new data will be added to any data that already exists in the
file. Using the a indicates that the file is assumed to be a text file.
Here is a program that adds text to a file which already exists and already contains some text.
    #include <stdio.h>

    int main(void)
    {
        FILE *fp;

        fp = fopen("file.txt", "a");
        if (fp == NULL)
            return 1;
        fprintf(fp, "%s", "This is just an example :)");   /* append some text */
        fclose(fp);
        return 0;
    }
Outputting to the file
The job of actually outputting to the file is nearly identical to the
outputting we have already done to the standard output device. The only
real differences are the new function names and the addition of the
file pointer as one of the function arguments. In the example program,
fprintf replaces our familiar printf function name, and the file pointer
defined earlier is the first argument within the parentheses. The
remainder of the statement looks like, and in fact is identical to, the
printf statement.
Closing a file
To close a file you simply use the function fclose with the file
pointer in the parentheses. Actually, in this simple program, it is not
necessary to close the file because the system will close all open
files before returning to DOS. It is nevertheless good programming
practice to close every file yourself, because doing so acts as a
reminder of which files are open at the end of each program.
You can open a file for writing, close it, and reopen it for reading,
then close it, and open it again for appending, etc. Each time you open
it, you could use the same file pointer, or you could use a different
one. The file pointer is simply a tool that you use to point to a file
and you decide what file it will point to. Compile and run this program.
When you run it, you will not get any output to the monitor because it
doesn’t generate any. After running it, look in your directory for a
file named TENLINES.TXT and display it (for example with the DOS type
command); that is where your output will be.
Compare the output with that specified in the program; they should
agree! Do not erase the file named TENLINES.TXT yet; we will use it in
some of the other examples in this section.
Reading from a text file
Now for our first program that reads from a file. This program begins
with the familiar include, some data definitions, and the file opening
statement which should require no explanation except for the fact that
an r is used here because we want to read it.
    #include <stdio.h>

    int main(void)
    {
        FILE *fp;
        int c;   /* getc() returns an int so that EOF can be represented */

        fp = fopen("TENLINES.TXT", "r");
        if (fp == NULL)
            printf("File doesn't exist\n");
        else {
            do {
                c = getc(fp);    /* get one character from the file */
                putchar(c);      /* display it on the monitor */
            } while (c != EOF);  /* repeat until EOF (end of file) */
            fclose(fp);
        }
        return 0;
    }
In this program we check to see that the file exists, and if it does,
we execute the main body of the program. If it doesn’t, we print a
message and quit. If the file does not exist, fopen() returns NULL,
which we can test. The main body of the program is one do while loop in
which a single character is read from the file and output to the
monitor until an EOF (end of file) is detected from the input file. The
file is then closed and the program is terminated. At this point, we
have the potential for one of the most common and most perplexing
problems of programming in C. The getc function returns an int, not a
char, so that it can return every possible character value and, in
addition, the special value EOF, which is usually minus one. A problem
develops if the result is stored in an unsigned char, because an
unsigned char variable can only hold the values zero to 255, so the
minus one is stored as 255. This is a very frustrating problem to try
to find: the comparison with EOF never succeeds, so the program never
terminates the loop. A plain char is not safe either, since a
legitimate character with the value 255 can then be mistaken for EOF.
This is easy to prevent: always use an int variable to receive the
value returned by getc. There is another problem with this program but
we will worry about it when we get to the next program and solve it
with the one following that.
After you compile and run this program and are satisfied with the
results, it would be a good exercise to change the name of TENLINES.TXT
and run the program again to see that the NULL test actually works as
stated. Be sure to change the name back because we are still not
finished with TENLINES.TXT.
File Handling
In C++ we say data flows as streams into and out of programs. There
are different kinds of streams of data flow for input and output. Each
stream is associated with a class, which contains member functions and
definitions for dealing with that particular kind of flow. For example,
the ifstream class represents input disc files. Thus each file in C++
is an object of a particular stream class.
The stream class hierarchy
The stream classes are arranged in a rather complex hierarchy. You do
not need to understand this hierarchy in detail to program basic file
I/O, but a brief overview may be helpful. We have already made
extensive use of some of these classes. The extraction operator >> is a
member of the istream class and the insertion operator << is a member
of the ostream class. Both of these classes are derived from the ios
class. In older, pre-standard libraries the cout object is a predefined
object of the ostream_withassign class, which is in turn derived from
the ostream class; in standard C++ cout is simply an ostream object.
The classes used for input and output to the video display and keyboard
are declared in the header file <iostream> (IOSTREAM.H on older
compilers), which we have routinely included in all our programs.
Stream classes
The ios class is the base class for the entire I/O hierarchy. It
contains many constants and member functions common to input and output
operations of all kinds. The istream and ostream classes are derived
from ios and are dedicated to input and output respectively. Their
member functions perform both formatted and unformatted operations. The
iostream class is derived from both istream and ostream by multiple
inheritance, so that other classes can inherit both of these classes
from it. The classes in which we are most interested for file I/O are
ifstream for input files, ofstream for output files, and fstream for
files that will be used for both input and output. The ifstream and
ofstream classes are declared in the header file <fstream> (FSTREAM.H
on older compilers).
The istream class contains input functions such as
- get( )
- getline( )
- read( )
and the overloaded extraction operators.
The ostream class contains functions such as
- put( )
- write( )
and the overloaded insertion operators.
Writing strings into a file
Let us now consider a program which writes strings to a file.
    // program for writing strings to a file
    #include <fstream>
    using namespace std;

    int main()
    {
        ofstream outfile("fl.fil");   // create a file for output
        outfile << "harmlessness, truthfulness, calm" << endl;
        outfile << "renunciation, absence of wrath and fault-finding" << endl;
        outfile << "compassion for all, non-covetousness, gentleness, modesty" << endl;
        outfile << "stability, vigour, forgiveness, endurance, cleanliness" << endl;
        outfile << "absence of malice and of excessive self-esteem" << endl;
        outfile << "these are the qualities of godmen" << endl;
        return 0;
    }
In the above program, we create an object called outfile, which is an
object of the output file stream class ofstream. We initialise it with
the filename “fl.fil”. You can think of outfile as a user-chosen
logical name which is associated with the real file on disc called
“fl.fil”. When any automatic object (outfile is automatic) is defined
in a function, it is created in the function and automatically
destroyed when the function terminates. When our main( ) function ends,
outfile goes out of scope. This automatically calls the destructor,
which closes the file, so we do not need to close the file explicitly
with any close-file command. The insertion operator << is overloaded in
ostream, from which ofstream is derived, and so works with ofstream
objects. Thus, we can use it to output text to the file. The strings
are written to the file “fl.fil” in ASCII mode. One can see this from
DOS by giving the type command. The file “fl.fil” looks as shown below:
    harmlessness, truthfulness, calm
    renunciation, absence of wrath and fault-finding
    compassion for all, non-covetousness, gentleness, modesty
    stability, vigour, forgiveness, endurance, cleanliness
    absence of malice and of excessive self-esteem
    these are the qualities of godmen
Reading strings from file
The program below illustrates the creation of an object of ifstream class for reading purpose.
    // program for reading strings
    #include <fstream>    // for file functions
    #include <iostream>   // for cout
    using namespace std;

    int main()
    {
        const int max = 80;          // size of buffer
        char buffer[max];            // character buffer
        ifstream infile("fl.fil");   // open the file for input
        while (infile.getline(buffer, max))   // until end-of-file
            cout << buffer << endl;           // display the line just read
        return 0;
    }
We define infile as an ifstream object to input records from the file
“fl.fil”. The extraction operator >> is not suitable here because it
stops reading at the first whitespace character. Instead, we read the
text from the file one line at a time, using the getline( ) function.
The getline( ) function reads characters until it encounters the '\n'
character and places the resulting string in the buffer supplied as the
first argument. The maximum size of the buffer is given as the second
argument. The contents of each line are displayed after each line is
input. Our ifstream object called infile has a value that can be tested
for various error conditions, one of which is the end-of-file. The
program tests this in the while loop so that it can stop reading after
the last string.
What is a buffer?
A buffer is a temporary holding area in memory which acts as an
intermediary between a program and a file or other I/O device.
Information can be transferred between a buffer and a file in large
chunks of data of the size most efficiently handled by devices like
disc drives. Typically, devices like discs transfer information in
blocks of 512 bytes or more, while a program often processes
information one byte at a time. The buffer helps match these two
disparate rates of information transfer. On output, a program first
fills the buffer and then transfers the entire block of data to a hard
disc, thus clearing the buffer for the next batch of output. C++
handles input by connecting a buffered stream to a program and to its
source of input. Similarly, C++ handles output by connecting a buffered
stream to a program and to its output target.
Using put( ) and get( ) for writing and reading characters
The put( ) and get( ) functions are members of ostream and istream
respectively. They are used to output and input a single character at a
time. The program shown below illustrates writing one character at a
time to a file.
    // program for writing characters
    #include <fstream>
    #include <cstring>   // for strlen( )
    using namespace std;

    int main()
    {
        char str[] = "do unto others as you would be done by";
        ofstream outfile("f2.fil");
        for (size_t i = 0; i < strlen(str); i++)
            outfile.put(str[i]);   // write one character
        return 0;
    }
In this program, the length of the string is found by the strlen( )
function, and the characters are output using the put( ) function in a
for loop. This file is also an ASCII file.
Reading Characters
The program shown below illustrates the reading of characters from a file.
    // program for reading characters of a string
    #include <fstream>
    #include <iostream>   // for cout
    using namespace std;

    int main()
    {
        char ch;
        ifstream infile("f2.fil");
        while (infile.get(ch))   // stop when no character could be read
            cout << ch;
        return 0;
    }
The program uses the get( ) function and continues to read until the
end of the file is reached. Each character read from the file is
displayed using cout. The contents of the file f2.fil created in the
last program will be displayed on the screen.
Writing an object to a file
Since C++ is an object-oriented language, it is reasonable to wonder
how objects can be written to and read from the disc. The program given
below is intended to write an object in a file.
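    // program for writing objects to files
    #include <fstream>
    #include <iostream>
    #include <iomanip>   // for setw( )
    using namespace std;

    class employees
    {
    protected:
        int empno;
        char name[10];
        char dept[5];
        char desig[5];
        double basic;
        double deds;
    public:
        void getdata(void)
        {
            // setw( ) stops cin from overrunning the character arrays
            cout << endl << "enter empno ";       cin >> empno;
            cout << endl << "enter empname ";     cin >> setw(sizeof name) >> name;
            cout << endl << "enter department ";  cin >> setw(sizeof dept) >> dept;
            cout << endl << "enter designation "; cin >> setw(sizeof desig) >> desig;
            cout << endl << "enter basic pay ";   cin >> basic;
            cout << endl << "enter deds ";        cin >> deds;
        }
    };

    int main()
    {
        employees emp;
        emp.getdata();
        ofstream outfile("f3.fil", ios::binary);   // binary: no character translation
        outfile.write((char *)&emp, sizeof(emp));
        return 0;
    }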
This program uses a class named employees and an object named emp.
Because the data members are protected, they can be filled in only by a
member function of the class, here getdata( ). This program creates a
binary data file named f3.fil. The write( ) function is used for
writing. It requires two arguments: the address of the object to be
written, and the size of the object in bytes. We use the sizeof
operator to find the size of the emp object. The address of the object
must be cast to type pointer to char.
Binary vs. Character files
You might have noticed that write( ) was used to output binary
values, not just characters. To clarify, let us examine this further.
Our emp object contained one int data member, three string data members
and two double data members. With a 2-byte int and 8-byte doubles, as
on a typical 16-bit DOS compiler, the data members occupy 2 + 10 + 5 +
5 + 8 + 8 = 38 bytes; on other compilers the int may be larger and
padding may be added, so sizeof(emp) can differ. It is as if write( )
took a mirror image of the bytes of information in memory and copied
them directly to disc, without any intervening translation or
formatting. By contrast, the character-based functions take some
liberties with the data. For example, on DOS they expand the '\n'
character into a carriage return and a line feed before storing it to
disk.
Reading an object from a file
The program given below is intended to read the file created in the above program.
    // program for reading data files
    #include <fstream>
    #include <iostream>
    using namespace std;

    class employees
    {
    protected:
        int empno;
        char name[10];
        char dept[5];
        char desig[5];
        double basic;
        double deds;
    public:
        void showdata(void)
        {
            cout << endl << "employeenumber: " << empno;
            cout << endl << "empname " << name;
            cout << endl << "department " << dept;
            cout << endl << "designation " << desig;
            cout << endl << "basic pay " << basic;
            cout << endl << "deds " << deds;
        }
    };

    int main()
    {
        employees empl;
        ifstream infile("f3.fil", ios::binary);   // binary: no character translation
        infile.read((char *)&empl, sizeof(empl));
        empl.showdata();
        return 0;
    }
It may be noticed that the read( ) and write( ) functions take similar
arguments. We must specify the address at which the disc input will be
placed, and we use sizeof to indicate the number of bytes to be read.
The sample output looks as shown below:
    employeenumber: 123
    empname venkatesa
    department elec
    designation prog
    basic pay 567.89
    deds 45.76