Knowlet

Unit 14: File Structure

1. Categories of Files

A file is a collection of related data stored on a secondary storage device. In C, files are treated as a stream of bytes. They are primarily categorized based on how the data is stored and accessed.

  • Sequential Access Files: Data must be accessed in the order it was written. To read the 10th record, you must pass through the first nine.
  • Random (Direct) Access Files: Any position in the file can be reached directly, without reading the data before it, by moving the file position indicator with functions such as fseek(), ftell(), and rewind().
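
The difference between the two access styles can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names write_numbers, read_sequential, and read_direct are hypothetical, and the file is assumed to hold fixed-size int records so that the byte offset of record n is n * sizeof(int).

```c
#include <stdio.h>

/* Write the integers 0..count-1 to a binary file, one after another.
   Returns 0 on success, -1 on failure. */
int write_numbers(const char *path, int count)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    for (int i = 0; i < count; i++) {
        if (fwrite(&i, sizeof i, 1, fp) != 1) {
            fclose(fp);
            return -1;
        }
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Sequential access: read every record from the start until the
   record at position `index` (0-based) has been read. */
int read_sequential(const char *path, int index, int *out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    int value = 0;
    int ok = 1;
    for (int i = 0; i <= index && ok; i++)
        ok = (fread(&value, sizeof value, 1, fp) == 1);
    fclose(fp);
    if (!ok)
        return -1;
    *out = value;
    return 0;
}

/* Random access: jump straight to the record with fseek(),
   then read only that one record. */
int read_direct(const char *path, int index, int *out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    int ok = fseek(fp, (long)index * (long)sizeof(int), SEEK_SET) == 0
             && fread(out, sizeof *out, 1, fp) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}
```

Both readers return the same value for a given index; read_sequential performs index + 1 reads to get there, while read_direct performs one seek and one read.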

2. Opening and Closing Files

A file must be opened before any operation is performed on it, and it must be closed once the work is done to free resources.

FILE Pointer

In C, all file operations use a special structure called FILE defined in stdio.h. We declare a pointer of this type to track the file.

FILE *fp;

fopen() Function

Used to open a file. On success it returns a pointer to a FILE object that represents the open stream; on failure it returns NULL.

fp = fopen("filename.txt", "mode");

fclose() Function

Used to close the file. It ensures all data is properly written (flushed) from the buffer to the disk.

fclose(fp);
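
Put together, the open, check, use, close pattern might look like the sketch below. The function name save_line is hypothetical; the point is only the order of the calls and the checks on their return values.

```c
#include <stdio.h>

/* Create (or overwrite) a text file containing one line.
   Returns 0 on success, -1 if the file could not be opened,
   written, or closed. */
int save_line(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL) {
        perror("fopen");          /* report why the open failed */
        return -1;
    }

    int status = 0;
    if (fputs(text, fp) == EOF || fputc('\n', fp) == EOF)
        status = -1;

    /* fclose() flushes the buffer; it can fail too, so check it. */
    if (fclose(fp) == EOF)
        status = -1;

    return status;
}
```

Calling save_line("notes.txt", "hello") from main() would create notes.txt in the current directory; passing a path inside a directory that does not exist makes fopen() return NULL, so the function returns -1.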

3. File Opening Modes

Modes specify the purpose for which the file is being opened.

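Mode    Meaning       Description
"r"     Read          Opens an existing file for reading only. Fails (returns NULL) if the file does not exist.
"w"     Write         Creates a new file, or truncates an existing one to zero length, for writing.
"a"     Append        Adds data to the end of the file. Creates the file if it does not exist.
"r+"    Read/Write    Opens an existing file for both reading and writing. Existing content is kept.
"w+"    Write/Read    Creates a new file, or truncates an existing one, for both writing and reading.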

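The effect of the mode string is easiest to see by writing to the same file twice. The sketch below is illustrative only; put_text and get_text are hypothetical helper names, and the file is assumed to hold a single short line.

```c
#include <stdio.h>

/* Write `text` to `path` using the given mode, e.g. "w" or "a".
   Returns 0 on success, -1 on failure. */
int put_text(const char *path, const char *mode, const char *text)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    int ok = fputs(text, fp) != EOF;
    if (fclose(fp) == EOF)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the first line of `path` into `buf` (at most size-1 characters).
   Returns 0 on success, -1 if the file is missing or empty. */
int get_text(const char *path, char *buf, int size)
{
    FILE *fp = fopen(path, "r");      /* "r" fails if the file is absent */
    if (fp == NULL)
        return -1;
    char *line = fgets(buf, size, fp);
    fclose(fp);
    return line != NULL ? 0 : -1;
}
```

Writing "abc" with mode "w" and then "de" with mode "a" leaves "abcde" in the file; writing "x" with mode "w" afterwards leaves only "x", because "w" truncates the file before writing.
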
4. Text vs. Binary Files

C distinguishes between two types of file formats:

  • Text Files (.txt): Store data as a sequence of characters (for example, ASCII). They are human-readable but less efficient for numeric data. Special characters may be translated: on Windows, for example, a newline (\n) is stored on disk as \r\n.
  • Binary Files (.dat, .bin): Store data as the same bytes it occupies in memory. They are not human-readable, but they are compact and fast to read and write for large datasets. No character translation occurs.
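
To make the storage difference concrete, the sketch below stores the same int once as text and once as raw bytes and reports how many bytes each form takes. The function names store_as_text and store_as_binary are hypothetical.

```c
#include <stdio.h>

/* Store `value` as decimal text. Returns the number of characters
   written, or -1 on failure. */
int store_as_text(const char *path, int value)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    int written = fprintf(fp, "%d", value);   /* one char per digit */
    if (fclose(fp) == EOF)
        return -1;
    return written;
}

/* Store `value` as the raw bytes it occupies in memory. Returns the
   number of bytes written, or -1 on failure. */
int store_as_binary(const char *path, int value)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    size_t items = fwrite(&value, sizeof value, 1, fp);
    if (fclose(fp) == EOF || items != 1)
        return -1;
    return (int)sizeof value;                 /* fixed size for any value */
}
```

For the value 1234567 the text form takes 7 bytes, one per digit, while the binary form always takes sizeof(int) bytes (4 on most current platforms). For a small value such as 7 the text form is actually shorter, so the space advantage of binary files shows up mainly with large numbers and structures.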

5. Reading, Writing, and Appending

Various functions are used to transfer data between the program and the file.

Formatted I/O

  • fprintf(): fprintf(fp, "Format", variables); - Writes formatted data to a file.
  • fscanf(): fscanf(fp, "Format", &variables); - Reads formatted data from a file.
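
A round trip with fprintf() and fscanf() might look like this. The record layout (a roll number followed by marks) and the names write_record and read_record are assumptions made for the example.

```c
#include <stdio.h>

/* Write one record as a line of text: "<roll> <marks>\n".
   Returns 0 on success, -1 on failure. */
int write_record(const char *path, int roll, double marks)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    int ok = fprintf(fp, "%d %.2f\n", roll, marks) > 0;
    if (fclose(fp) == EOF)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the record back. Returns 0 on success, -1 on failure. */
int read_record(const char *path, int *roll, double *marks)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    /* fscanf() returns the number of items it converted; both the
       int and the double must be read for the record to be valid. */
    int matched = fscanf(fp, "%d %lf", roll, marks);
    fclose(fp);
    return matched == 2 ? 0 : -1;
}
```

Note that fscanf() needs %lf for a double, whereas fprintf() accepts %f; checking the return value of fscanf() is how a malformed or empty file is detected.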

Character I/O

  • fputc() / putc(): Writes a single character to a file.
  • fgetc() / getc(): Reads a single character from a file. Returns EOF at the end of the file or on a read error. The return type is int, so store the result in an int variable, not a char.
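
The usual character-by-character loop can be sketched as below. The helper names write_chars and count_char are hypothetical; the important detail is that the variable receiving fgetc() is an int.

```c
#include <stdio.h>

/* Write a string to a file one character at a time with fputc().
   Returns 0 on success, -1 on failure. */
int write_chars(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    for (const char *p = text; *p != '\0'; p++) {
        if (fputc(*p, fp) == EOF) {
            fclose(fp);
            return -1;
        }
    }
    return fclose(fp) == EOF ? -1 : 0;
}

/* Count how often `target` occurs in the file.
   Returns the count, or -1 if the file cannot be opened. */
long count_char(const char *path, char target)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    long count = 0;
    int ch;                       /* int, so EOF is distinct from data */
    while ((ch = fgetc(fp)) != EOF) {
        if (ch == (unsigned char)target)
            count++;
    }
    fclose(fp);
    return count;
}
```

The loop ends when fgetc() returns EOF, which happens both at the end of the file and on a read error; feof() or ferror() can be used afterwards to tell the two apart.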

Block I/O (Binary)

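  • fwrite(): fwrite(ptr, size, count, fp); - Writes count items of size bytes each from memory (like an array of structures) to a file. Returns the number of items written.
  • fread(): fread(ptr, size, count, fp); - Reads up to count items of size bytes each from a file into memory. Returns the number of items read.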
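
As a sketch of block I/O, the code below saves an array of structures in one fwrite() call and loads it back with fread(). The struct student layout and the names save_students and load_students are assumptions made for the example.

```c
#include <stdio.h>

struct student {
    int   roll;
    char  name[20];
    float marks;
};

/* Write n records to a binary file in a single call.
   Returns 0 on success, -1 on failure. */
int save_students(const char *path, const struct student *list, size_t n)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    size_t written = fwrite(list, sizeof *list, n, fp);
    if (fclose(fp) == EOF || written != n)
        return -1;
    return 0;
}

/* Read up to `max` records. Returns how many were actually read
   (0 if the file cannot be opened). */
size_t load_students(const char *path, struct student *list, size_t max)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;
    size_t got = fread(list, sizeof *list, max, fp);
    fclose(fp);
    return got;
}
```

Because the bytes are written exactly as they sit in memory, a file produced this way should be read back by a program built with the same structure layout; it is not a portable exchange format.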

6. Creating Header Files

A header file is a file with a .h extension that contains C function declarations (prototypes), macro definitions, and type definitions. It lets multiple source files share the same declarations.

  1. Write the declarations and macros in a header file (e.g., mymath.h) and the function bodies in a matching source file (e.g., mymath.c).
  2. Include the header in your main program using double quotes: #include "mymath.h".
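
As a rough sketch of how a header such as mymath.h is commonly laid out, the block below keeps everything in one file so that it compiles on its own; the comments mark which part would go in the header and which in a separate source file. The names mymath_square, mymath_circle_area, and MYMATH_PI, and the include guard MYMATH_H, are hypothetical.

```c
/* ---- this part would live in mymath.h ---- */
#ifndef MYMATH_H          /* include guard: skip the body if the   */
#define MYMATH_H          /* header has already been included once */

#define MYMATH_PI 3.14159

/* Declarations only: they tell the compiler the functions exist. */
int    mymath_square(int x);
double mymath_circle_area(double radius);

#endif /* MYMATH_H */

/* ---- this part would live in mymath.c, which would begin
        with the line: #include "mymath.h" ---- */
int mymath_square(int x)
{
    return x * x;
}

double mymath_circle_area(double radius)
{
    return MYMATH_PI * radius * radius;
}
```

In a real project, any source file that needs these functions would contain #include "mymath.h" and be compiled together with mymath.c. The include guard keeps the declarations from being processed twice when the header is pulled in through more than one path.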

7. Preprocessor Directives and Macros

The preprocessor is a tool that processes the source code before it is passed to the compiler. Directives start with a # symbol.

Common Directives

  • #include: Includes the content of a header file.
  • #define: Used to create symbolic constants or Macros.
  • #undef: Undefines an existing macro.
  • #ifdef / #ifndef: Conditional compilation directives used to include or exclude code blocks.
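
The directives above can be seen working together in the short sketch below. The macro names BUFFER_SIZE, DEBUG, and FEATURE_X and the three functions are hypothetical, chosen only to show each directive's effect.

```c
#define BUFFER_SIZE 64     /* symbolic constant */
#define DEBUG              /* no value needed: it acts as a switch */

/* #ifdef keeps the first branch only when DEBUG is defined. */
const char *build_kind(void)
{
#ifdef DEBUG
    return "debug";
#else
    return "release";
#endif
}

/* #undef removes the old definition so the name can be redefined. */
#undef BUFFER_SIZE
#define BUFFER_SIZE 128

int buffer_size(void)
{
    return BUFFER_SIZE;    /* uses the most recent definition */
}

/* #ifndef keeps the first branch when the macro is NOT defined.
   FEATURE_X is never defined in this file. */
int has_feature_x(void)
{
#ifndef FEATURE_X
    return 0;
#else
    return 1;
#endif
}
```

Here build_kind() returns "debug", buffer_size() returns 128, and has_feature_x() returns 0. Removing the #define DEBUG line would switch build_kind() to "release" without touching the function itself.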

Macros

Definition: A macro is a fragment of code which has been given a name. Whenever the name is used, it is replaced by the contents of the macro.

Example: #define PI 3.14 (Object-like macro) or #define SQUARE(x) ((x)*(x)) (Function-like macro).
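
Because a macro is replaced as text, the parentheses in a function-like macro matter. The sketch below compares SQUARE with a deliberately unparenthesized variant; BAD_SQUARE and the three functions are hypothetical names used only for the comparison.

```c
#define PI 3.14                       /* object-like macro */
#define SQUARE(x) ((x) * (x))         /* function-like macro */
#define BAD_SQUARE(x) x * x           /* same idea, no parentheses */

/* Expands to ((a + b) * (a + b)). */
int square_of_sum(int a, int b)
{
    return SQUARE(a + b);
}

/* Expands to a + b * a + b, so the multiplication binds first. */
int bad_square_of_sum(int a, int b)
{
    return BAD_SQUARE(a + b);
}

/* Macros can be combined: expands to 3.14 * ((r) * (r)). */
double circle_area(double r)
{
    return PI * SQUARE(r);
}
```

With a = 2 and b = 3, square_of_sum() gives 25, while bad_square_of_sum() gives 2 + 3 * 2 + 3 = 11.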

8. Exam Focus Enhancements

Exam Tips

  • EOF: Always check for EOF (End of File) while reading character-by-character to avoid infinite loops.
  • Error Handling: In exams, always check if fp == NULL after fopen(). This demonstrates robust coding.
  • Append vs Write: Remember that "w" mode deletes previous content, while "a" mode preserves it.

Common Mistakes

  • Forgetting fclose(): Failing to close a file can lead to data loss and "file locked" errors in the OS.
  • Binary Mode: Forgetting to add 'b' to the mode (e.g., "rb", "wb") when working with binary files.
  • Macro Parentheses: Forgetting parentheses in macro definitions (e.g., #define MUL(a, b) a * b). If you call MUL(2+3, 4), it becomes 2 + 3 * 4 = 14 instead of 20. Correct: ((a)*(b)).

Frequently Asked Questions

Q: What is the return value of fclose()?
A: It returns 0 on success and EOF if an error occurs during closing.

Q: Why use binary files over text files?
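A: Binary files are generally faster for reading/writing complex data like structures, because the data is copied without being converted to and from text. For numeric data they usually take up less space on the disk as well.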

Q: What is the difference between #include <file.h> and #include "file.h"?
A: Angle brackets <> search in the system include directories; double quotes "" typically search the directory of the current source file first, then fall back to the system directories.
