Unit 14: File Structure
1. Categories of Files
A file is a collection of related data stored on a secondary storage device. In C, files are treated as a stream of bytes. They are primarily categorized based on how the data is stored and accessed.
- Sequential Access Files: Data must be accessed in the order it was written. To read the 10th record, you must pass through the first nine.
- Random (Direct) Access Files: Any record can be reached directly, without reading the data before it, by moving the file position with functions such as fseek(), ftell(), and rewind().
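The difference between the two access styles can be sketched as follows. This is a minimal illustration, assuming a binary file that holds nothing but int records; the helper names read_record_direct() and read_record_sequential() are hypothetical and exist only for this example.

```c
#include <stdio.h>

/* Hypothetical helper: reads record number `index` (0-based) from a
   binary file of int records by jumping straight to its position. */
int read_record_direct(FILE *fp, long index, int *out)
{
    /* Move the file position to the first byte of the requested record. */
    if (fseek(fp, index * (long)sizeof(int), SEEK_SET) != 0)
        return 0;
    return fread(out, sizeof(int), 1, fp) == 1;
}

/* Hypothetical helper: reaches the same record the sequential way,
   reading every earlier record first. */
int read_record_sequential(FILE *fp, long index, int *out)
{
    rewind(fp);                       /* start again from the beginning */
    for (long i = 0; i <= index; i++) {
        if (fread(out, sizeof(int), 1, fp) != 1)
            return 0;                 /* file ended before the record */
    }
    return 1;
}
```

Both helpers return 1 when the record was read and 0 otherwise. The direct version performs one read no matter how large `index` is, while the sequential version performs `index + 1` reads.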
2. Opening and Closing Files
Before performing any operation on a file, it must be opened, and after the work is done, it must be closed to free resources.
FILE Pointer
In C, all file operations use a special structure called FILE defined in stdio.h. We declare a pointer of this type to track the file.
FILE *fp;
fopen() Function
Used to open a file. It returns a pointer to the FILE structure for the opened stream if successful, otherwise NULL.
fp = fopen("filename.txt", "mode");
fclose() Function
Used to close the file. It ensures all data is properly written (flushed) from the buffer to the disk.
fclose(fp);
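Put together, the open, use, close sequence might look like the sketch below. The function name write_greeting() and the text it writes are assumptions made for this illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: writes one line to the file at `path`.
   Returns 0 on success, -1 if opening, writing, or closing fails. */
int write_greeting(const char *path)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL) {                 /* fopen() failed: never use fp */
        perror("fopen");
        return -1;
    }

    int ok = (fputs("Hello, file!\n", fp) != EOF);

    /* fclose() flushes the buffer; it returns EOF if that fails. */
    if (fclose(fp) == EOF)
        ok = 0;

    return ok ? 0 : -1;
}
```

A call such as write_greeting("notes.txt") would create notes.txt in the current directory, assuming the program is allowed to write there.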
3. File Opening Modes
Modes specify the purpose for which the file is being opened.
| Mode | Meaning | Description |
|---|---|---|
| "r" | Read | Opens an existing file for reading only. |
| "w" | Write | Creates a new file, or erases (truncates) an existing one, for writing only. |
| "a" | Append | Adds data to the end of a file; creates the file if it does not exist. |
| "r+" | Read/Write | Opens an existing file for both reading and writing. |
| "w+" | Write/Read | Creates a new file, or erases (truncates) an existing one, for both reading and writing. |
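The effect of "w" versus "a" can be checked with a small experiment. The sketch below uses two hypothetical helpers, put_line() and count_lines(), written only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: opens `path` in the given mode and writes one line.
   Returns 0 on success, -1 on failure. */
int put_line(const char *path, const char *mode, const char *line)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    fprintf(fp, "%s\n", line);
    return fclose(fp) == 0 ? 0 : -1;
}

/* Hypothetical helper: counts newline characters in a file.
   Returns -1 if the file cannot be opened ("r" needs an existing file). */
int count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    int c, lines = 0;
    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            lines++;
    }
    fclose(fp);
    return lines;
}
```

Calling put_line() with "w" and then with "a" on the same file leaves two lines in it. Calling it with "w" once more leaves only one, because "w" discards the old content before writing.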
4. Text vs. Binary Files
C distinguishes between two types of file formats:
- Text Files (.txt): Store data as a sequence of characters (typically ASCII). They are human-readable but less efficient for numeric data. Special characters may be translated: on Windows, for example, a newline (\n) is stored on disk as \r\n.
- Binary Files (.dat, .bin): Store data exactly as it appears in memory (0s and 1s). They are not human-readable but are highly efficient and faster for large datasets. No character translation occurs.
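One way to see the difference is to store the same number in both formats and compare the file sizes. The helpers save_as_text(), save_as_binary(), and file_size() below are hypothetical names used only for this sketch.

```c
#include <stdio.h>

/* Hypothetical helper: stores `value` as decimal digits (text format). */
int save_as_text(const char *path, int value)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    fprintf(fp, "%d", value);         /* one byte per digit */
    return fclose(fp) == 0 ? 0 : -1;
}

/* Hypothetical helper: stores the bytes of `value` exactly as in memory. */
int save_as_binary(const char *path, int value)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    size_t n = fwrite(&value, sizeof value, 1, fp);
    if (fclose(fp) == EOF)
        return -1;
    return n == 1 ? 0 : -1;
}

/* Hypothetical helper: returns the size of a file in bytes, or -1. */
long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    long size = -1;
    if (fseek(fp, 0L, SEEK_END) == 0)
        size = ftell(fp);
    fclose(fp);
    return size;
}
```

Saving 1234567 as text produces a 7-byte file (one byte per digit), while the binary file is always sizeof(int) bytes, which is 4 on most current platforms.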
5. Reading, Writing, and Appending
Various functions are used to transfer data between the program and the file.
Formatted I/O
- fprintf(): Writes formatted data to a file.
  fprintf(fp, "Format", variables);
- fscanf(): Reads formatted data from a file.
  fscanf(fp, "Format", &variables);
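As a rough sketch of formatted I/O, the two functions below save one student record as text and read it back. The record layout (name, roll number, marks) and the names save_student() and load_student() are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: writes one record as "name roll marks". */
int save_student(const char *path, const char *name, int roll, float marks)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    fprintf(fp, "%s %d %.2f\n", name, roll, marks);
    return fclose(fp) == 0 ? 0 : -1;
}

/* Hypothetical helper: reads the record back.
   `name` must have room for at least 50 characters. */
int load_student(const char *path, char *name, int *roll, float *marks)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    /* %49s limits the name to 49 characters plus the terminating '\0'.
       fscanf() returns the number of items it converted successfully. */
    int fields = fscanf(fp, "%49s %d %f", name, roll, marks);
    fclose(fp);
    return fields == 3 ? 0 : -1;
}
```

Checking that fscanf() returned 3 is what tells the caller that all three values were actually read. Note that %s stops at the first space, so this sketch only works for single-word names.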
Character I/O
- fputc() / putc(): Writes a single character to a file.
- fgetc() / getc(): Reads a single character from a file. Returns EOF at the end of the file or on a read error.
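A typical character-by-character loop is sketched below as a file copy. The function name copy_file() is hypothetical. The important detail is that the character is stored in an int, not a char, so that EOF can be distinguished from every valid character.

```c
#include <stdio.h>

/* Hypothetical helper: copies `src` to `dst` one character at a time.
   Returns the number of characters copied, or -1 on failure. */
long copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "r");
    if (in == NULL)
        return -1;

    FILE *out = fopen(dst, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    int c;                            /* int, so it can hold EOF */
    long copied = 0;
    while ((c = fgetc(in)) != EOF) {  /* loop ends at end of file */
        if (fputc(c, out) == EOF)
            break;                    /* write error: stop copying */
        copied++;
    }

    fclose(in);
    if (fclose(out) == EOF)
        return -1;
    return copied;
}
```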
Block I/O (Binary)
- fwrite(): Writes a block of memory (like a structure) to a file.
- fread(): Reads a block of memory from a file.
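Block I/O is most often used with structures. The sketch below assumes a hypothetical struct Student and two helper functions, save_students() and load_students(), invented for this illustration. Both fwrite() and fread() return the number of complete items transferred, which is what the helpers pass back.

```c
#include <stdio.h>

/* Hypothetical record type used only for this example. */
struct Student {
    int   roll;
    char  name[20];
    float marks;
};

/* Writes `count` records in one call. Returns the number written. */
size_t save_students(const char *path, const struct Student *list, size_t count)
{
    FILE *fp = fopen(path, "wb");     /* "b": binary mode */
    if (fp == NULL)
        return 0;
    size_t written = fwrite(list, sizeof(struct Student), count, fp);
    if (fclose(fp) == EOF)
        return 0;
    return written;
}

/* Reads up to `max` records. Returns the number actually read. */
size_t load_students(const char *path, struct Student *list, size_t max)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;
    size_t got = fread(list, sizeof(struct Student), max, fp);
    fclose(fp);
    return got;
}
```

A file written this way stores the structure's in-memory layout, so it should be read back by a program compiled for the same platform with the same structure definition.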
6. Creating Header Files
A header file is a file with a .h extension that contains C function declarations and macro definitions. It allows you to share code across multiple source files.
- Write your functions and definitions in a file (e.g., mymath.h).
- Include it in your main program using double quotes: #include "mymath.h".
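A possible mymath.h is sketched below. Its contents (the MYMATH_PI constant and the two functions) are assumptions made for illustration. The #ifndef / #define / #endif wrapper is the usual "include guard", which keeps the header from being processed twice in the same source file. The functions are marked static inline so the header can be included from several source files without duplicate-definition errors; a larger project would more commonly keep only declarations in the header and put the function bodies in a separate .c file.

```c
/* mymath.h: a hypothetical header for this example. */
#ifndef MYMATH_H
#define MYMATH_H

/* Macro definition shared by every file that includes this header. */
#define MYMATH_PI 3.14159

/* Returns x multiplied by itself. */
static inline int square(int x)
{
    return x * x;
}

/* Returns the area of a circle with radius r. */
static inline double circle_area(double r)
{
    return MYMATH_PI * r * r;
}

#endif /* MYMATH_H */
```

A source file in the same directory would then write #include "mymath.h" at the top and call square(5) or circle_area(2.0) like any other function.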
7. Preprocessor Directives and Macros
The preprocessor is a tool that processes the source code before it is passed to the compiler. Directives start with a # symbol.
Common Directives
- #include: Includes the content of a header file.
- #define: Used to create symbolic constants or Macros.
- #undef: Undefines an existing macro.
- #ifdef / #ifndef: Conditional compilation directives used to include or exclude code blocks.
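The directives above can be combined as in the following sketch. The names DEBUG, LOG, BUFFER_SIZE, buffer_size(), and debug_enabled() are all hypothetical and chosen only for this example.

```c
#include <stdio.h>

#define DEBUG                 /* remove this line to switch logging off */

/* #ifdef: choose one of two definitions depending on whether DEBUG exists. */
#ifdef DEBUG
#define LOG(msg) fprintf(stderr, "debug: %s\n", msg)
#else
#define LOG(msg) ((void)0)    /* expands to a statement that does nothing */
#endif

/* #ifndef: supply a default only if nothing defined BUFFER_SIZE earlier. */
#ifndef BUFFER_SIZE
#define BUFFER_SIZE 256
#endif

int buffer_size(void)
{
    LOG("buffer_size called");
    return BUFFER_SIZE;
}

/* #undef: from this point on, DEBUG is no longer defined. */
#undef DEBUG

int debug_enabled(void)
{
#ifdef DEBUG
    return 1;
#else
    return 0;                 /* this branch is compiled, DEBUG is undefined */
#endif
}
```

All of these decisions are made before compilation starts, so the excluded branches never reach the compiler at all.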
Macros
Definition: A macro is a fragment of code which has been given a name. Whenever the name is used, it is replaced by the contents of the macro.
Example: #define PI 3.14 (Object-like macro) or #define SQUARE(x) ((x)*(x)) (Function-like macro).
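The two macros from the example can be used as sketched below. The functions circle_area() and square_of_sum() are hypothetical wrappers added only to show the macros in use.

```c
#define PI 3.14                   /* object-like macro */
#define SQUARE(x) ((x) * (x))     /* function-like macro */

/* After preprocessing: return 3.14 * ((radius) * (radius)); */
double circle_area(double radius)
{
    return PI * SQUARE(radius);
}

/* After preprocessing: return ((a + b) * (a + b)); */
int square_of_sum(int a, int b)
{
    return SQUARE(a + b);
}
```

The replacement is purely textual: the preprocessor pastes the macro body in place of the name before the compiler sees the code.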
8. Exam Focus Enhancements
- EOF: Always check for EOF (End of File) while reading character-by-character to avoid infinite loops.
- Error Handling: In exams, always check if fp == NULL after fopen(). This demonstrates robust coding.
- Append vs Write: Remember that "w" mode deletes previous content, while "a" mode preserves it.
- Forgetting fclose(): Failing to close a file can lead to data loss and "file locked" errors in the OS.
- Binary Mode: Forgetting to add 'b' to the mode (e.g., "rb", "wb") when working with binary files. On systems that translate newlines in text mode (such as Windows), this can corrupt the data.
- Macro Parentheses: Forgetting parentheses in macro definitions (e.g., #define MUL(a, b) a * b). If you call MUL(2+3, 4), it becomes 2 + 3 * 4 = 14 instead of 20. Correct: #define MUL(a, b) ((a)*(b)).
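The macro pitfall can be reproduced directly. In the sketch below the names MUL_BAD, MUL_GOOD, bad_product(), and good_product() are hypothetical; they stand for the unparenthesized and parenthesized versions of MUL described above.

```c
#define MUL_BAD(a, b)  a * b            /* no parentheses */
#define MUL_GOOD(a, b) ((a) * (b))      /* fully parenthesized */

/* Expands to: return 2 + 3 * 4;  multiplication binds first, giving 14. */
int bad_product(void)
{
    return MUL_BAD(2 + 3, 4);
}

/* Expands to: return ((2 + 3) * (4));  giving 20 as intended. */
int good_product(void)
{
    return MUL_GOOD(2 + 3, 4);
}
```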
Q: What is the return value of fclose()?
A: It returns 0 on success and EOF if an error occurs during closing.
Q: Why use binary files over text files?
A: Binary files are usually faster for reading/writing complex data like structures, because values are stored in their in-memory form with no conversion to and from text, and they often take up less space on the disk.
Q: What is the difference between #include <file.h> and #include "file.h"?
A: Angle brackets <> search in system directories; double quotes "" search the directory of the including file (usually the local project directory) first, then fall back to the system directories.