Unit 1: Arrays, Stacks, and Recursion
Table of Contents
1. Arrays
An array is a linear data structure that stores a collection of elements of the same data type in contiguous memory locations. Each element is identified by at least one array index or key.
Single-dimensional Arrays
A single-dimensional (1D) array is the simplest form, represented as a single row or column of elements.
- Declaration:
int arr[10];(This declares an integer array namedarrthat can hold 10 elements). - Indexing: Elements are accessed using an index, which starts at 0 and goes up to size-1. The first element is
arr[0], and the last isarr[9]. - Memory: If the base address (address of
arr[0]) isBand the size of the data type (e.g.,int) isWbytes, the address of any elementarr[i]can be calculated as:Address(arr[i]) = B + i * W
Multi-dimensional Arrays
A multi-dimensional array has more than one dimension. The most common is the two-dimensional (2D) array, which can be visualized as a table or grid with rows and columns.
- Declaration:
int matrix[3][4];(This declares a 2D array with 3 rows and 4 columns). - Accessing: Elements are accessed using two indices:
matrix[row][col]. For example,matrix[1][2]. - Memory Representation: Although we visualize it as a grid, memory is linear. 2D arrays are flattened into 1D in two ways:
- Row-Major Order: (Default in C/C++/Java) All elements of the first row are stored contiguously, followed by all elements of the second row, and so on.
Address Calculation (Row-Major) for
arr[i][j]in anM x Narray (M rows, N cols):Address(arr[i][j]) = B + (i * N + j) * W
- Column-Major Order: (Default in FORTRAN) All elements of the first column are stored contiguously, followed by all elements of the second column, and so on.
Address Calculation (Column-Major) for
arr[i][j]in anM x Narray:Address(arr[i][j]) = B + (j * M + i) * W
- Row-Major Order: (Default in C/C++/Java) All elements of the first row are stored contiguously, followed by all elements of the second row, and so on.
Sparse Matrices
A sparse matrix is a matrix in which most of the elements are zero. It is inefficient to store this matrix in a standard 2D array because a huge amount of memory is wasted on storing zeros.
Array Representation (Triplet Representation)
This method stores only the non-zero elements. A 2D array is created with three columns:
- Column 1: Row index of the non-zero element.
- Column 2: Column index of the non-zero element.
- Column 3: The non-zero value itself.
The first row of this representation is often used to store the metadata: (Total Rows, Total Columns, Total Non-Zero Elements).
Example:
Original Matrix (4x4):
[ 0, 0, 3, 0 ]
[ 5, 0, 0, 0 ]
[ 0, 1, 0, 0 ]
[ 0, 0, 0, 2 ]
Array (Triplet) Representation (5x3):
| Row | Col | Value |
|---|---|---|
| 4 | 4 | 4 |
| 0 | 2 | 3 |
| 1 | 0 | 5 |
| 2 | 1 | 1 |
| 3 | 3 | 2 |
Linked Representation
This method uses a linked list to store the non-zero elements. Each node in the linked list contains:
- Row index
- Column index
- Value
- Pointer to the next node
A header node can store the matrix dimensions. This is more flexible than the array representation as it doesn't require pre-allocating space for the non-zero elements.
2. Stacks
A Stack is a linear data structure that follows the LIFO (Last-In, First-Out) principle. This means the element added last is the first one to be removed. Think of it as a stack of plates.
The primary operations are:
- push(element): Adds an element to the top of the stack.
- pop(): Removes and returns the element from the top of the stack.
- peek() or top(): Returns the top element without removing it.
- isEmpty(): Checks if the stack is empty.
- isFull(): Checks if the stack is full (relevant for array implementation).
Implementing Stacks in an Array
A stack can be easily implemented using a 1D array and a variable top that keeps track of the index of the last element added.
- Initialization: Create an array
stack[MAX_SIZE]and initializetop = -1(indicating the stack is empty). - push(value):
- Check for Stack Overflow (if
top == MAX_SIZE - 1). - Increment
top(top = top + 1). - Set
stack[top] = value.
- Check for Stack Overflow (if
- pop():
- Check for Stack Underflow (if
top == -1). - Get the value:
value = stack[top]. - Decrement
top(top = top - 1). - Return
value.
- Check for Stack Underflow (if
Implementing Multiple Stacks in an Array
It is possible to implement two stacks in a single array efficiently. One stack starts from index 0 (grows right), and the second stack starts from index MAX_SIZE - 1 (grows left).
- Stack 1:
top1starts at-1and increments on push. - Stack 2:
top2starts atMAX_SIZEand decrements on push. - Overflow Condition: The stacks are full only when
top1 + 1 == top2. This setup utilizes the array space much more efficiently.
Prefix, Infix, and Postfix Expressions
These are different notations for writing arithmetic expressions.
- Infix: The standard notation where the operator is between the operands. (e.g.,
A + B)- Problem: Requires parentheses and rules of precedence (BODMAS) to define the order of evaluation.
- Prefix (Polish Notation): The operator comes before the operands. (e.g.,
+ A B) - Postfix (Reverse Polish Notation): The operator comes after the operands. (e.g.,
A B +)
Utility: Prefix and Postfix notations do not require parentheses or precedence rules. They are ideal for computer evaluation, which is why stacks are used.
| Infix | Prefix | Postfix |
|---|---|---|
| A + B | + A B | A B + |
| (A + B) * C | * + A B C | A B + C * |
| A + (B * C) | + A * B C | A B C * + |
Utility and Conversion of Expressions
Algorithm: Infix to Postfix Conversion
This is a classic stack application. We use a stack to hold operators and parentheses.
- Create an empty stack (for operators) and an empty string (for the postfix result).
- Scan the infix expression from left to right.
- If the character is an operand (e.g., A, B, 1, 2): Append it directly to the postfix string.
- If the character is an open parenthesis '(': Push it onto the stack.
- If the character is a close parenthesis ')': Pop operators from the stack and append them to the postfix string until an open parenthesis '(' is encountered. Pop and discard the '('.
- If the character is an operator:
- While the stack is not empty AND the top of the stack is an operator with higher or equal precedence than the current operator:
- Pop the operator from the stack and append it to the postfix string.
- After the loop, push the current operator onto the stack.
- After scanning the entire infix expression, pop any remaining operators from the stack and append them to the postfix string.
* and / have higher precedence than + and -). Associativity (left-to-right) is handled by the "equal precedence" part of the rule. Algorithm: Evaluation of Postfix Expression
This also uses a stack, but this time to hold operands (numbers).
- Create an empty stack (for operands).
- Scan the postfix expression from left to right.
- If the character is an operand: Push its value onto the stack.
- If the character is an operator:
- Pop the top two operands from the stack (e.g.,
operand2 = pop(),operand1 = pop()). - Perform the operation:
result = operand1 operator operand2. - Push the
resultback onto the stack.
- Pop the top two operands from the stack (e.g.,
- After scanning the entire expression, the final result will be the only value left in the stack. Pop and return it.
Example: Evaluate 2 3 + 5 *
- Scan '2': Push 2. Stack: [2]
- Scan '3': Push 3. Stack: [2, 3]
- Scan '+': Pop 3, Pop 2. Calculate 2 + 3 = 5. Push 5. Stack: [5]
- Scan '5': Push 5. Stack: [5, 5]
- Scan '*': Pop 5, Pop 5. Calculate 5 * 5 = 25. Push 25. Stack: [25]
- End of expression. Pop 25. Result is 25.
Applications of Stack
- Expression Evaluation: Infix to Postfix/Prefix conversion and evaluation.
- Function Call Management: The "Call Stack" stores activation records (local variables, return addresses) for function calls. This is how recursion is managed.
- Backtracking: Used in algorithms like maze-solving or finding paths in a graph (Depth First Search).
- Undo/Redo Operations: In text editors or software, a stack can store previous states.
- Parenthesis Checking: Validating if an expression has balanced parentheses.
Limitations of Array Representation of Stack
- Fixed Size: The array size is declared at compile time (static). This leads to two problems:
- Stack Overflow: If the stack size is too small, pushing a new element will fail.
- Wasted Memory: If the stack size is too large, but only a few elements are ever used, the remaining space is wasted.
- Cost of Resizing: While dynamic arrays can be used, resizing (creating a new, larger array and copying all elements) is an O(n) operation, which can be slow.
(These limitations are solved by using a Linked List to implement a stack, as seen in Unit 2).
3. Recursion
Recursion is a programming technique where a function calls itself, either directly or indirectly. A recursive function solves a problem by breaking it down into smaller, simpler versions of the same problem.
Recursive Definitions and Implementation
A recursive definition (and its implementation) must have two parts:
- Base Case (or Terminating Condition): This is the simplest version of the problem, which can be solved directly without further recursion. It stops the function from calling itself infinitely.
- Recursive Step: This part breaks the problem into a smaller subproblem and calls the function itself to solve that subproblem. It must make progress toward the base case.
Example: Factorial
Problem: Find the factorial of a non-negative integer n (n!).
- Recursive Definition:
- n! = 1, if n = 0 (Base Case)
- n! = n * (n-1)!, if n > 0 (Recursive Step)
- Implementation (Pseudocode):
function factorial(n): if n == 0: return 1 // Base Case else: return n * factorial(n - 1) // Recursive Step
Example: Fibonacci Series
Problem: Find the n-th Fibonacci number (0, 1, 1, 2, 3, 5, 8...)
- Recursive Definition:
- Fib(n) = 0, if n = 0 (Base Case 1)
- Fib(n) = 1, if n = 1 (Base Case 2)
- Fib(n) = Fib(n-1) + Fib(n-2), if n > 1 (Recursive Step)
- Implementation (Pseudocode):
function fibonacci(n): if n == 0: return 0 else if n == 1: return 1 else: return fibonacci(n - 1) + fibonacci(n - 2)
Advantages and Limitations of Recursion
Advantages:
- Elegance and Simplicity: Recursive solutions can be very clean and easy to read for problems that are naturally recursive (e.g., tree traversals, Tower of Hanoi).
- Problem Solving: It's a powerful tool for solving complex problems by breaking them down (Divide and Conquer strategy).
Limitations:
- Performance Overhead: Every function call adds an activation record to the call stack. This consumes memory and time. An iterative solution (using loops) is almost always faster and uses less memory.
- Stack Overflow: If the recursion is too deep (e.g.,
factorial(100000)), the call stack can run out of memory, crashing the program. - Redundant Computations: The Fibonacci example is a classic case of inefficiency.
fibonacci(5)callsfibonacci(4)andfibonacci(3). Butfibonacci(4)*also* callsfibonacci(3), leading to the same subproblem being solved multiple times. (This can be fixed with memoization).