Unit 4: Searching and Sorting
Table of Contents
1. Searching
Searching is the process of finding a particular element (the "key") within a collection of elements (like an array or list).
Linear Search
This is the simplest search algorithm. It sequentially checks each element in the list until the target key is found or the end of the list is reached.
- Algorithm:
- Start from the first element (index 0).
- Compare the current element with the
key. - If they match, return the current index (element found).
- If they don't match, move to the next element.
- If the end of the list is reached, return -1 (element not found).
- Prerequisite: None. The list does not need to be sorted.
- Time Complexity:
- Best Case: O(1) (Key is the first element).
- Worst Case: O(n) (Key is the last element or not present).
- Average Case: O(n).
Binary Search
This is a much more efficient search algorithm, but it has one major prerequisite: the list must be sorted.
It works on the "Divide and Conquer" principle.
- Algorithm (Iterative):
- Set
low = 0andhigh = n - 1(wherenis the number of elements). - Loop while
low <= high: - Calculate
mid = floor((low + high) / 2). - Compare the element at
arr[mid]with thekey:- If
arr[mid] == key, returnmid(element found). - If
arr[mid] < key, the key must be in the right half. Setlow = mid + 1. - If
arr[mid] > key, the key must be in the left half. Sethigh = mid - 1.
- If
- If the loop finishes (
low > high), return -1 (element not found).
- Set
- Prerequisite: The array must be sorted.
- Time Complexity:
- Best Case: O(1) (Key is the middle element).
- Worst Case: O(log n) (Key is at the end or not present).
- Average Case: O(log n).
mid = low + (high - low) / 2 is a safer way to write mid = (low + high) / 2 for very large arrays. Comparison of Linear and Binary Search
| Feature | Linear Search | Binary Search |
|---|---|---|
| Prerequisite | None. Works on sorted or unsorted data. | Data must be sorted. |
| Time Complexity | O(n) (linear) | O(log n) (logarithmic) |
| Speed | Slow for large datasets. | Very fast, even for large datasets. |
| Data Structure | Works on arrays and linked lists. | Works efficiently only on arrays (for O(1) access to the mid element). |
2. Sorting
Sorting is the process of arranging elements in a collection in a specific order (e.g., ascending or descending).
Key Sorting Concepts (Stable vs. Unstable)
- In-Place Sort: An algorithm that sorts the data using only a small, constant amount of extra memory (O(1) space). It modifies the original array directly. (e.g., Bubble, Selection, Insertion, Quick Sort).
- Out-of-Place Sort: An algorithm that requires extra memory (e.g., O(n) space) to hold temporary data. (e.g., Merge Sort).
- Stable Sort: A sorting algorithm is stable if it preserves the relative order of equal-valued elements.
Example: If you have
(5, 'apple')and(5, 'banana')and you sort by number, a stable sort will guarantee that 'apple' comes before 'banana' in the output. An unstable sort might swap them.Stable: Bubble, Insertion, Merge Sort.
Unstable: Selection, Quick Sort.
Bubble Sort
The simplest sorting algorithm. It repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order. This "bubbles" the largest elements to the end.
- Algorithm:
- Outer loop from
i = 0ton-2. - Inner loop from
j = 0ton-i-2. - Compare
arr[j]andarr[j+1]. - If
arr[j] > arr[j+1], swap them.
- Outer loop from
- Time Complexity:
- Best Case: O(n) (if optimized and array is already sorted).
- Worst/Average Case: O(n²)
- Space: O(1) (In-place).
- Stable: Yes.
Selection Sort
This algorithm divides the list into a sorted sublist (at the beginning) and an unsorted sublist.
- Algorithm:
- Outer loop from
i = 0ton-2. (iis the boundary of the sorted sublist). - Assume
min_index = i. - Inner loop from
j = i+1ton-1to find the minimum element in the unsorted sublist. - If a smaller element is found, update
min_index. - After the inner loop, swap
arr[i]witharr[min_index].
- Outer loop from
- Time Complexity: O(n²) (Best, Worst, and Average). It always performs (n-1) passes.
- Space: O(1) (In-place).
- Stable: No (by default).
Insertion Sort
Builds the final sorted array one item at a time. It's like sorting a hand of playing cards. It takes an element from the unsorted part and "inserts" it into its correct position in the sorted part.
- Algorithm:
- Outer loop from
i = 1ton-1. (Element atarr[i]is thekeyto be inserted). - Store
key = arr[i]. - Inner loop (
j = i - 1) moves backward (j >= 0). - While
arr[j] > key, shift the element:arr[j+1] = arr[j]and decrementj. - When the loop stops, insert the
key:arr[j+1] = key.
- Outer loop from
- Time Complexity:
- Best Case: O(n) (If the array is already sorted).
- Worst/Average Case: O(n²)
- Space: O(1) (In-place).
- Stable: Yes.
Merge Sort
A "Divide and Conquer" algorithm. It's one of the most efficient sorting algorithms.
- Algorithm:
- Divide: Recursively divide the array into two halves until you have 1-element arrays (which are trivially sorted).
- Conquer: Merge the sorted subarrays back together. The
mergefunction is the core:- It takes two sorted subarrays.
- It creates a temporary array.
- It compares the first elements of both subarrays, copies the smaller one to the temp array, and advances that subarray's pointer.
- It repeats this until one subarray is empty, then copies the rest of the other subarray.
- Finally, it copies the merged (and sorted) data from the temp array back into the original array.
- Time Complexity: O(n log n) (Best, Worst, and Average).
- Space: O(n) (Out-of-place, for the temporary array).
- Stable: Yes.
Quick Sort
Another "Divide and Conquer" algorithm. It is generally the fastest sorting algorithm in practice.
- Algorithm:
- Divide: Pick an element from the array as the pivot (e.g., first, last, or random).
- Conquer: Partition the array. Reorder the array so that all elements smaller than the pivot are to its left, and all elements greater are to its right. The pivot is now in its final sorted position.
- Recurse: Recursively apply Quick Sort to the two subarrays (the one to the left of the pivot and the one to the right).
- Time Complexity:
- Best/Average Case: O(n log n).
- Worst Case: O(n²) (Occurs if the pivot is always the smallest or largest element, e.g., on an already-sorted array).
- Space: O(log n) (In-place sort, but uses O(log n) stack space for recursion).
- Stable: No.
Comparison of Sorting Techniques
| Algorithm | Time (Best) | Time (Average) | Time (Worst) | Space | Stable |
|---|---|---|---|---|---|
| Bubble Sort | O(n) | O(n²) | O(n²) | O(1) | Yes |
| Selection Sort | O(n²) | O(n²) | O(n²) | O(1) | No |
| Insertion Sort | O(n) | O(n²) | O(n²) | O(1) | Yes |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes |
| Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | No |
- For small or nearly sorted lists: Use Insertion Sort.
- For guaranteed O(n log n) performance and stability: Use Merge Sort.
- For the fastest average-case performance: Use Quick Sort (with a good pivot strategy).