Unit V: Hashing

Table of Contents

1. Introduction to Hashing

Hashing is a technique used to uniquely identify a specific object from a group of similar objects. In data structures, it provides a way to store and retrieve data in O(1) time on average.


2. Hash Functions & Characteristics

A good hash function is crucial for performance.

Characteristics of a Good Hash Function

Types of Hash Functions

  1. Division Method: h(k) = k mod m (where m is the table size).
  2. Mid-Square Method: The key is squared, and the middle digits are used as the hash value.
  3. Folding Method: The key is divided into parts, which are added together to form the index.

3. Collision Resolution

A Collision occurs when two different keys produce the same hash value (index). Since a table slot can usually hold only one record, we must resolve this conflict.


4. Open Addressing (Closed Hashing)

In Open Addressing, all elements are stored within the hash table itself. If a collision occurs, we search for the next available empty slot.

Linear Probing

We search for the next slot linearly (index + 1, index + 2...).

Quadratic Probing

We search using a quadratic function (index + 1², index + 2², index + 3²...).

Double Hashing

Uses a second hash function to determine the step size for probing.

Formula: index = (hash1(key) + i * hash2(key)) mod m

5. Closed Addressing (Open Hashing / Chaining)

In Closed Addressing (commonly known as Chaining), each slot of the hash table contains a link to a linked list.


6. Exam Focus & FAQs

Exam Tips

Frequently Asked Questions

Q: What is a Load Factor?
It is the ratio of the number of stored elements to the total table size (n/m). A higher load factor increases the chance of collisions.

Q: Why is a prime number often used for table size (m)?
Using a prime number as the divisor in the Division Method helps in distributing keys more uniformly.

Mnemonic: LCD

To remember Open Addressing techniques, think LCD: Linear, C(Quadratic starts with Q, but think 'Curve'), Double Hashing.