Unit 3: Trees

Table of Contents

1. Introduction to Tree as a Data Structure

A Tree is a non-linear, hierarchical data structure that consists of nodes connected by edges. It is one of the most important and widely used data structures.

Unlike linked lists, where each node points to one successor, a tree node can point to multiple successors (children).

Key Terminology

A Binary Tree is a specific type of tree in which every node can have at most two children: a left child and a right child.

Types of Binary Trees

2. Binary Search Trees (BST)

A Binary Search Tree is a binary tree with a special property that makes searching very efficient.

Properties

For any node N in a BST:
  1. All values in N's left subtree are less than N's value.
  2. All values in N's right subtree are greater than N's value.
  3. Both the left and right subtrees must also be binary search trees.

This property allows for fast searching, insertion, and deletion, with an average-case time complexity of O(log n). The worst case (for a skewed tree) is O(n).

Insertion in BST

To insert a new value:

  1. Start at the root.
  2. Compare the new value with the root's value.
  3. If the new value is less, move to the left child.
  4. If the new value is greater, move to the right child.
  5. Repeat this process until you reach a NULL pointer (an empty spot).
  6. Insert the new node at that empty spot.

Deletion in BST

This is the most complex operation. There are three cases:

  1. Case 1: Node to be deleted is a Leaf (0 children)
    • Simply remove the node (set its parent's pointer to NULL).
  2. Case 2: Node to be deleted has 1 child
    • "Bypass" the node: Link the node's parent directly to the node's child.
  3. Case 3: Node to be deleted has 2 children
    1. Find the node's "successor." This can be either:
      • The Inorder Successor: The smallest node in its right subtree (go right once, then all the way left).
      • The Inorder Predecessor: The largest node in its left subtree (go left once, then all the way right).
    2. Copy the value of the successor (or predecessor) into the node to be deleted.
    3. Delete the successor (or predecessor) node (which is an easier Case 1 or Case 2 deletion).

Recursive and Iterative Traversals

Traversal means visiting every node in the tree exactly once. The three main types are (L=Left, R=Right, P=Process/Visit Node):

1. Inorder Traversal (L-P-R)

2. Preorder Traversal (P-L-R)

3. Postorder Traversal (L-R-P)

Exam Tip: Given a preorder and inorder traversal, you can uniquely reconstruct a binary tree. This is a very common exam question.

3. Threaded Binary Trees

In a standard binary tree, there are many NULL pointers (especially at leaf nodes). A threaded binary tree re-uses these NULL pointers to "thread" the tree, making traversal more efficient.

A NULL right child pointer is replaced with a pointer (a "thread") to the node's Inorder Successor. A NULL left child pointer is replaced with a pointer to the node's Inorder Predecessor.

Two flag bits are needed in each node to indicate if the left/right pointers are normal child links or threads.

Insertion, Deletion, and Traversals

4. Height-Balanced Trees (AVL Trees)

A standard BST has a worst-case time complexity of O(n) if it becomes "skewed" (like a linked list). An AVL Tree (named after its inventors, Adelson-Velsky and Landis) is a self-balancing binary search tree.

Balance Factor

An AVL tree ensures that the heights of the two child subtrees of any node differ by at most one.

Balance Factor (BF) = Height(Left Subtree) - Height(Right Subtree)

For a tree to be an AVL tree, every single node must have a Balance Factor of -1, 0, or 1.

If an insertion or deletion causes any node's BF to become -2 or +2, the tree is "unbalanced" and must be "rebalanced."

Rotations (Operations)

Rebalancing is done using rotations. A rotation is a local operation that changes the structure of the tree to restore balance, while *preserving* the BST property.

There are four imbalance cases, which require two types of rotations:

1. Left-Left (LL) Case (BF = +2)

2. Right-Right (RR) Case (BF = -2)

3. Left-Right (LR) Case (BF = +2)

4. Right-Left (RL) Case (BF = -2)

By performing these rotations, the AVL tree guarantees that its height remains O(log n), so all operations (search, insert, delete) are O(log n) in the worst case.

5. Heap Tree

A Heap is a specialized tree-based data structure. It is a complete binary tree that satisfies the heap property.

A heap is NOT a Binary Search Tree. There is no left/right ordering property.

Heap Property:

Representation: Because heaps are complete binary trees, they are almost always stored in an array, not with pointers. For a node at index i:

Operations

Operations involve adding/removing elements and then "heapifying" (fixing the heap property).

1. Insertion (O(log n))

  1. Add the new element to the *end* of the array (the next open spot in the complete tree).
  2. "Bubble Up" (or Heapify-Up): Compare the new element with its parent.
  3. If it violates the heap property (e.g., in a Max-Heap, it's larger than its parent), swap them.
  4. Repeat this process (comparing and swapping with the parent) until the heap property is restored or the node reaches the root.

2. Deletion (Extract-Max / Extract-Min) (O(log n))

  1. The element to be removed is always the root (index 0).
  2. Store the root's value (to return it).
  3. Move the last element in the array to the root (index 0).
  4. "Bubble Down" (or Heapify-Down):
    • Compare the new root with its children.
    • If it violates the heap property, swap it with its larger child (for Max-Heap) or smaller child (for Min-Heap).
    • Repeat this process (comparing and swapping down the tree) until the heap property is restored or the node becomes a leaf.

Applications: Heaps are the default implementation for Priority Queues (Unit 2). They are also used in the HeapSort algorithm (Unit 4).