CST 370 Week 6 Learning Journal Entry

- February 16, 2025

Trees, Heaps, and Hashing

This week we learned about four types of data structures: AVL trees, 2-3 trees, heaps, and hashing.

AVL Trees

AVL Trees are balanced binary search trees that have a balance factor. Balance factors are computed by taking the difference between the height of the left subtree and the height of the right subtree. An AVL tree is said to be balanced if the balance factor is -1, 0, or 1. If the tree is unbalanced, one of four rotations can be used to transform the binary search tree into a balanced tree: Left Rotation, Right Rotation, Left-Right Rotation, or Right-Left Rotation. Below is how to determine what rotation is needed:

Left Rotation: The root has a negative number smaller than -1.

Right Rotation: The root has a positive number greater than 1.

Left-Right Rotation: The root has a positive number greater than 1, and the node on the left has a negative number. In this case, we rotate the node with the negative number to the left and then rotate the root with the positive number greater than 1 to the right.

Right-Left Rotation: The root has a negative number smaller than -1, and the node on the right has a positive number. In this case, we rotate the node on the right with the positive number to the right and then rotate the root with the negative number less than 1 to the left.

Below is an example of each rotation.

2-3 Trees

In addition to AVL trees, we also learned about 2-3 trees. 2-3 trees allow each node to have one value (2-node) or two values (3-node). If a node has two values, the left subtree has values that are less than the lesser of the two parent nodes, the right subtree has values that are greater than the greater of the two parent nodes, and the middle subtree contains values that are between the two parent nodes. When constructing a 2-3 tree, if a third value is added to a node, the middle value is promoted to be a parent node of the other two. This process is shown in the picture below:

Heaps

Heaps are considered to be complete binary trees where two specific conditions are met:

every level is filled except the last level
all nodes in the last level are far left

Heaps can be classified as a max heap or a min-heap. In a max heap, the parent nodes are greater than each of the child nodes. In a min heap, the parent nodes are smaller than each of the child nodes. In these trees, there is no requirement that the lesser of the children is on the left and the greater of the children is on the right as it is in an AVL tree. An example of the two can be found below.

Hashing

A hash table is a data structure that uses a technique called hashing to store data. This technique consists of three components:

keys (unique pieces of data)
a hash function (key%num)
hash table (where keys are stored)

The hash function determines where a key is stored in the hash table. For example, if I wanted to store 8 in a table of size 5, I would use the hash function (8%5 =3) to find that 8 should be stored in position 3 of the hash table.

In addition to these components, we also need to take into consideration what to do if two keys need to be stored at the same position of the hash table. For example, we know that the key of 8 is stored at position 3 in a table of size, but so is a key of 3 since 3%5=3. If this happens we can take two approaches to store the key of 3:

separate chaining - the key of 3 is appended to position 3 after key 8
linear probing - we search the table for the next available position and store 3 in that position

Linear Probing

Separate Chaining

In addition to these processes, we also need to keep track of a load factor as we insert keys into a hash. The load factor is computed by dividing the number of keys in the hash table and the table size. We can compute the load factor and compare it to a predefined load. We can check if the load factor is greater than the predefined load factor and if it is, we can double the table size and increase it to the next prime number. The previous keys are then entered into the new hash table and the keys can continue to be inserted. This process is called rehashing. An example of this can be found below.

Search This Blog

Edward's CS Journey