Data Structures Trees

Trees

Hierarchical data structure for representing relationships.

Definition

Trees are nodes connected in a hierarchy, where each node can point to one or more nodes below it, starting from a single top node known as the root. Each edge, or connection, represents a parent-child relationship. Trees do not contain cycles, ensuring there is one clear path from the root to any node. If there are N nodes, then there will be at most N - 1 edges

Operations

Tree operations vary in complexity based on the tree's balance. A balanced tree, where nodes are evenly distributed, allows for operations in O(log n) time. In contrast, unbalanced trees, resembling a linked list, increase operation complexity to O(n) since more nodes need to be traversed.


Let's take a look at the time and space complexities of operations supported by a balanced tree:

OperationsTimeSpace
InsertO(log n)O(1)
DeleteO(log n)O(1)
SearchO(n)O(1)
DepthO(n)O(log n)

The rationale behind the O(log n) complexity is that the binary tree is balanced. At the root, the entire dataset of n nodes is visible. As you move to the next level (left or right child) the number of visible nodes halves to n/2. This halving process continues with each level descended, essentially narrowing down the dataset.

Tree Terminology

At the heart of a trees is the node, an individual element that holds data and can have links to left or right children, creating a system of relationships. The root is the topmost node, serving as the origin from which all other nodes descend.


This hierarchical structure ensures that each node (except the root) has a single parent node it directly descends from, while any node can be a parent as long as it has at least one child node connected to it. Nodes without any children are called leaves, marking the bottom of the tree.


The structure of a tree can be measured by the depth and height. Depth is the number of edges on the path from the root to a specific node, helping identify the level of that node within the overall structure. Height measures the longest path from a node down to a leaf, with the height of the main tree being the height of the root node.


The diameter of a tree represents the longest path between any two nodes in the tree. This path may or may not pass through the root. Calculating the diameter involves finding the height of left and right subtrees for each node and adding them together to find the maximum path length across the tree.


Subtrees represent smaller sections of the main tree. Each subtree starts with a node and consists of all its descendants. Subtrees are trees themselves, which makes recursion the best way to traverse trees.

Tree Traversal

Traversing a tree involves visiting all the nodes of the tree and performing an operation (like printing the node). The major tree traversal algorithms are Preorder, Inorder, and Postorder which use Depth-First Search (DFS). Level Order Traversal uses Breadth-First Search (BFS) since we can only process nodes one level at a time.


Preorder Traversal is defined by visiting each node before its children. The process visits the nodes in a root-left-right sequence. This approach is useful for creating a copy of the tree or representing the tree in a way that reflects the hierarchy of operations in an expression.


Inorder Traversal visits the nodes in a left-root-right sequence. This means that for every node, the traversal first visits the left child, then the node itself, and finally the right child. Inorder traversal is particularly useful for binary search trees (BSTs), as it retrieves the nodes in their sorted order.


Postorder Traversal is processing the root node after its children. The traversal sequence is left-right-root, giving it the name "post" order traversal. Postorder is great for when you need to delete nodes and resources since it ensures that child nodes are processed before their respective parent nodes.


Level Order Traversal, also known as Breadth-First Search (BFS), visits nodes level by level, starting from the root. This method traverses the tree top-down and is implemented using a queue. Level order traversal is beneficial for finding the shortest path or processing a tree in layers.

Binary Tree

Before we jump into the implementation of a binary tree, let's summarize what we have gone over. A binary tree is a tree data structure where each node has at most two children, referred to as the left child and the right child.


Binary trees have no cycles, and if there are N nodes then trees can have at most N - 1 edges. We start from the root, the top-most node, and traverse our way down using different traversal methods. Let's take a look at one implementation of a binary tree:


The main idea of the example above is using Depth-First Search (DFS) to explore each node, calculating the sum of paths from leaf to root. The important part is using a non-local variable (res), which keeps track of the maximum path sum seen during the traversal. The DFS function's primary role is to update res with the highest sum of any path found. The actual return value of DFS is only to help calculate the path sum.

Binary Search Tree

A Binary Search Tree (BST) is a special kind of binary tree that uses a sorted arrangement of nodes. Every node on the left subtree (the left child) has a value smaller than its parent node. Meanwhile, every node on the right side (the right child) has a value larger than its parent.


This sorted property increases the efficiency of search, insertion, and deletion operations. By allowing these operations to skip over half of the tree, it mirrors the principles of binary search, speeding up data management tasks.


Inorder traversal is important here because it naturally processes nodes in ascending order, given the properties of a BST where left children are smaller and right children are larger than their parent nodes. Using nonlocal variables k and res is crucial since k is decremented with each visited node, acting as a countdown to the kth element. Once k reaches zero, it means that we've reached the kth smallest element, and res is updated with the node's value.

Advanced Trees

Advanced trees, such as AVL trees, Red-Black trees, and B-trees, are designed to provide efficient search, insert, and delete operations by automatically maintaining tree balance. While they aren't common in coding interviews, it's still interesting to learn about these trees. Let's take a quick look at them to see what makes them unique:


AVL trees, named after their inventors, are a type of self-balancing binary search tree. Each node in an AVL tree maintains an extra factor — the height balance, ensuring that the difference between the heights of the left and right subtrees is no more than one. This balancing property allows AVL trees to guarantee O(log n) time complexity for search, insertion, and deletion operations, making them highly efficient for databases and lookup tables.


Red-Black trees are another form of self-balancing binary search tree, where each node has an additional color attribute (red or black). These trees have specific rules for the colors of nodes and their children, ensuring that thetree remains balanced after each insertion or deletion. This balance is less strict than AVL trees, allowing slightly faster insertions and deletions at the cost of slightly slower lookups.


B-trees are trees than can have more than two children, making them ideal for large blocks of data. A B-tree maintains balance by keeping the number of keys within each node in a specific range, allowing efficient insertion, deletion, and search operations.

Best Practices

Trees are versatile data structures used in various algorithms and applications. They are particularly useful for hierarchical data representation and efficient searching. Here are some essential tips and tricks for mastering trees:


Always check if the head/root node is empty right at the start, before you do anything else with your tree. Although this may seem redundant, it helps avoid edge cases and ensures things work properly (BFS doesn't work without initial base case). It's a simple yet easy way for ensuring your tree algorithms are efficient and error-free.

Copyright © StudyDSA. All rights reserved.