Trees

Hierarchical data structure for representing relationships.

Definition

A tree is a hierarchical data structure made of nodes, connected by lines called edges. It starts with a single root node at the top. Each node can have zero or more child nodes beneath it, forming distinct levels. Trees don't contain any cycles, which means there's only one way to get from the root to any other node. This makes trees great for representing hierarchy, such as file systems or family trees, allowing for efficient searching and data organization.

Operations

Tree operations work differently depending on how the tree is structured. In a balanced tree, operations like inserting, deleting, and searching usually take O(log n) time, where n is the number of nodes. This is efficient because we can navigate the tree's levels quickly. But in an unbalanced tree, these actions might take up to O(n) time in the worst case, as we might need to check every node. Let's look at common operations take in a balanced binary tree:

OperationsTimeSpace
InsertO(n)O(h)
DeleteO(n)O(h)
SearchO(n)O(h)
Get HeightO(n)O(h)
Pre-orderO(n)O(h)
In-orderO(n)O(h)
Post-orderO(n)O(h)
Level-orderO(n)O(w)

The O(log n) complexity in balanced binary trees comes from their structure. At the root, all nodes are accessible. As we move down each level, the number of nodes we need to consider is halved, from n to n/2. The number of times we can halve n before reaching 1 is actually the definition of log₂(n). This narrowing process allows us to quickly focus on a small portion of the tree, resulting in logarithmic time complexity for many operations.

Tree Nodes

Tree nodes are the fundamental building blocks of a tree. For example, each node in a binary tree typically contains three main components:

  1. The val holds the actual data you want to store, such as a number or a string
  2. The left pointer refers to the left child node
  3. The right pointer refers to the right child node

If a node has no children, its left and right pointers are set to null. This structure allows us to create parent-child relationships between nodes, forming the tree structure:

Tree Nodes

# Definition of a binary tree node
class TreeNode:
    def __init__(self, val: int = 0, left = None, right = None):
        self.val = val
        self.left = left
        self.right = right
Tree Structure

A tree is composed of nodes connected by edges, starting with a single root node at the top and branching out to child nodes below. This organization creates parent-child relationships and levels, with key elements including:

  1. Root: The topmost node of the tree, with no parent
  2. Parent: A node with one or more child nodes below it
  3. Child: A node connected to a parent node above it
  4. Leaf: A node with no children, found at the bottom of the tree
  5. Subtree: A small section of the tree that forms a tree within the larger tree

Trees can have varying structures depending on their specific type and use case. For instance, a node in a general tree can have any number of children, not just two. The structure of a tree affects its properties and performance. For example, in some specialized trees, maintaining a balanced structure (where subtrees have similar depths) can lead to more efficient operations. The specific benefits and characteristics depend on the type of tree being used.

Binary Tree

A binary tree is a hierarchical data structure where each tree node can have up to two children: a left child and a right child. The two fundamental rules of binary trees are:

Binary Tree Rules

  1. No cycles

    Binary trees cannot have any cycles. This means that there's only one path from the root to any node. You can't start at a node, follow some edges, and end up back at the same node. This rule ensures that the tree structure is maintained and prevents infinite loops when traversing the tree.

  2. N - 1 Edges

    A binary tree containing N nodes has exactly N - 1 edges. This rule comes from the fact that every node (except the root) has exactly one edge connecting it to its parent. Since there are N nodes and all but one (the root) have a parent, there must be N - 1 edges.

Binary trees are fundamental to many advanced tree structures like Binary Search Trees (BST) and AVL trees. They're used in various applications including expression parsing, data compression, and for efficient searching and sorting algorithms. Binary trees have several special types, each with unique properties:

  • Full Binary Tree: Each node has either 0 or 2 children
  • Complete Binary Tree: All levels are fully filled except possibly the last, which is filled from left to right
  • Perfect Binary Tree: All parent nodes have two children and all leaves are at the same level
  • Balanced Binary Tree: The height of the left and right subtrees of every node differs by at most one

The efficiency of operations on a binary tree often depends on its balance. In a balanced tree, operations like search, insert, and delete typically have a time complexity of O(log n), where n is the number of nodes. However, in the worst case (a completely unbalanced tree), these operations can degrade to O(n) time complexity.

Binary Search Tree

A Binary Search Tree (BST) is a special binary tree with a unique property: for every tree node, all values in its left subtree are smaller, and all values in its right subtree are larger than the node's own value. This rule applies to every node in the tree, creating an ordered structure that's highly efficient for searching, inserting, and deleting values.

Binary Search Tree Rules

  1. Ordered Values

    For any node, all values in its left subtree are smaller, and all values in its right subtree are larger. This ordering allows for efficient searching and maintains the tree's structure.

  2. No Duplicate Values

    Binary search trees typically don't allow duplicate values. If duplicates are needed, they're usually handled by adding extra information to nodes or by defining a consistent rule for placement.

The ordered structure of BSTs allows for fast searching, insertion, and deletion of nodes, typically in O(log n) time on average, where n is the number of nodes in the tree. BSTs also support in-order traversal, which visits nodes in ascending order of their values, useful for tasks like sorting:

OperationsTimeSpace
InsertO(h)*O(1)
DeleteO(h)*O(1)
SearchO(h)*O(1)
Get MinimumO(h)*O(1)
Get MaximumO(h)*O(1)
In-orderO(n)O(h)

It's important to remember that these O(log n) time complexities are average-case scenarios. In the worst case, when the tree becomes unbalanced or skewed, the time complexity can reduce to O(n), similar to a linked list. To maintain balance and ensure O(log n) performance, self-balancing trees like AVL and Red-Black were developed.

Tree Depth & Height

The depth of a node in a binary tree is the number of edges from the root to that node. In simpler terms, it's how many steps you need to take from the top of the tree to reach the node. The root node has a depth of 0, its immediate children have a depth of 1, and so on. As we move down the tree, the depth increases by 1 at each level.


The height of a binary tree is the number of edges on the longest path from the root to a leaf. It's the maximum number of steps from the top of the tree to the furthest leaf. A tree with a single node has a height of 0, while an empty tree is typically defined to have a height of -1. The height of the tree is equal to the maximum depth of any node in the tree.

Binary Tree Depth

# Definition for a binary tree node
class TreeNode:
    def __init__(self, val: int = 0, left = None, right = None):
        self.val = val
        self.left = left
        self.right = right

class Solution:
    def maxDepth(self, root: Optional[TreeNode]) -> int:
        # Base case: If there is no root, just return 0
        if not root:
            return 0
        
        # Setting up DFS on tree
        def dfs(root: Optional[TreeNode]) -> int:
            # Base case: If root doesn't exist, then return 0
            if not root:
                return 0
            
            left = dfs(root.left)
            right = dfs(root.right)

            return max(left, right) + 1
        
        # Call the DFS function
        return dfs(root)

To calculate the depth and height of a binary tree, we use a post-order traversal approach. We start at the bottom of the tree and work our way up. For each node, we find the maximum depth of its left and right children, then add 1 to account for the current node. This way, we build up the depth information from the leaves to the root, ensuring we capture the maximum depth (which is the height of the tree) along the way.

Binary Tree Diameter

The diameter of a binary tree is the number of nodes on the longest path between any two nodes in the tree. This path may or may not pass through the root. In other words, it's the maximum distance between any two leaf nodes. For a tree with a single node, the diameter is 1, and for an empty tree, it's typically defined as 0.

Tree Diameter

# Definition for a binary tree node
class TreeNode:
    def __init__(self, val: int = 0, left = None, right = None):
        self.val = val
        self.left = left
        self.right = right

class Solution:
    def diameterOfBinaryTree(self, root: Optional[TreeNode]) -> int:
        # Set up nonlocal variable res to store answer
        res = 0

        # Setting up DFS on binary tree
        def dfs(root: Optional[TreeNode]) -> int:
            # Base case: If there is no root, then return 0
            if not root:
                return 0
            
            # DFS function calls for left and right children
            left = dfs(root.left)
            right = dfs(root.right)

            # If value of left + right children is greater, update res
            nonlocal res
            if (left + right) > res:
                res = (left + right)

            return 1 + max(left, right)
        
        # Call DFS function to store answer in res
        dfs(root)
        return res

To find the diameter, we need to identify the two nodes that are farthest apart so we use a post-order traversal approach. Our recursive function computes two values at each node: the height of the current subtree and the maximum diameter found so far. For each node, we calculate the path length by adding the heights of its left and right subtrees plus 1 for the current node. This recursive method efficiently finds the diameter in a single pass through the tree, examining all possible paths to find the longest one without needing to store or revisit nodes.

Best Practices

Trees are versatile data structures used in various algorithms and applications. They are particularly useful for hierarchical data representation and efficient searching. Here are some essential tips and tricks for mastering trees:


Always check if the head/root node is empty right at the start, before you do anything else with your tree. Although this may seem redundant, it helps avoid edge cases and ensures things work properly (BFS doesn't work without initial base case). It's a simple yet easy way for ensuring your tree algorithms are efficient and error-free.

Copyright © StudyDSA. All rights reserved.