AlgorithmsKruskal's Algorithm

Kruskal's Algorithm

A greedy algorithm that builds a minimum spanning tree by adding edges in order of weight.

Definition

Prim builds a minimum spanning tree by growing one blob outward. Kruskal gets the identical result from the opposite direction: sort all the edges once, then sweep them cheapest-first, keeping every edge that connects two separate fragments and rejecting every edge that would close a cycle. The work-in-progress is a forest of islands that merge into one tree, and the total cost is O(E log E), almost all of it the sort.

The correctness comes from the same cut property that licenses Prim: each accepted edge is the lightest one crossing the cut between its two fragments and everything else, so the greed is safe at every pick. What's genuinely new here is the machinery for the cycle check.

Bridges Between Islands

Run it on the same five-node graph from the Prim page: A-B = 2, B-C = 3, B-E = 5, A-D = 6, C-E = 7, B-D = 8, D-E = 9, already listed in sorted order. The sweep takes A-B, takes B-C, takes B-E, and takes A-D, at which point four edges connect all five nodes. Predict the fate of C-E at 7 before reading on.

Rejected: C and E already live in the same fragment (connected through B), so the edge would close the cycle C-B-E-C and add weight for nothing. The final tree weighs 2 + 3 + 5 + 6 = 16, the exact tree Prim built on this graph, which is no accident: when all edge weights are distinct, the minimum spanning tree is unique, and every correct algorithm must find it.

Cycle Checks in Near-Constant Time

The sweep asks one question per edge: are these two endpoints already connected through the fragments built so far? Walking the forest to answer would cost O(V) per edge. Union-find answers it in effectively constant time: every node points toward its fragment's root, two nodes share a fragment exactly when they share a root, and merging fragments is one pointer assignment between roots.

The speed depends on keeping those pointer chains short, which is what path compression does: while walking up to find a root, repoint each visited node higher up the chain. With compression in place, the amortized cost per operation involves the inverse Ackermann function, a value below 5 for any input that fits in this universe, and the cycle checks effectively vanish from the complexity budget.

kruskal's_algorithm.py

def kruskal(n: int, edges: list[tuple[int, int, int]]) -> int:
    # edges hold (weight, node_a, node_b); returns MST weight or -1
    parent = list(range(n))

    def find(node: int) -> int:
        # Path compression: point at the grandparent while walking up
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    total = 0
    used = 0

    # Cheapest bridges first
    for weight, a, b in sorted(edges):
        root_a, root_b = find(a), find(b)

        # Same root means this edge would close a cycle
        if root_a == root_b:
            continue

        parent[root_a] = root_b
        total += weight
        used += 1

        # A spanning tree is complete at n - 1 edges
        if used == n - 1:
            return total

    return -1

Prim or Kruskal

You might think one of the two must dominate the other, since they solve the identical problem. They split the territory by input shape instead. Kruskal wants a flat edge list and shines on sparse graphs, since its cost follows the edge count. Prim wants adjacency structure and shines on dense graphs, where it skips sorting E ≈ V² edges. With distinct weights they return the same tree, with ties they may differ edge-by-edge, and the total weight always matches to the digit.

One boundary they share: both assume undirected edges. Both also assume connectivity, and Kruskal's failure mode is the more graceful one, since its union counter ends below V - 1 and announces the graph was never connectable.

Kruskal: Step-by-Step

Sort, sweep, union. The details that make it correct:

1. Sort Every Edge Once

The sort is the algorithm's entire cost ceiling: O(E log E). Everything after it is a single cheap pass, so resist any urge to re-sort, re-scan, or use a heap.

2. Start With V Single-Node Islands

Union-find begins with every node as its own parent. Unlike Prim's single growing tree, Kruskal's work-in-progress is a forest of fragments, and that's healthy, not broken.

3. Bridge Only Different Roots

For each edge, find both endpoints' roots. Different roots means the edge connects two separate fragments: take it. The same root means both ends already touch, and the edge would close a cycle.

4. Union the Fragments and Count

Point one root at the other and increment the edge counter. The counter doubles as a connectivity meter: every union means one fewer fragment in the forest.

5. Stop at V Minus One Bridges

A spanning tree holds exactly V - 1 edges, so the moment the counter gets there, every remaining edge is a guaranteed reject and the loop can exit early.

Common Patterns

The MST problem set runs on costumes over the same body. Connecting Cities is the formula plus a -1 disconnection check. Optimize Water Distribution hides the formula behind wells: add a virtual node 0 where building a well in a house becomes an edge from that house to the virtual source, and the choice between wells and pipes resolves itself in the sweep. Find Critical and Pseudo-Critical Edges runs the algorithm repeatedly, once excluding and once force-including each edge, and reads each edge's role off the resulting totals.

Past MSTs, the sort-then-union sweep is its own reusable pattern: any time events should merge groups in cost or time order, Kruskal's skeleton with a different sort key does the job.

Best Practices

Five habits, most of them about the union-find underneath:

Naive union-find lets parent chains grow into long tails, dragging every cycle check toward O(n). Path compression (pointing nodes at their grandparents while walking up) keeps finds effectively constant, and it costs one line.

Practice Problems

Two checks first: what does sharing a union-find root prove about two nodes, and why is every edge after union number V - 1 a guaranteed reject? Same problem set as Prim by design, so solve each one with both algorithms and feel where each shines:

AlgorithmsKruskal's Algorithm

Kruskal's Algorithm

A greedy algorithm that builds a minimum spanning tree by adding edges in order of weight.

AlgorithmsKruskal's Algorithm

Kruskal's Algorithm

A greedy algorithm that builds a minimum spanning tree by adding edges in order of weight.

Definition

Bridges Between Islands

Cycle Checks in Near-Constant Time

kruskal's_algorithm.py

def kruskal(n: int, edges: list[tuple[int, int, int]]) -> int:
    # edges hold (weight, node_a, node_b); returns MST weight or -1
    parent = list(range(n))

    def find(node: int) -> int:
        # Path compression: point at the grandparent while walking up
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    total = 0
    used = 0

    # Cheapest bridges first
    for weight, a, b in sorted(edges):
        root_a, root_b = find(a), find(b)

        # Same root means this edge would close a cycle
        if root_a == root_b:
            continue

        parent[root_a] = root_b
        total += weight
        used += 1

        # A spanning tree is complete at n - 1 edges
        if used == n - 1:
            return total

    return -1

Prim or Kruskal

Kruskal: Step-by-Step

Sort, sweep, union. The details that make it correct:

1. Sort Every Edge Once

The sort is the algorithm's entire cost ceiling: O(E log E). Everything after it is a single cheap pass, so resist any urge to re-sort, re-scan, or use a heap.

2. Start With V Single-Node Islands

Union-find begins with every node as its own parent. Unlike Prim's single growing tree, Kruskal's work-in-progress is a forest of fragments, and that's healthy, not broken.

3. Bridge Only Different Roots

For each edge, find both endpoints' roots. Different roots means the edge connects two separate fragments: take it. The same root means both ends already touch, and the edge would close a cycle.

4. Union the Fragments and Count

Point one root at the other and increment the edge counter. The counter doubles as a connectivity meter: every union means one fewer fragment in the forest.

5. Stop at V Minus One Bridges

A spanning tree holds exactly V - 1 edges, so the moment the counter gets there, every remaining edge is a guaranteed reject and the loop can exit early.

Common Patterns

Past MSTs, the sort-then-union sweep is its own reusable pattern: any time events should merge groups in cost or time order, Kruskal's skeleton with a different sort key does the job.

Best Practices

Five habits, most of them about the union-find underneath:

Practice Problems

Kruskal's Algorithm

A greedy algorithm that builds a minimum spanning tree by adding edges in order of weight.

kruskal's_algorithm.py

1. Sort Every Edge Once

2. Start With V Single-Node Islands

3. Bridge Only Different Roots

4. Union the Fragments and Count

5. Stop at V Minus One Bridges

Compress Paths or Pay Later

Read the Counter as a Verdict

Prefer It When Edges Arrive as a List

Expect the Same Total as Prim, Always

Keep the Forest Trick for Non-MST Problems

Kruskal's Algorithm

A greedy algorithm that builds a minimum spanning tree by adding edges in order of weight.

Kruskal's Algorithm

A greedy algorithm that builds a minimum spanning tree by adding edges in order of weight.

kruskal's_algorithm.py

1. Sort Every Edge Once

2. Start With V Single-Node Islands

3. Bridge Only Different Roots

4. Union the Fragments and Count

5. Stop at V Minus One Bridges

Compress Paths or Pay Later

Read the Counter as a Verdict

Prefer It When Edges Arrive as a List

Expect the Same Total as Prim, Always

Keep the Forest Trick for Non-MST Problems