AlgorithmsPrim's Algorithm

Prim's Algorithm

A greedy algorithm for finding the minimum spanning tree of a weighted graph.

Definition

Connect every node of a weighted graph using the least total edge weight, and you're building a minimum spanning tree: exactly V - 1 edges, no cycles, every node touched, minimum total cost. The search space is enormous (a complete graph on just 10 nodes hides 100,000,000 spanning trees), and Prim's algorithm cuts through all of it greedily: grow one connected tree outward from any starting node, always absorbing the cheapest edge that crosses from inside the tree to outside. With a heap serving the cheapest crossing edge, the whole build runs in O(E log V).

Unlike most greedy algorithms, this one's proof obligation comes pre-paid by a single theorem that also powers its sibling Kruskal, and it's worth meeting before the mechanics.

The Cut Property

Split a graph's nodes into any two non-empty groups, and call every edge with one end in each group a crossing edge. The cut property says the lightest crossing edge always belongs to a minimum spanning tree, and the proof is one swap. Suppose some MST skipped that lightest edge. Adding it creates a cycle, the cycle must cross the split somewhere else over a heavier edge, and trading the heavy crossing for the light one yields a cheaper spanning tree. A cheaper tree than the minimum is a contradiction, so the lightest crossing edge was always safe to take.

Prim's entire strategy is applying that theorem to one specific split, over and over: the tree built so far versus everything else. Each absorption redraws the cut, the theorem blesses the next cheapest crossing, and V - 1 blessings later the tree is provably minimal. No backtracking, no second-guessing, just one theorem on repeat.

Growing the Blob

Trace it on five nodes with edges A-B = 2, A-D = 6, B-C = 3, B-D = 8, B-E = 5, C-E = 7, D-E = 9. Starting at A, the crossing edges are A-B and A-D, so the tree absorbs B for 2. From {A, B} the cheapest crossing is B-C at 3. Now the tree is {A, B, C}, and the crossings are A-D at 6, B-D at 8, B-E at 5, and C-E at 7. Predict the next absorption.

E joins through B-E at 5, and the final round takes A-D at 6: four edges, total 2 + 3 + 5 + 6 = 16, and no other spanning tree of this graph beats it. Notice what the blob never did: it never started a second island, and it never reconsidered an edge. One connected tree, growing outward, cheapest crossing first.

prim's_algorithm.py

import heapq

def prim(graph: dict[int, list[tuple[int, int]]], start: int) -> int:
    # graph[node] holds (neighbor, weight) pairs; returns MST weight or -1
    visited = set()
    heap = [(0, start)]
    total = 0

    while heap:
        weight, node = heapq.heappop(heap)

        # Lazy heap: skip nodes the tree already swallowed
        if node in visited:
            continue

        visited.add(node)
        total += weight

        # Offer every edge crossing from the tree to the outside
        for neighbor, edge_weight in graph[node]:
            if neighbor not in visited:
                heapq.heappush(heap, (edge_weight, neighbor))

    # Fewer than all nodes reached means the graph is disconnected
    return total if len(visited) == len(graph) else -1

What an MST Is Not

You might think the minimum spanning tree also gives you cheap routes between nodes, since it connected everything as cheaply as possible. It optimizes the wrong number for that. Take a triangle with A-B = 3, A-C = 3, B-C = 4. The MST keeps both edges that cost 3 and drops B-C, minimizing total wire at 6. But now traveling from B to C through the tree costs 6, against the direct edge's 4 that got thrown away.

Total construction cost and point-to-point travel cost are different objectives with different algorithms: the MST for the first, Dijkstra for the second. Interview questions about "minimum cost to connect" want the tree. Questions about "cheapest path from X to Y" never do.

Prim: Step-by-Step

1. Seed the Tree With Any Node

The starting node is genuinely arbitrary: the cut property holds for whatever cut the growing tree defines, so every start produces a minimum spanning tree. Grab node 0 and move on.

2. Track Every Edge Crossing the Frontier

A min-heap holds (weight, node) entries for edges leading from inside the tree to outside it. Each newly absorbed node offers all its outward edges to the heap.

3. Absorb Along the Cheapest Crossing

Pop the lightest crossing edge and annex the node on its far end. The cut property guarantees this edge belongs to some minimum spanning tree, so the greed never needs revisiting.

4. Skip Edges Whose Far End Got Absorbed

Lazy heaps keep stale entries: by the time an edge pops, its target may already be inside the tree. The visited check at the pop is mandatory, or nodes get counted twice.

5. Count Nodes to Catch Disconnection

A spanning tree of V nodes has exactly V - 1 edges and touches every node. Finishing with fewer visited nodes means the graph was never connected, and the honest answer is "impossible".

Common Patterns

Prim shines when the graph is dense. Min Cost to Connect All Points is the signature case: every pair of points is implicitly connected by its Manhattan distance, making a complete graph where Kruskal would have to sort n(n - 1) / 2 edges while Prim just grows the blob. The general selection rule inside the MST family: adjacency structure and density favor Prim, a flat edge list and sparsity favor Kruskal, and both produce the same total weight every time.

Recognition from the problem statement is straightforward: "connect all X for minimum cost" with no source or destination mentioned. The moment a specific start and end appear, the problem left MST territory.

Best Practices

The mechanics are Dijkstra-adjacent, and so are the pitfalls:

Checking only before pushing isn't enough: two frontier nodes can both offer edges to the same outside node, and both entries enter the heap. Whichever pops second must be discarded, or the "tree" gains a cycle and the total double-counts.

Practice Problems

Two checks first: what does the cut property promise about the lightest crossing edge, and why does the visited check belong at the pop rather than the push? The MST problem set is shared territory with Kruskal, so solve each one both ways:

AlgorithmsPrim's Algorithm

Prim's Algorithm

A greedy algorithm for finding the minimum spanning tree of a weighted graph.

AlgorithmsPrim's Algorithm

Prim's Algorithm

A greedy algorithm for finding the minimum spanning tree of a weighted graph.

Definition

Unlike most greedy algorithms, this one's proof obligation comes pre-paid by a single theorem that also powers its sibling Kruskal, and it's worth meeting before the mechanics.

The Cut Property

Growing the Blob

prim's_algorithm.py

import heapq

def prim(graph: dict[int, list[tuple[int, int]]], start: int) -> int:
    # graph[node] holds (neighbor, weight) pairs; returns MST weight or -1
    visited = set()
    heap = [(0, start)]
    total = 0

    while heap:
        weight, node = heapq.heappop(heap)

        # Lazy heap: skip nodes the tree already swallowed
        if node in visited:
            continue

        visited.add(node)
        total += weight

        # Offer every edge crossing from the tree to the outside
        for neighbor, edge_weight in graph[node]:
            if neighbor not in visited:
                heapq.heappush(heap, (edge_weight, neighbor))

    # Fewer than all nodes reached means the graph is disconnected
    return total if len(visited) == len(graph) else -1

What an MST Is Not

Prim: Step-by-Step

1. Seed the Tree With Any Node

The starting node is genuinely arbitrary: the cut property holds for whatever cut the growing tree defines, so every start produces a minimum spanning tree. Grab node 0 and move on.

2. Track Every Edge Crossing the Frontier

A min-heap holds (weight, node) entries for edges leading from inside the tree to outside it. Each newly absorbed node offers all its outward edges to the heap.

3. Absorb Along the Cheapest Crossing

Pop the lightest crossing edge and annex the node on its far end. The cut property guarantees this edge belongs to some minimum spanning tree, so the greed never needs revisiting.

4. Skip Edges Whose Far End Got Absorbed

Lazy heaps keep stale entries: by the time an edge pops, its target may already be inside the tree. The visited check at the pop is mandatory, or nodes get counted twice.

5. Count Nodes to Catch Disconnection

A spanning tree of V nodes has exactly V - 1 edges and touches every node. Finishing with fewer visited nodes means the graph was never connected, and the honest answer is "impossible".

Common Patterns

Best Practices

The mechanics are Dijkstra-adjacent, and so are the pitfalls:

Practice Problems

Prim's Algorithm

A greedy algorithm for finding the minimum spanning tree of a weighted graph.

prim's_algorithm.py

1. Seed the Tree With Any Node

2. Track Every Edge Crossing the Frontier

3. Absorb Along the Cheapest Crossing

4. Skip Edges Whose Far End Got Absorbed

5. Count Nodes to Catch Disconnection

Verify the Visited Check Sits at the Pop

Go Dist-Array on Dense Graphs

Remember It Only Speaks Undirected

Expect Ties to Produce Different Trees

Build the Complete Graph Lazily

Prim's Algorithm

A greedy algorithm for finding the minimum spanning tree of a weighted graph.

Prim's Algorithm

A greedy algorithm for finding the minimum spanning tree of a weighted graph.

prim's_algorithm.py

1. Seed the Tree With Any Node

2. Track Every Edge Crossing the Frontier

3. Absorb Along the Cheapest Crossing

4. Skip Edges Whose Far End Got Absorbed

5. Count Nodes to Catch Disconnection

Verify the Visited Check Sits at the Pop

Go Dist-Array on Dense Graphs

Remember It Only Speaks Undirected

Expect Ties to Produce Different Trees

Build the Complete Graph Lazily