Connect every node of a weighted graph using the least total edge weight, and you're building a minimum spanning tree: exactly V - 1 edges, no cycles, every node touched, minimum total cost. The search space is enormous (a complete graph on just 10 nodes hides 100,000,000 spanning trees), and Prim's algorithm cuts through all of it greedily: grow one connected tree outward from any starting node, always absorbing the cheapest edge that crosses from inside the tree to outside. With a heap serving the cheapest crossing edge, the whole build runs in O(E log V).
Unlike most greedy algorithms, this one's proof obligation comes pre-paid by a single theorem that also powers its sibling Kruskal, and it's worth meeting before the mechanics.
Split a graph's nodes into any two non-empty groups, and call every edge with one end in each group a crossing edge. The cut property says the lightest crossing edge always belongs to a minimum spanning tree, and the proof is one swap. Suppose some MST skipped that lightest edge. Adding it creates a cycle, the cycle must cross the split somewhere else over a heavier edge, and trading the heavy crossing for the light one yields a cheaper spanning tree. A cheaper tree than the minimum is a contradiction, so the lightest crossing edge was always safe to take.
Prim's entire strategy is applying that theorem to one specific split, over and over: the tree built so far versus everything else. Each absorption redraws the cut, the theorem blesses the next cheapest crossing, and V - 1 blessings later the tree is provably minimal. No backtracking, no second-guessing, just one theorem on repeat.
Trace it on five nodes with edges A-B = 2, A-D = 6, B-C = 3, B-D = 8, B-E = 5, C-E = 7, D-E = 9. Starting at A, the crossing edges are A-B and A-D, so the tree absorbs B for 2. From {A, B} the cheapest crossing is B-C at 3. Now the tree is {A, B, C}, and the crossings are A-D at 6, B-D at 8, B-E at 5, and C-E at 7. Predict the next absorption.
E joins through B-E at 5, and the final round takes A-D at 6: four edges, total 2 + 3 + 5 + 6 = 16, and no other spanning tree of this graph beats it. Notice what the blob never did: it never started a second island, and it never reconsidered an edge. One connected tree, growing outward, cheapest crossing first.
import heapq
def prim(graph: dict[int, list[tuple[int, int]]], start: int) -> int:
# graph[node] holds (neighbor, weight) pairs; returns MST weight or -1
visited = set()
heap = [(0, start)]
total = 0
while heap:
weight, node = heapq.heappop(heap)
# Lazy heap: skip nodes the tree already swallowed
if node in visited:
continue
visited.add(node)
total += weight
# Offer every edge crossing from the tree to the outside
for neighbor, edge_weight in graph[node]:
if neighbor not in visited:
heapq.heappush(heap, (edge_weight, neighbor))
# Fewer than all nodes reached means the graph is disconnected
return total if len(visited) == len(graph) else -1You might think the minimum spanning tree also gives you cheap routes between nodes, since it connected everything as cheaply as possible. It optimizes the wrong number for that. Take a triangle with A-B = 3, A-C = 3, B-C = 4. The MST keeps both edges that cost 3 and drops B-C, minimizing total wire at 6. But now traveling from B to C through the tree costs 6, against the direct edge's 4 that got thrown away.
Total construction cost and point-to-point travel cost are different objectives with different algorithms: the MST for the first, Dijkstra for the second. Interview questions about "minimum cost to connect" want the tree. Questions about "cheapest path from X to Y" never do.
0 and move on.(weight, node) entries for edges leading from inside the tree to outside it. Each newly absorbed node offers all its outward edges to the heap.V nodes has exactly V - 1 edges and touches every node. Finishing with fewer visited nodes means the graph was never connected, and the honest answer is "impossible".Prim shines when the graph is dense. Min Cost to Connect All Points is the signature case: every pair of points is implicitly connected by its Manhattan distance, making a complete graph where Kruskal would have to sort n(n - 1) / 2 edges while Prim just grows the blob. The general selection rule inside the MST family: adjacency structure and density favor Prim, a flat edge list and sparsity favor Kruskal, and both produce the same total weight every time.
Recognition from the problem statement is straightforward: "connect all X for minimum cost" with no source or destination mentioned. The moment a specific start and end appear, the problem left MST territory.
The mechanics are Dijkstra-adjacent, and so are the pitfalls:
Two checks first: what does the cut property promise about the lightest crossing edge, and why does the visited check belong at the pop rather than the push? The MST problem set is shared territory with Kruskal, so solve each one both ways:
Completed | Title | Status | Notes | Solution | |
|---|---|---|---|---|---|
Medium |