Socks before shoes, shirt before jacket, but socks versus shirt can go either way. Getting dressed is a dependency graph, and a topological sort is a linear ordering of a directed graph's nodes where every edge points forward, so nothing ever comes before something it depends on. Build systems compile modules in topological order, package managers install dependencies with it, and spreadsheets recalculate cells by it.
The one requirement is that the graph is a DAG. A cycle makes ordering impossible (each member waits on another forever), and the algorithm below doesn't just require acyclicity, it detects the violation for free, which is why "can these courses be finished" and "order these courses" are the same computation.
Kahn's algorithm runs on a countdown. Count each node's in-degree (how many prerequisites point at it), queue up everything at zero, and repeat: release a node into the output, subtract one from each of its dependents, and enqueue any dependent that just hit zero. Trace it on six courses with prerequisites 0→1, 0→2, 1→3, 2→3, 3→4, 3→5. The in-degrees read [0, 1, 1, 2, 1, 1], so only course 0 starts unblocked. Predict what its release unlocks.
Both 1 and 2 drop to zero and join the queue. Taking lowest-first, the full release order runs 0, 1, 2, 3, 4, 5, six nodes processed, six nodes total, valid order in hand. Now add the edge 4→1 and watch the cycle detection work: nodes 1, 3, and 4 wait on each other in a loop, none ever reaches in-degree zero, and the queue dies after releasing just {0, 2}. Two processed out of six is the whole cycle proof: no extra machinery, just a count that came up short.
from collections import deque
def topological_sort(n: int, edges: list[tuple[int, int]]) -> list[int]:
# edges hold (before, after) pairs; returns an order, or [] on a cycle
graph = {node: [] for node in range(n)}
in_degree = [0] * n
for before, after in edges:
graph[before].append(after)
in_degree[after] += 1
# Anything with no prerequisites can go first
queue = deque(node for node in range(n) if in_degree[node] == 0)
order = []
while queue:
node = queue.popleft()
order.append(node)
# Releasing a node unlocks whatever waited only on it
for neighbor in graph[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
# Fewer than n processed means a cycle held the rest hostage
return order if len(order) == n else []The second route to the same order uses depth-first search: run DFS, and push each node onto a stack at the moment it finishes, after all its descendants are fully explored. Reversing that stack yields a topological order, because a node only finishes once everything depending on it has already finished, putting dependencies late in finish order and early in reversed order.
Two details earn respect here. The reversal is mandatory, since raw finish order is dependency order backward. And cycle detection needs three node states rather than a boolean: meeting a node that's still on the current recursion path is a back edge and proves a cycle, while meeting a finished node is routine. In practice Kahn's is easier to get right under pressure, and its layer-by-layer structure answers more follow-up questions.
You might think a graph has the topological order. The example above has several: 1 and 2 share no edge, so they swap freely, and so do 4 and 5. Any pair of adjacent output nodes without an edge between them can trade places and stay valid. The order is unique only when the graph contains a path visiting every node, which forces every consecutive pair into sequence.
That looseness is a feature. The swappable nodes are exactly the tasks that can run in parallel, and processing Kahn's queue in BFS-style layers groups them: every node in a layer has all prerequisites satisfied by earlier layers. Course-scheduling problems that ask for the minimum number of semesters are asking for the layer count.
0 enters the queue immediately. A DAG always has at least one (follow arrows backward and you must run out), which is what gets the loop moving.0 just had its last prerequisite satisfied and joins the queue.n nodes means a valid order exists and you're holding it. Anything less means the leftovers form or feed a cycle, since none of them ever reached zero.Course Schedule asks only whether an order exists (the processed-count check), and Course Schedule II wants the order itself, same code, different return. Alien Dictionary hides the graph: compare adjacent words, and the first differing character pair is an edge in the unknown alphabet, after which the ordering is a plain topological sort. Parallel Courses III runs the sort with a clock, setting each course's finish time to its duration plus the latest prerequisite finish, which is the critical-path idea that DAG shortest paths builds on.
One lookalike deserves a flag: Minimum Height Trees peels leaves off an undirected tree layer by layer, which feels like Kahn's with degree 1 instead of in-degree 0. The resemblance is real and the mechanics transfer, but it's a peeling pattern on an undirected graph, not a topological sort, and conflating them muddies both.
The algorithm is short, and these five details decide whether it's correct:
prerequisites[i] = [a, b] means b comes before a, so the edge runs b → a. Flipping it produces orderings that look plausible and run everything backward, so check direction before anything else.Two checks first: why does a DAG always have at least one node with in-degree zero, and what does a processed count below n prove? The set runs feasibility, then ordering, then the hidden-graph variants:
Completed | Title | Status | Notes | Solution | |
|---|---|---|---|---|---|
Medium | |||||
Medium | |||||
Hard |