Binary search finds any value in a billion sorted elements in at most 30 looks, and every one of those looks leans on the same promise: the data is already in order. Real data almost never is. So before the fast search, before merging two datasets, before spotting duplicates in one pass, something has to put the elements in order first. A sorting algorithm is that something, a procedure that takes elements in any order and rearranges them into a defined one, usually smallest to largest.
And the sorts you learn in class aren't the ones your language actually runs. Python, Java, and JavaScript all ship hybrids built from classroom pieces, and C++ ships a quicksort that knows when it's losing and swaps strategies mid-sort. This page builds those pieces from scratch: insertion sort, the O(n²) baseline, then merge sort and quicksort, the two recursive workhorses that reach O(n log n) by giving up different things, and finally the hybrids themselves.
Sorting is really a whole family of algorithms, and each member makes a different trade between speed, memory, and order preservation. Every card below jumps to the section that covers that algorithm:
The sort you'd invent with a hand of cards. Quadratic in general, but unbeatable on input that's already nearly in order.
The other two classroom sorts: one scans for the minimum, one swaps neighbors. Both quadratic, neither ships anywhere.
Split down to single elements, then merge sorted halves. O(n log n) on every input, stable, and the bill is O(n) of scratch space.
Partition around a pivot and recurse on both sides. In place and fast on average, with a famous worst case.
Pull the max off a heap until nothing's left. Never quadratic, no extra memory, and almost always the backup plan.
Skip comparisons entirely. Tally how often each key appears and read the sorted result off the tally in O(n + k).
The merge sort that exploits the order your data already has. It's what Python and JavaScript actually run.
A quicksort that watches its own recursion depth and calls in heapsort before the worst case lands. C++'s std::sort.
Before building a single sort, you need the two words every comparison between sorts turns on: stable and in place. They sound like the same kind of virtue, and they're completely independent. Stable means equal elements keep their original relative order. Sort a spreadsheet of orders by date, then sort it again by customer with a stable sort, and each customer's orders stay in date order, since the second sort never undoes the first. In place means the sort works inside the input array with no more than O(log n) of bookkeeping, no second copy of the data. Merge sort is stable but not in place, quicksort is in place but not stable, and little insertion sort is both. Each algorithm below keeps one or both of these properties, and which one it keeps is usually why it gets picked.
You might think some undiscovered comparison sort is out there waiting to beat O(n log n) everywhere. There's a clean argument that none ever will. An array of n distinct elements can arrive in n! different orders, and a correct sort has to treat each order differently. Every comparison answers one yes or no question, which at best cuts the remaining possibilities in half, so distinguishing all 8! = 40,320 orderings of 8 elements takes at least 16 comparisons in the worst case, and in general at least log2(n!) of them, which grows as n log n. Any algorithm claiming better would be identifying one ordering out of n! without asking enough questions.
The escape hatch hides in the fine print: the bound only constrains sorts that gather information by comparing elements. Counting sort, further down this page, reads keys directly and beats the limit whenever keys are small integers.
Now here's the whole field side by side. The rest of the page walks each row, and the asterisk marks an average case:
| Operations | Time | Space |
|---|---|---|
| Insertion Sort | O(n²) | O(1) |
| Selection Sort | O(n²) | O(1) |
| Bubble Sort | O(n²) | O(1) |
| Merge Sort | O(n log n) | O(n) |
| Quicksort | O(n log n)* | O(log n) |
| Heapsort | O(n log n) | O(1) |
| Counting Sort | O(n + k) | O(k) |
Hand someone a messy pile of numbered cards and most people invent insertion sort on the spot: pick up the next card, slide it left past every bigger card, and drop it where it fits. The array version works the same way. The left side of the array becomes a sorted region, and each new element walks left through it until it finds its slot.
Watch it sort [5, 2, 4, 1]. The 2 walks past the 5 to give [2, 5, 4, 1], the 4 stops in front of the 5 for [2, 4, 5, 1], and the 1 walks past all three to finish [1, 2, 4, 5]. Five shifts in total, which sounds cheap. But the card analogy hides where the cost lives: cards slide aside for free, while an array has to copy every larger element one slot to the right to open the gap. Reverse-sorted input forces every element to walk the full distance, and at 1,000 elements that's 499,500 shifts. That triangle of work is what O(n²) is naming.
Now predict the opposite extreme before reading the code: how many shifts does insertion sort spend on input that's already sorted? The answer is zero: every element glances at its left neighbor once, sees nothing bigger, and stays put, so the whole pass costs O(n). Hold onto that fact, because it's the reason this slow sort survives inside every fast one.
def insertion_sort(nums: list[int]) -> list[int]:
# Walk the unsorted part one element at a time
for i in range(1, len(nums)):
current = nums[i]
j = i - 1
# Shift larger elements right to open a slot
while j >= 0 and nums[j] > current:
nums[j + 1] = nums[j]
j -= 1
# Drop the element into its open slot
nums[j + 1] = current
return numsTwo more quadratic sorts share the classroom with insertion sort, and both lose to it in practice. Selection sort scans the whole unsorted region for the minimum, swaps it into place, and repeats. That sounds thrifty, but the scan ignores any order the data already has, so it costs the full O(n²) even on sorted input, and the long-range swap can fling an element past its equals, so the standard version isn't stable either.
def selection_sort(nums: list[int]) -> list[int]:
n = len(nums)
for i in range(n - 1):
# Find the smallest element in the unsorted region
smallest = i
for j in range(i + 1, n):
if nums[j] < nums[smallest]:
smallest = j
# Swap it to the front of the unsorted region
nums[i], nums[smallest] = nums[smallest], nums[i]
return numsBubble sort compares neighbors and swaps every out-of-order pair, pass after pass, until a full pass finds nothing to do. The largest value "bubbles" to the end each round. Quick check: sorting the reversed [4, 3, 2, 1], how many swaps does it make in total? Every pair starts out of order, so all 3 + 2 + 1 = 6 of them, the same triangle of work insertion sort does in shifts. It's the sort everyone learns and nobody ships.
def bubble_sort(nums: list[int]) -> list[int]:
n = len(nums)
while n > 1:
swapped = False
for i in range(1, n):
# Swap any out-of-order neighbors
if nums[i - 1] > nums[i]:
nums[i - 1], nums[i] = nums[i], nums[i - 1]
swapped = True
# The largest value has bubbled into its final slot
n -= 1
# A full pass with no swaps means sorted
if not swapped:
break
return numsInsertion sort collapses at scale because elements travel one slot at a time. Merge sort never moves an element one slot at a time, and the whole algorithm is really just two observations stacked: an array with one element is already sorted, and two sorted halves can be merged into one sorted array in a single pass.
Take [38, 27, 43, 3, 9, 82, 10]. Split it into [38, 27, 43, 3] and [9, 82, 10], split those again, and keep going until every piece holds a single element. Then the merging starts and the pieces build back up: the left side becomes [3, 27, 38, 43] and the right becomes [9, 10, 82]. One merge remains, and each of its steps compares the front of one half against the front of the other and takes the smaller. Predict before reading on: which half supplies the first three elements of the final answer?
The left half gives up the 3, then the right half gives up both the 9 and the 10 before the left's 27 gets a turn. The comparing continues until the left half empties at [3, 9, 10, 27, 38, 43], and the leftover 82 is copied straight across. That copy step matters more than it looks: when one half runs out, the rest of the other half still has to come over, and forgetting it is the classic merge sort bug, because elements silently vanish from the output.
Now count the cost. Splitting produces about log n levels (the 7 elements above split 3 levels deep), and every level touches all n elements during its merges, so the total is O(n log n) on every input, sorted, reversed, or shuffled. The price is memory: a merge needs somewhere to lay elements down, so merge sort carries O(n) of scratch space. There's a quiet bonus in the tie-break, too. When both fronts are equal, the code takes from the left half, so equal elements never jump past each other: merge sort is stable.
def merge_sort(nums: list[int]) -> list[int]:
# Base case: one element is already sorted
if len(nums) <= 1:
return nums
# Split the array down the middle
mid = len(nums) // 2
left = merge_sort(nums[:mid])
right = merge_sort(nums[mid:])
# Merge the two sorted halves
merged = []
i = j = 0
while i < len(left) and j < len(right):
# Take from the left on ties to stay stable
if left[i] <= right[j]:
merged.append(left[i])
i += 1
else:
merged.append(right[j])
j += 1
# One side ran out: copy the leftovers of the other
merged.extend(left[i:])
merged.extend(right[j:])
return mergedMerge sort buys its guarantee with O(n) of extra memory. Quicksort keeps the divide-and-conquer shape but does all of its work inside the array, and its engine is the partition: pick one element, call it the pivot, and ask every other element a single question, is it smaller than the pivot or bigger?
Run that on [7, 2, 1, 6, 8, 5, 3, 4] with the last element, the 4, as the pivot. A boundary starts off the left edge and marks where the small side ends. The scan walks the array and swaps every element that's at most 4 behind the boundary: first the 2, then the 1, then the 3. When the scan ends, the pivot swaps into the slot just past the boundary, and the array reads [2, 1, 3, 4, 8, 5, 7, 6]. Look at what one pass bought. The 4 now sits at index 3, which is its final sorted position, everything to its left is smaller, and everything to its right is bigger. The 4 never moves again. Quicksort recurses on [2, 1, 3] and [8, 5, 7, 6] separately, and every partition plants one more element in its final home.
def quick_sort(nums: list[int], low: int = 0, high: int | None = None) -> list[int]:
# Default to sorting the whole array
if high is None:
high = len(nums) - 1
if low < high:
# Partition, planting the pivot at its final index
pivot_index = partition(nums, low, high)
# Recurse on each side, pivot excluded
quick_sort(nums, low, pivot_index - 1)
quick_sort(nums, pivot_index + 1, high)
return nums
def partition(nums: list[int], low: int, high: int) -> int:
# The last element is the pivot
pivot = nums[high]
# Boundary marks the end of the smaller-or-equal region
i = low - 1
for j in range(low, high):
# Swap smaller elements behind the boundary
if nums[j] <= pivot:
i += 1
nums[i], nums[j] = nums[j], nums[i]
# Plant the pivot just past the boundary
nums[i + 1], nums[high] = nums[high], nums[i + 1]
return i + 1This one-scan version is called Lomuto partitioning, and it has one sharp edge worth respecting: the recursive calls must leave the pivot's index out. The pivot is the one element guaranteed to be placed, and excluding it is what makes every call strictly smaller than the last.
You might think an already sorted array is the easy case, since the work looks finished. Hand [1, 2, 3, 4, 5, 6, 7, 8] to this exact code and the opposite happens: the pivot 8 is the maximum, so the partition piles all 7 remaining elements on one side and nothing on the other. The next call inherits 7 elements and peels off one more, and the comparisons stack up as 7 + 6 + 5 + 4 + 3 + 2 + 1 = 28 for 8 elements. Sorted input is naive quicksort's O(n²) worst case, and at a million already sorted elements that's roughly 500 billion comparisons for zero useful work. The standard fixes are choosing the pivot at random or taking the median of the first, middle, and last elements, which is why quicksort's time is always quoted as O(n log n) average. What it gives up for running in place is stability: partition swaps fling equal elements across the whole array with no regard for who came first.
Quicksort rises or falls on the partition pass, so here it is broken into its five jobs:
nums[high], and the whole pass is defined relative to it: every other element gets compared against the pivot exactly once.i starts at low - 1, one slot before anything. As j scans from low to high - 1, every element at most the pivot gets swapped to i + 1, so the front of the array stays a clean block of small values.i + 1 is the first position of the bigger side, so swapping the pivot there drops it exactly between the two regions. That index is finished and never gets visited again.low through pivotIndex - 1, then on pivotIndex + 1 through high. The base case is a range of one or zero elements, which is sorted by definition.O(n²) staircase. Randomizing the pivot makes every input behave like the average case, since no ordering can reliably set up the trap.Heapsort is the workhorse nobody picks first and everybody keeps around. It treats the array itself as a max heap, a tree packed into the array where every parent beats its children. Building the heap costs O(n), and then the sort is one move repeated: swap the root, which is the largest remaining value, into the last unsorted slot, shrink the heap by one, and sift the new root down until the heap property holds again. Each sift costs O(log n), so the whole sort is a guaranteed O(n log n), in place, with no input able to push it quadratic.
Watch it on [5, 9, 7]. Heapified, the array becomes [9, 5, 7]. Swap the 9 into the last slot for [7, 5, 9], where the 9 is now finished, then swap the new root 7 into the second-to-last slot, and [5, 7, 9] is sorted without a byte of extra memory. So why isn't heapsort the default everywhere? The answer is memory access: a sift jumps from index i to 2i + 1 and onward, bouncing across the array, while quicksort's partition streams through neighbors. Caches reward the streamer, so quicksort wins the races in practice, and heapsort earns its keep as the guarantee other sorts fall back on.
def heap_sort(nums: list[int]) -> list[int]:
n = len(nums)
# Build a max heap: sift down every parent, bottom up
for i in range(n // 2 - 1, -1, -1):
sift_down(nums, i, n)
# Swap the root (max) into the last unsorted slot, shrink, repair
for end in range(n - 1, 0, -1):
nums[0], nums[end] = nums[end], nums[0]
sift_down(nums, 0, end)
return nums
def sift_down(nums: list[int], root: int, size: int) -> None:
while True:
largest = root
left, right = 2 * root + 1, 2 * root + 2
# Pick the largest of parent and children
if left < size and nums[left] > nums[largest]:
largest = left
if right < size and nums[right] > nums[largest]:
largest = right
# Heap property holds: done
if largest == root:
return
# Swap down and keep sifting
nums[root], nums[largest] = nums[largest], nums[root]
root = largestEvery sort above learns by comparing, so the O(n log n) speed limit applies to all of them. Counting sort steps around the limit by never comparing anything. When keys are small integers, like exam scores from 0 to 100, it builds a 101 slot tally, walks the data once adding 1 to tally[score] for each element, and then reads the sorted result straight off the tally: write score 0 exactly tally[0] times, score 1 exactly tally[1] times, and so on. That's O(n + k) for n elements and k possible keys, so a million scores sort in about a million steps.
def counting_sort(nums: list[int], max_value: int) -> list[int]:
# Tally how many times each key appears
tally = [0] * (max_value + 1)
for num in nums:
tally[num] += 1
# Read the sorted result straight off the tally
result = []
for value, count in enumerate(tally):
result.extend([value] * count)
return resultThe catch is the k. A tally over all 32-bit integers would need about 4 billion slots, so the trick only fits when the key range is small. Radix sort stretches it: sort by the last digit with a stable counting pass, then by the next digit, and so on, so numbers up to 10 digits long sort in 10 passes. Both stay specialist tools, integer-like keys only, with O(k) extra memory as the rent.
Timsort starts from one observation about real data: it's almost never random. Log files, database rows, and half-edited lists all arrive with stretches already in order. So instead of splitting blindly down the middle like merge sort, Timsort walks the array hunting for runs, stretches that already ascend (or descend, which it reverses in place). Each run becomes a merge unit, runs shorter than a minimum length (32 to 64 elements) get extended with insertion sort, the specialist for small nearly-sorted pieces, and then the runs merge pairwise with the same stable, left-takes-ties merge from merge sort.
Two refinements make it quick in practice. The merge schedule is chosen carefully, since Python 3.11 by a policy called powersort, so runs of similar size meet each other instead of a giant run swallowing a tiny one over and over. And when one run keeps winning a merge, Timsort shifts into galloping mode, leaping ahead with a binary search instead of comparing one element at a time. That design pays off at both extremes: on data that's already sorted, the run hunter finds a single run and the whole sort finishes in O(n), while on random data it's a normal stable O(n log n) merge sort. It's never worse than merge sort and dramatically better on lived-in data, which is why Python runs it for list.sort(), why V8 adopted it for JavaScript arrays, and why Java trusts it with objects.
MIN_RUN = 32
# Simplified Timsort: fixed-size runs and pairwise merges.
# Real implementations detect natural runs and gallop.
def timsort(nums: list[int]) -> list[int]:
n = len(nums)
# Insertion sort each MIN_RUN-sized chunk in place
for start in range(0, n, MIN_RUN):
end = min(start + MIN_RUN, n)
insertion_sort_range(nums, start, end)
# Merge sorted runs, doubling the run width each round
width = MIN_RUN
while width < n:
for left in range(0, n, 2 * width):
mid = min(left + width, n)
right = min(left + 2 * width, n)
if mid < right:
merge_range(nums, left, mid, right)
width *= 2
return nums
def insertion_sort_range(nums: list[int], start: int, end: int) -> None:
for i in range(start + 1, end):
current = nums[i]
j = i - 1
while j >= start and nums[j] > current:
nums[j + 1] = nums[j]
j -= 1
nums[j + 1] = current
def merge_range(nums: list[int], left: int, mid: int, right: int) -> None:
# Copy both runs, then merge back into place
left_run = nums[left:mid]
right_run = nums[mid:right]
i = j = 0
k = left
while i < len(left_run) and j < len(right_run):
# Take from the left on ties to stay stable
if left_run[i] <= right_run[j]:
nums[k] = left_run[i]
i += 1
else:
nums[k] = right_run[j]
j += 1
k += 1
# Copy the leftovers
while i < len(left_run):
nums[k] = left_run[i]
i += 1
k += 1
while j < len(right_run):
nums[k] = right_run[j]
j += 1
k += 1Introsort answers quicksort's one open weakness by standing guard over it. It begins as an ordinary quicksort, fast partitions and friendly cache behavior, but it counts its own recursion depth as it goes. A healthy quicksort on n elements bottoms out in about log2(n) levels, so when the depth passes roughly 2 log2(n), the signature of a worst case unfolding, introsort hands the misbehaving range to heapsort mid-sort. Heapsort is slower per element, but its O(n log n) ceiling is unconditional, so the hybrid inherits the guarantee while keeping quicksort's speed on inputs that behave, which is nearly all of them.
There's a third gear at the bottom: ranges below about 16 elements skip the partitioning machinery entirely and run insertion sort, which wins on tiny inputs. David Musser described the design in 1997, and it has been C++'s std::sort ever since: quicksort's speed, heapsort's guarantee, insertion sort's small-input efficiency, and the worst case of none of them.
import math
# The dispatcher is the whole idea. It reuses partition() from the
# Quicksort block, insertion_sort_range() from the Timsort block,
# and heap_sort() from the Heapsort block.
def introsort(nums: list[int]) -> list[int]:
if len(nums) > 1:
max_depth = 2 * int(math.log2(len(nums)))
introsort_range(nums, 0, len(nums) - 1, max_depth)
return nums
def introsort_range(nums: list[int], low: int, high: int, depth_left: int) -> None:
size = high - low + 1
# Tiny range: insertion sort wins
if size <= 16:
insertion_sort_range(nums, low, high + 1)
return
# Depth exhausted: a worst case is unfolding, switch to heapsort
if depth_left == 0:
heap_sort_range(nums, low, high + 1)
return
# Healthy range: one ordinary quicksort step
pivot_index = partition(nums, low, high)
introsort_range(nums, low, pivot_index - 1, depth_left - 1)
introsort_range(nums, pivot_index + 1, high, depth_left - 1)
def heap_sort_range(nums: list[int], start: int, end: int) -> None:
# Real introsort heapsorts the range in place; copying keeps this short
nums[start:end] = heap_sort(nums[start:end])Between them, the two hybrids above cover most of the ecosystem: Timsort behind Python and JavaScript, introsort behind C++. Java is the one worth a closer look, because it ships two different sorts, and the reason teaches the stability lesson all by itself. Sorting primitives like int[] gets a dual-pivot quicksort, a variant by Vladimir Yaroslavskiy that partitions around two pivots into three regions per pass. Why is an unstable sort acceptable there? The answer is that two equal ints are indistinguishable: there's no original order between them worth preserving, so stability stops being observable. The moment elements carry extra fields someone might have presorted, Java switches to Timsort, where stability is guaranteed.
A few footnotes complete the map. JavaScript's stability is written into the language itself: since ES2019 the spec requires Array.prototype.sort to be stable. And C++ keeps the unstable introsort as std::sort while selling stability separately as std::stable_sort, which pays merge sort's memory bill. So the answer to which sort is best is the one the libraries settled on years ago: all of them, stitched together so each one covers another's weak spot.
Sorting shows up in interviews two ways: as a tool you reach for, and as internals you get quizzed on. These habits cover both:
O(n log n) up front to create that structure is almost always a fair trade.First, a two-question check on this page. Why is already sorted input the worst case for a last-element-pivot quicksort, and which property does each workhorse trade away, merge sort on one side and quicksort on the other? If neither one makes you hesitate, the mechanics stuck. Interview problems rarely ask you to implement a sort, though. They reward the sort-first instinct: when a problem involves overlaps, distances, rankings, or duplicates, sorting usually turns a global mess into a local one where the answer sits between neighbors.
Completed | Title | Status | Notes | Solution | |
|---|---|---|---|---|---|
Easy | |||||
Easy | |||||
Medium | |||||
Medium | |||||
Medium | |||||
Medium | |||||
Hard | |||||
Hard | |||||
Hard |