Code that runs fine on your laptop is one problem. The same code serving ten million users is a different one, because at that scale everything becomes a distributed system, where machines fail and networks drop messages. System design is the practice of deciding how servers, databases, caches, and networks fit together so software keeps working as it grows.
Where algorithms ask how to compute something efficiently, system design asks how to keep it running when a server dies mid-request. The components aren't exotic. The skill is in the tradeoffs: every choice buys you something and costs you something else.
Every design conversation starts with the same three questions: what is the machine actually doing, how is the application structured, and what does this system need to deliver? This tier covers how computers and applications are put together before any scaling enters the picture, plus the back-of-the-envelope math that catches impossible designs before you build them.
Everything in a distributed system travels over a network, so you need to know how data actually moves: how machines find each other through DNS, and what the transport layer guarantees you with TCP or trades away for speed with UDP.
An API is the contract between client and server: what you can ask for and what you get back. This tier covers HTTP, the protocol nearly every API speaks, the WebSocket connections that let a server push without being asked, and the paradigms and design choices that keep old callers working as the API grows.
The fastest request is the one you never make twice. Caches keep recently used data in fast storage so repeated reads skip the database, and CDNs apply the same idea to geography by parking copies of files near your users. Storing the copy is the easy part; the hard part is knowing when it's gone stale.
Between clients and your servers sits a layer of middlemen. Load balancers spread traffic so no single server drowns, gateways handle auth and routing at the front door, rate limiters keep one noisy client from ruining things for everyone, and consistent hashing decides which server owns what without reshuffling on every change.
Most scaling problems end up being database problems. This tier starts with the SQL or NoSQL question, which by now is less a war and more a menu. From there it covers how data survives growth: indexes keep queries fast, replication keeps copies alive, sharding splits what no longer fits on one machine, and the CAP theorem caps what you can promise.
Some work is too big or too slow for the request-response loop. Message queues and pub/sub systems move that work off to the side to run on its own schedule, while MapReduce and stream processing chew through datasets no single machine could handle.
The last tier zooms out to the shape of the whole system. Do you build one deployable app or many small services? Do services call each other directly or react to events? And when a cluster of machines has to agree on a single truth while some of them are failing, consensus algorithms like Raft do the deciding.
Here is the whole path, tier by tier. Each topic will get its own page with diagrams and walkthroughs soon. For now, use this as a map of how the pieces stack.
One machine, one app, and the napkin math that says what they can handle. Everything later in this path exists because some number here runs out.
What CPU, RAM, and disk each do, and why the speed gaps matter.
The anatomy of a deployed app, from client to server to storage.
Pinning down what a system must do before deciding how.
Quick estimates that tell you if a design is even plausible.
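As a rough illustration of the napkin math in this tier, the sketch below turns a handful of inputs into a request rate and a storage figure. Every number in it (10 million daily users, 20 requests each, a 10% write share, 2 KB per write, a 3x peak factor) is an assumption picked for the example, not a measurement, and the `estimate` function is a hypothetical helper.

```python
# Back-of-the-envelope sizing. All inputs are illustrative assumptions.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(daily_users, requests_per_user, write_fraction,
             bytes_per_write, peak_factor=3.0):
    """Return rough average/peak request rates and yearly storage growth."""
    daily_requests = daily_users * requests_per_user
    avg_rps = daily_requests / SECONDS_PER_DAY
    peak_rps = avg_rps * peak_factor  # assumed ratio of peak to average load
    daily_writes = daily_requests * write_fraction
    yearly_storage_bytes = daily_writes * bytes_per_write * 365
    return {
        "avg_rps": avg_rps,
        "peak_rps": peak_rps,
        "yearly_storage_tb": yearly_storage_bytes / 1e12,
    }


result = estimate(
    daily_users=10_000_000,
    requests_per_user=20,
    write_fraction=0.1,
    bytes_per_write=2_000,
)
for name, value in result.items():
    print(f"{name}: {value:,.1f}")
```

With these assumed inputs the average works out to roughly 2,300 requests per second and about 14.6 TB of new data per year. The point is the order of magnitude, not the decimals.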
The first number to run out forces a second machine, and now data has to travel. DNS finds the other side, and TCP or UDP decides what the trip guarantees.
IP addresses, ports, and how machines find each other.
Reliable delivery or raw speed: the transport layer's tradeoff.
The phonebook that turns domain names into IP addresses.
Connected machines still need a shared language. These define the contract: what a client may ask, what it gets back, and how that promise survives growth.
The request-response protocol behind nearly everything on the web.
A two-way connection for when the server needs to push.
REST, GraphQL, and gRPC, and when each one earns its keep.
Interfaces that survive growth without breaking their callers.
Once requests are cheap to make, you get a flood of identical ones. Answering repeats from a fast copy saves the database, but the new problem is staleness.
Answers repeated reads from fast memory instead of the database.
Cache-aside, write-through, write-back, and when copies go stale.
Serves static files from servers near the user.
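The cache-aside strategy named in this tier can be sketched in a few lines. This is a minimal in-memory version under stated assumptions: `TTLCache`, `FAKE_DB`, `read_user`, and `write_user` are hypothetical names invented for the example, and the dictionary stands in for a real database. Staleness is handled two ways here: entries expire after a time-to-live, and a write invalidates the cached copy.

```python
import time


class TTLCache:
    """A tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # stale: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self.clock() + self.ttl_seconds)

    def delete(self, key):
        self._entries.pop(key, None)


# Hypothetical stand-in for a real database.
FAKE_DB = {"user:1": "Ada"}
db_reads = 0


def read_user(cache, key):
    """Cache-aside read: try the cache, fall back to the database, then fill."""
    global db_reads
    value = cache.get(key)
    if value is None:
        db_reads += 1
        value = FAKE_DB[key]
        cache.set(key, value)
    return value


def write_user(cache, key, value):
    """Cache-aside write: update the database, then invalidate the cached copy."""
    FAKE_DB[key] = value
    cache.delete(key)


cache = TTLCache(ttl_seconds=60)
read_user(cache, "user:1")   # miss: goes to the database
read_user(cache, "user:1")   # hit: served from the cache
write_user(cache, "user:1", "Grace")
print(read_user(cache, "user:1"), db_reads)  # prints "Grace 2"
```

Of the three reads, only two reach the database. The sketch treats `None` as a miss, so it cannot cache a value that is itself `None`; a real cache would need a separate sentinel for that.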
Caches absorb the repeats, but unique traffic still has to land somewhere. A front layer spreads it across servers, screens it, and decides who owns what.
Middlemen that route, shield, and spread traffic across servers.
One front door handling auth, routing, and TLS for everything behind it.
Moves only a small share of keys when servers join or leave the ring.
Caps request rates so one noisy client can't sink the system.
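To see why consistent hashing reshuffles so little, here is a minimal ring sketch compared against plain modulo placement. The `HashRing` class, the server names, the choice of SHA-256, and the 100 virtual nodes per server are all assumptions made for the example rather than a reference implementation.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes  # virtual nodes per server smooth out the split
        self._points = []     # sorted ring positions
        self._owners = {}     # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            bisect.insort(self._points, point)
            self._owners[point] = server

    def owner(self, key):
        """Return the server at the first ring position clockwise of the key."""
        index = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[index]]


keys = [f"key-{n}" for n in range(10_000)]
ring = HashRing(["s1", "s2", "s3", "s4"])
before = {key: ring.owner(key) for key in keys}

ring.add("s5")
after = {key: ring.owner(key) for key in keys}

moved = sum(1 for key in keys if before[key] != after[key])
print(f"consistent hashing moved {moved / len(keys):.0%} of keys")

# Naive modulo placement for comparison: going from 4 to 5 servers.
naive_moved = sum(1 for key in keys if _hash(key) % 4 != _hash(key) % 5)
print(f"modulo hashing moved {naive_moved / len(keys):.0%} of keys")
```

On the ring, the only keys that move are the ones the new server takes over, which should be somewhere around a fifth of them when going from four servers to five. Modulo placement should move most of the keys in the same scenario.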
Behind every cache miss waits the database, which is where scaling pressure eventually concentrates. Keeping data fast, replicated, and partitioned is its own discipline, complete with an impossibility theorem.
Tables, joins, and transactions with hard guarantees.
Flexible schemas built to spread across machines.
The data structures that replace full scans with fast, targeted lookups.
Copies data across machines for speed and survival.
Splits data across machines when one is no longer enough.
When the network splits, you keep consistency or availability, not both.
Cheap, bottomless storage for files, backups, and blobs.
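The index idea from earlier in this tier can be made concrete with a small comparison. Real databases typically use B-trees or similar structures; the sorted list with binary search below is only a stand-in, and the table contents and function names are hypothetical.

```python
import bisect

# Hypothetical table: rows of (user_id, name), stored in insertion order.
rows = [(uid, f"user-{uid}") for uid in range(0, 200_000, 2)]


def full_scan(user_id):
    """Check every row until one matches; returns (row, rows_examined)."""
    for examined, row in enumerate(rows, start=1):
        if row[0] == user_id:
            return row, examined
    return None, len(rows)


# The "index": sorted keys, each paired with the row's position in the table.
index = sorted((row[0], position) for position, row in enumerate(rows))
index_keys = [key for key, _ in index]


def index_lookup(user_id):
    """Binary-search the sorted keys, then jump straight to the row."""
    i = bisect.bisect_left(index_keys, user_id)
    if i < len(index_keys) and index_keys[i] == user_id:
        return rows[index[i][1]]
    return None


row, examined = full_scan(199_998)
print(row, "found after examining", examined, "rows")
print(index_lookup(199_998), "found via binary search over", len(index_keys), "keys")
```

The scan has to examine all 100,000 rows to reach the last one, while binary search needs on the order of log2(100,000), or about 17, comparisons. The tradeoff is that the index takes extra space and has to be kept up to date on every write.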
Some work should never happen inside a request at all. Hand it to a queue and process it elsewhere, in scheduled batches or as a stream that never stops.
Lets services hand off work without waiting for each other.
Broadcasts events to every subscriber, with a log you can replay.
Splits a huge job across many machines, then merges the results.
Crunch data in scheduled chunks or process it as it arrives.
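The split-then-merge shape of MapReduce can be sketched on a single machine with a word count. This is a toy under stated assumptions: the three text chunks stand in for data spread across machines, the function names are hypothetical, and everything runs sequentially where a real framework would run the map and reduce steps in parallel.

```python
from collections import defaultdict

# Hypothetical input, already split into chunks as if spread across machines.
chunks = [
    "the quick brown fox",
    "the lazy dog",
    "the fox jumps over the dog",
]


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk."""
    return [(word, 1) for word in chunk.split()]


def shuffle(mapped_outputs):
    """Group all emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for word, count in pairs:
            groups[word].append(count)
    return groups


def reduce_phase(word, counts):
    """Combine every value seen for one key into a single result."""
    return word, sum(counts)


mapped = [map_phase(chunk) for chunk in chunks]  # would run in parallel
grouped = shuffle(mapped)
word_counts = dict(reduce_phase(word, counts) for word, counts in grouped.items())
print(word_counts["the"], word_counts["fox"], word_counts["dog"])  # prints "4 2 2"
```

Each map call only sees its own chunk and each reduce call only sees one key, which is what lets a framework place them on different machines.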
Every piece so far is a component. These last topics decide how the components get arranged: one app or many services, direct calls or events, and a cluster that keeps agreeing while parts of it fail.
One deployable app or many small services, each at a price.
Services react to events instead of calling each other directly.
How a cluster of unreliable machines agrees on one truth.