Every few months in AI, something impossible becomes a benchmark, then a baseline, then boring. AI research is the work behind that cycle: making machines do things they couldn't do last year. It's different from applying machine learning. Instead of using known techniques to solve a business problem, you're producing new knowledge: architectures, training methods, and results nobody has published yet.
This page follows one deliberate road through the field: research practice first, then the architectures behind modern AI, then reinforcement learning, ending at world models, the systems several major labs now bet will become the physics engines of robotics.
Research is a craft before it's a topic. You learn to read papers in passes instead of front to back, to reproduce results before trusting or extending them, and to design experiments where an ablation actually proves something. Benchmarks deserve a layer of suspicion of their own: by 2026 many of the classic ones are saturated or have leaked into training data, so knowing what a score really measures is a research skill in itself.
A handful of architectural ideas carry nearly all of modern AI. Attention lets every token weigh every other token at once, an idea that scaled into transformers and then carried over into vision. Self-supervised learning removes the need for labels, while diffusion and flow matching turn noise into data. Each one earned its place by beating something older, and each is worth understanding from scratch.
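To make "every token weighs every other token" concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is a toy under stated assumptions, not how any production model is written: the names `softmax` and `attention` are illustrative, there are no learned projection matrices or multiple heads, and the three 2-D "tokens" are made up for the example.

```python
import math

def softmax(xs):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # One score per key: how strongly this token attends to each other token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # The output is the weighted average of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        outputs.append(out)
    return outputs, weights

# Illustrative input: three tokens, each a 2-D vector (self-attention, so Q = K = V).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` is one token's distribution over all tokens, which is the "at once" part: nothing is processed sequentially, and every pair of tokens gets a score.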
Reinforcement learning is learning by trial and error: an agent acts, the world responds, and reward is the only teacher. The split that matters most here is model-free versus model-based: whether the agent learns purely from outcomes or first learns a model of the world. That question leads directly to imagination training, where a policy trains inside a learned simulator instead of the real world.
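The act, respond, reward loop can be sketched with the smallest possible environment: a two-armed bandit. This is a model-free toy under assumptions of my own (the win probabilities, the epsilon-greedy rule, and the name `run_bandit` are all illustrative), so the agent never learns how the world works, only how good each action has turned out to be.

```python
import random

def run_bandit(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a two-armed bandit; returns value estimates and pull counts."""
    rng = random.Random(seed)
    win_prob = [0.2, 0.8]  # hidden from the agent; assumed for illustration
    value = [0.0, 0.0]     # the agent's running estimate of each arm's reward
    count = [0, 0]
    for _ in range(steps):
        # Act: explore occasionally, otherwise exploit the current best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = 0 if value[0] >= value[1] else 1
        # The world responds with a reward and nothing else.
        reward = 1.0 if rng.random() < win_prob[action] else 0.0
        # Update: move the estimate toward the observed reward (incremental mean).
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
    return value, count

value, count = run_bandit()
```

After enough steps the agent pulls the better arm far more often, purely from outcomes. A model-based agent would instead spend some of that experience learning `win_prob` itself and then plan against it.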
World models are the capstone, and the fastest-moving corner of the field. A world model is a network that predicts what happens next, given a state and an action: a physics engine learned from data instead of programmed. In one recent year, V-JEPA 2, Genie 3, Dreamer 4, and NVIDIA's Cosmos all landed, and robot policies like Pi0 started turning those predictions into real-world motion. This is where the previous three tiers were heading all along.
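The definition above, predicting the next state from a state and an action, can be shown with a deliberately tiny stand-in. The sketch below assumes a five-cell corridor and uses a lookup table where a real world model would use a neural network over pixels or latents; `true_step`, `learn_world_model`, and `imagine` are hypothetical names for this example only.

```python
import random
from collections import Counter, defaultdict

def true_step(state, action, size=5):
    """The real environment: a 1-D corridor with walls at both ends."""
    return min(max(state + action, 0), size - 1)

def learn_world_model(num_samples=500, size=5, seed=0):
    """Learn (state, action) -> next state from observed transitions only."""
    rng = random.Random(seed)
    counts = defaultdict(Counter)
    for _ in range(num_samples):
        state = rng.randrange(size)
        action = rng.choice([-1, 1])
        counts[(state, action)][true_step(state, action, size)] += 1
    # Predict the most frequently observed outcome for each (state, action) pair.
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}

def imagine(model, state, actions):
    """Roll out a trajectory inside the learned model, never touching the real world."""
    trajectory = [state]
    for action in actions:
        state = model[(state, action)]
        trajectory.append(state)
    return trajectory

model = learn_world_model()
dream = imagine(model, 0, [1, 1, 1, -1])  # positions visited in imagination
```

The point of the toy is the separation of roles: `true_step` is only consulted while collecting data, and `imagine` runs entirely on what was learned. Training a policy on rollouts like `dream` is the idea behind imagination training.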
Here is the whole path, tier by tier. It's ordered so each tier earns the next, ending at world models. Each topic will get its own page soon, but until then, use this as the map.
Before any architecture, the meta-skills: how to read, reproduce, and measure. Most failed research dies here, not in the math.
The three-pass method: skim, study, then rebuild it in your head.
If you can't rebuild it, you don't understand it yet.
Ablations, baselines, and seeds to prove your idea is the reason.
What a score really measures, and when a leaderboard lies.
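For the experiment-design item above, a minimal sketch of what "baselines and seeds" looks like in practice: run the baseline and the variant on the same set of seeds, then report mean and spread instead of a single number. Everything here is assumed for illustration; `train_and_eval` is a hypothetical stand-in for a real training run, and the scores it returns are simulated, not measured.

```python
import random
import statistics

def train_and_eval(use_idea, seed):
    """Hypothetical stand-in for a real training run; returns a simulated score."""
    rng = random.Random(seed)
    base = 0.70 + rng.gauss(0, 0.02)          # seed-dependent noise
    return base + (0.03 if use_idea else 0.0)  # assumed effect of the idea

def summarize(use_idea, seeds):
    """Mean and sample standard deviation of the score across seeds."""
    scores = [train_and_eval(use_idea, s) for s in seeds]
    return statistics.mean(scores), statistics.stdev(scores)

seeds = range(5)  # the same seeds for both arms, so the comparison is paired
baseline_mean, baseline_std = summarize(False, seeds)
variant_mean, variant_std = summarize(True, seeds)
gap = variant_mean - baseline_mean
```

If `gap` is small next to `baseline_std`, one lucky seed could explain the whole result. An ablation follows the same pattern, with the variant defined by removing one component at a time.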
With the craft in place, move to the ideas themselves. Almost everything since 2017 is assembled from the pieces in this tier.
The idea behind modern AI: every token attends to every other.
Chop an image into patches and treat them like words.
Models that teach themselves from unlabeled data.
Generates by learning to undo noise, one step at a time.
A straighter road from noise to data, in far fewer steps.
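Of the ideas in this tier, the patch trick is the easiest to show in a few lines. The sketch below assumes a grayscale image stored as a list of rows and a hypothetical helper named `patchify`; a real vision transformer would go on to apply a learned linear projection and position embeddings to each flattened patch, which are omitted here.

```python
def patchify(image, patch):
    """Split an H x W grid (list of rows) into flattened patch x patch tokens."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by patch size")
    tokens = []
    # Walk the image in row-major order, one patch at a time.
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([image[r][c]
                           for r in range(top, top + patch)
                           for c in range(left, left + patch)])
    return tokens

# Illustrative 4 x 4 "image" whose pixel values are just their row-major index.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tokens = patchify(image, 2)  # four tokens, each holding four pixel values
```

Once the image is a sequence of tokens, attention can treat it like a sentence, which is why the same architecture carried over from text to vision.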
The architectures above learn from data that already exists. Acting in a world that pushes back is a different problem, and this tier is trial and error made rigorous.
The core loop: an agent acts, the reward arrives, the policy updates.
React from experience, or learn a world you can plan inside.
The policy optimizers behind game agents and language models alike.
Let the policy practice inside the model's own dream.
Everything above converges here: an architecture that predicts, trained on experience, becomes a simulator the agent can rehearse in. The bet is that learned physics engines are how robots finally generalize.
Networks that predict what happens next: learned physics engines.
Compress the world to its essentials, then predict in that space.
Learning how the world moves from a million hours of video.
Genie, Cosmos, Dreamer: one model, many worlds.
From predicted futures to motor commands, fifty times a second.