You can do worse on every practice session…and still learn faster overall. One study of math practice found that students who mixed problem types remembered about twice as much a week later. So why does practicing in a way that feels wrong and messy end up giving you stronger, more flexible skills?
And this “doing it wrong on purpose” isn’t just a quirky study result—it shows up across wildly different fields. In one classic experiment, athletes who trained badminton serves in a constantly changing order performed about 30 % better when tested later than those who drilled the same serve again and again, even though they looked worse during practice. A 2018 review of 61 studies found a similar pattern: when people mix what they work on and vary how they do it, delayed performance jumps by roughly 15 % on average. Tech companies quietly exploit this too: Duolingo’s own data suggest that algorithmically shuffling material can boost lesson completion by around 12 %. In other words, many top performers and products are already betting on interleaving and variation—long before it feels comfortable.
But our instincts still push us toward the opposite: neat blocks, full focus on one thing, and the comfort of “getting it right” before moving on. School reinforces this—chapter 3 today, chapter 4 next week, 20 exercises of the same type in between. Work does too: one feature, one tool, one meeting, all in a row. The result is that you *feel* fluent in the moment, then stall or blank when the context changes. Interleaving and variation flip this script by forcing you to constantly retrieve, compare, and adapt, even when the material feels 10–20 % “too hard” during practice.
When you switch from one type of problem or skill to another, your brain has to answer three questions every time:
1. **What is this?** (discriminate the type)
2. **What do I know that fits this?** (select a method)
3. **How do I adjust that method here?** (adapt to details)
Blocked work mostly skips step 1: if you’re on “Chapter 5: Quadratic Equations,” you already know what kind of question is coming. Interleaving removes that hint. That extra discrimination step is the “desirable difficulty” that initially slows you down but builds the mapping between *situations* and *strategies*.
In Rohrer-style math setups, a blocked schedule might have a student solve 40 problems in this order:
- 10 fraction problems (all one type)
- 10 proportion problems
- 10 area problems
- 10 linear equations
Now compare that to 40 problems where every 3–4 questions jump to a different concept, plus occasional old material from prior weeks. The total quantity is identical—40 reps—but the *decision load* is very different. You might get 30/40 correct in the blocked set and only 24/40 when mixed. Yet one week later, the mixed group often outperforms by 20–25 percentage points because they practiced noticing *what* they were facing, not just *how* to execute a memorized recipe.
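To make the contrast concrete, here is a minimal Python sketch of the two schedules. It assumes four topics with 10 problems each and a topic switch every 3–4 questions; the function names, the fixed seed, and the random topic choice are illustrative assumptions, not the procedure used in any particular study. It also leaves out the "old material from prior weeks" part.

```python
import random


def blocked_schedule(topics, per_topic):
    """All problems of one topic, then all of the next, and so on."""
    return [topic for topic in topics for _ in range(per_topic)]


def interleaved_schedule(topics, per_topic, min_run=3, max_run=4, seed=0):
    """Same problems, but the topic changes after every 3-4 questions.

    Near the end, if only one topic has problems left, its runs can sit
    back to back, so the last stretch may be longer than max_run.
    """
    rng = random.Random(seed)
    remaining = {topic: per_topic for topic in topics}
    schedule, previous = [], None
    while any(remaining.values()):
        # Prefer any topic other than the one just practiced.
        candidates = [t for t, n in remaining.items() if n > 0 and t != previous]
        if not candidates:  # only the previous topic still has problems
            candidates = [previous]
        topic = rng.choice(candidates)
        run = min(rng.randint(min_run, max_run), remaining[topic])
        schedule.extend([topic] * run)
        remaining[topic] -= run
        previous = topic
    return schedule


def count_switches(schedule):
    """How many times the learner must re-identify the problem type."""
    return sum(1 for a, b in zip(schedule, schedule[1:]) if a != b)


if __name__ == "__main__":
    topics = ["fractions", "proportions", "area", "linear equations"]
    blocked = blocked_schedule(topics, 10)
    mixed = interleaved_schedule(topics, 10)
    print(len(blocked), count_switches(blocked))
    print(len(mixed), count_switches(mixed))
```

Both schedules contain the same 40 problems; only the number of topic switches differs, which is a rough proxy for the decision load described above. In the scoring example, 30/40 is 75 % during blocked practice and 24/40 is 60 % during mixed practice.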
Variation adds another layer: you keep the underlying idea the same but change surface features. A programmer, for example, might deliberately implement the same algorithm:
- Once in Python with lists
- Once in Java with arrays
- Once in JavaScript in an async context
All three share the same conceptual skeleton, but each version forces you to re-encode the idea, not just reuse the last keystrokes. Do this 5–6 times across different environments and you often gain the kind of transfer that 50 near-identical drills don’t create.
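As a small, single-language stand-in for that idea, the sketch below writes one algorithm in two surface forms. Binary search is only a hypothetical choice here (the text does not name an algorithm), and varying loop versus recursion is a milder form of variation than switching languages.

```python
def binary_search_iterative(items, target):
    """Variation 1: a loop with explicit low/high indices over a sorted list."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1  # not found


def binary_search_recursive(items, target, low=0, high=None):
    """Variation 2: the same halving idea, expressed as recursion."""
    if high is None:
        high = len(items) - 1
    if low > high:
        return -1  # not found
    mid = (low + high) // 2
    if items[mid] == target:
        return mid
    if items[mid] < target:
        return binary_search_recursive(items, target, mid + 1, high)
    return binary_search_recursive(items, target, low, mid - 1)
```

Writing the second version from memory, rather than editing the first, is what forces the re-encoding: the halving logic stays the same while the bookkeeping around it changes.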
This is why high error rates during these sessions are not a bug. In many motor and cognitive studies, learners who hovered around 60–75 % accuracy while constantly being “stretched” ended up with far more robust skills than those cruising at 90–100 % on predictable tasks.
Your goal isn’t to look smooth today; it’s to build a library of “if this, then that” connections your future self can access under pressure, when the problem doesn’t politely announce what chapter it came from.
A concrete way to use this: say you’re learning front‑end development. Instead of 60 React state problems in a row, you could rotate every 5–7 tasks:
- 5 styling bugs in CSS
- 5 React state bugs
- 5 DOM event issues in vanilla JS

…then repeat that 3–4 times with different codebases. Across a week, you still do 60–80 problems, but each block forces you to re-identify “what kind of issue is this?” before touching the keyboard.
Or take music: rather than 30 minutes on *only* scales, divide a 45‑minute session into three 15‑minute blocks:
- 15 minutes: scales in 3 keys, varied rhythms
- 15 minutes: one tricky passage, but change tempo and dynamics every run
- 15 minutes: sight‑read 4 short pieces in different styles
One more angle from medicine: some residency programs now train diagnostic skills by shuffling 10–12 patient cases from unrelated specialties, instead of 10 cardiology cases in a row. Residents initially feel slower but later catch rare conditions more reliably.
As tools track moment‑to‑moment struggle, they’ll be able to nudge difficulty toward that 60–75 % success “sweet spot” instead of guessing. Picture a reskilling platform where a wind‑turbine tech rotates through 8–10 brief simulations per shift, each pulling from different fault types, and the system reorders them when accuracy rises. Expect hiring tests, coding bootcamps, and even flight simulators to embed this by default, so “easy mode” quietly disappears.
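One way such a nudge could work is sketched below. This is a hypothetical controller, not a description of any real platform: the class name, the 10-attempt window, and the 1–10 difficulty scale are all assumptions made for illustration. It raises the difficulty when rolling accuracy climbs above 75 % and lowers it when accuracy falls below 60 %.

```python
from collections import deque


class DifficultyTuner:
    """Hypothetical controller that keeps rolling accuracy in a target band."""

    def __init__(self, low=0.60, high=0.75, window=10, level=1, max_level=10):
        self.low = low
        self.high = high
        self.results = deque(maxlen=window)  # most recent attempts only
        self.level = level
        self.max_level = max_level

    def record(self, correct):
        """Store one attempt and return the (possibly adjusted) level."""
        self.results.append(bool(correct))
        # Only judge accuracy once a full window of attempts is available.
        if len(self.results) == self.results.maxlen:
            accuracy = sum(self.results) / len(self.results)
            if accuracy > self.high and self.level < self.max_level:
                self.level += 1       # too easy: make it harder
                self.results.clear()  # start a fresh window at the new level
            elif accuracy < self.low and self.level > 1:
                self.level -= 1       # too hard: ease off
                self.results.clear()
        return self.level
```

Clearing the window after each change is a deliberate simplification: it stops the tuner from reacting twice to the same streak, at the cost of adjusting more slowly.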
Your challenge this week: switch 30–50 % of your study into “mixed mode.” If you solve 20 coding tasks, force at least 8 to jump across topics. Track two numbers: today’s score and your score on the *same* set 72 hours later. If your delayed score comes out even 10–20 % higher than what you usually retain after blocked practice, you’ve found a repeatable way to compound every hour you invest.
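If you want to keep that log in code, a minimal sketch might look like this. The function name and the returned field names are assumptions for illustration; it simply turns the two raw scores into percentages and reports the gap between them in percentage points.

```python
def retention_report(immediate_correct, delayed_correct, total):
    """Compare today's score with the score on the same set 72 hours later."""
    if total <= 0:
        raise ValueError("total must be positive")
    return {
        "immediate_pct": round(100 * immediate_correct / total, 1),
        "delayed_pct": round(100 * delayed_correct / total, 1),
        # Positive means you scored higher after the delay than on day one.
        "change_points": round(
            100 * (delayed_correct - immediate_correct) / total, 1
        ),
    }


if __name__ == "__main__":
    # Example numbers are made up: 12 of 20 today, 14 of 20 after 72 hours.
    print(retention_report(12, 14, 20))
```

Logging one report per practice set, for both blocked and mixed sessions, gives you a like-for-like comparison of delayed scores over a few weeks.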

