10 Useful Learning Theories

Academic research that has helped me with Mathchops and tutoring

Aug 14, 2025

Here are 10 theories that have been really helpful in building Mathchops and working with students. If you already know about them, you might find it interesting to see how someone else tried to use them. And if you don’t, I think you’ll like them! I’ll provide links to books and articles I liked at the end.

I will start with the Zone of Proximal Development (ZPD), which I think of as “finding tasks that are at the level of the student.” Students learn faster when things are at the right level, and they enjoy it more, which helps them practice more, which helps them learn more. Think of a 6-year-old trying to learn how to shoot a basketball. A pro-sized basketball and 10-foot hoop won’t be productive – it will be too hard for the kid to make a shot. But with a smaller ball and shorter hoop, you can introduce basic shot mechanics, then work towards NBA conditions over time.

The same is true of math – questions are king. But finding the right questions is not easy. And that’s where Item Response Theory has been particularly helpful. I wrote a lot more about it here, but the idea is that you can design systems that rate both the questions and the students. If a question has been answered thousands of times by students with known ability ratings, you can get a very good idea of how difficult that question is. And if you have a pool of questions that are rated very precisely, you can use those ratings to estimate the abilities of students very precisely as well. Precise question ratings have allowed us to make accurate score predictions in Mathchops, which make the games much better for students.
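To make the "rate both the questions and the students" idea concrete, here is a minimal sketch. It assumes the simplest IRT model (the one-parameter Rasch model) and an Elo-style update after each answer. This is not how Mathchops computes its ratings, and the function names, the step size `k`, and the 70% target success rate are all hypothetical choices for illustration.

```python
import math

def p_correct(ability: float, difficulty: float) -> float:
    """Rasch (1PL) model: chance the student answers this question correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def update_ratings(ability: float, difficulty: float, correct: bool, k: float = 0.1):
    """Elo-style online update (an assumption, not full IRT estimation).

    A surprising correct answer raises the student's rating and lowers the
    question's rating; a surprising miss does the opposite.
    """
    surprise = (1.0 if correct else 0.0) - p_correct(ability, difficulty)
    return ability + k * surprise, difficulty - k * surprise

def pick_question(ability: float, difficulties: dict, target: float = 0.7):
    """Return the id of the question whose predicted success rate is closest to target."""
    return min(difficulties, key=lambda q: abs(p_correct(ability, difficulties[q]) - target))
```

With ratings like these, choosing a question "at the level of the student" becomes a lookup: `pick_question(0.0, {"easy": -2.0, "medium": -0.8, "hard": 2.0})` selects the question this student is predicted to get right about 70% of the time.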

Unfortunately, finding the right questions doesn’t matter if the students don’t learn how to answer them correctly. And that’s where learning a little cognitive science has been helpful. All of the terms I’m about to mention fit nicely into one model of how people learn, although I’m not aware of one universally accepted name for the model.

At its center is the concept of working memory. It’s a sort of mental scratch pad with very limited capacity – there’s not much space on the pad, and anything you ‘write’ (think of, focus on) disappears after several seconds. If I ask you to remember the digits 2, 7, and 4, you probably won’t have any difficulty. But if I start to rattle off 50 digits, you probably won’t remember them all.

It is thought that we can only process a very limited number of chunks of information at a time (maybe 4 or so). But these chunks can take many, many different forms. They can be dance steps, melodies, speeches, parallel parking skills, smells…really anything at all that you can think of. And if you make sophisticated chunks, then you can do a lot more with your working memory. For example, if you’re just learning to sing Happy Birthday, you may have trouble focusing on anything else. But if you’ve sung it for 40+ years and are attending a 5-year-old’s birthday party, you can quite easily sing it and simultaneously think about whether it would be rude to check your fantasy football stats (this is completely hypothetical). That’s because the whole song is now one extremely efficient chunk – it can exist on the scratch pad while leaving lots of room for other chunks. Developing automaticity in this way allows you to work on more complex tasks. For example, a well-prepared SAT student can see the following question and draw upon chunks like completing the square, the Pythagorean Theorem, and graph translations.

But how do you create sophisticated, flexible chunks? How do you develop automaticity? Many of the best techniques are related to the concept of desirable difficulty, which refers to tasks that are irritating to the student in the moment but very helpful in the long term (and not so difficult that they are impossible to complete). You want something highly relevant and very difficult…but still doable.

One of my favorites is retrieval. The very act of attempting to recall something will help you remember it better, and the more difficult it is to recall it, the better you will remember it. For example, if I tell a student the definition of ambivalent and then ask her what the definition is two seconds later, she’ll probably remember (if she was listening!). If I ask again and again – five times in the span of one minute – I’ll probably get fired, but she will easily recall the definition.

But that’s not particularly useful. It would be much better to ask her a minute later, then ten minutes later, then later in the day, then the following day, then a few days later, etc. This spaced repetition will force the student to work a little harder to remember the definition, but she’ll ultimately remember the definition for much longer. If she also interleaves this practice with other work, working on tasks ABCDABCDABCD instead of AAABBBCCCDDD, she’ll retain all of these tasks much more durably, even though the practice feels harder in the moment.
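Here is a small sketch of both ideas in code. The review gaps are hypothetical values that loosely follow the "a minute, ten minutes, later in the day, the next day, a few days" pattern, and each gap is assumed to be measured from the previous review. Real spaced repetition systems usually adjust the gaps based on how well each recall went.

```python
from datetime import datetime, timedelta
from itertools import chain, zip_longest

# Hypothetical gaps between reviews; each one is counted from the previous review.
REVIEW_GAPS = [
    timedelta(minutes=1),
    timedelta(minutes=10),
    timedelta(hours=6),
    timedelta(days=1),
    timedelta(days=3),
]

def review_times(first_seen: datetime, gaps=REVIEW_GAPS) -> list:
    """Return the times at which an item should be retrieved again."""
    times, t = [], first_seen
    for gap in gaps:
        t = t + gap
        times.append(t)
    return times

def interleave(*task_lists) -> list:
    """Round-robin through the lists: A B C D A B C D instead of A A A B B B."""
    missing = object()  # marks slots where a shorter list has run out
    mixed = chain.from_iterable(zip_longest(*task_lists, fillvalue=missing))
    return [task for task in mixed if task is not missing]
```

For example, `"".join(interleave("AAA", "BBB", "CCC", "DDD"))` produces the ABCDABCDABCD ordering described above.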

Another term, borrowed from Ericsson’s studies of highly skilled performers, is deliberate practice. It’s the idea that you should actively seek out your weaknesses, analyze them, then practice repeatedly. This is probably the term that best captures what Mathchops is trying to do (it’s essentially a ‘deliberate practice’ app).

All of these concepts have deeply affected the design of Mathchops. For example, we don’t offer multiple choice math answers – you have to work hard to retrieve the answer (or the skills you need to find the answer). Every score-related game is timed to encourage automaticity. Students never practice the same question multiple times in a row – spaced repetition and interleaving are baked into every game students play. And when a question does repeat, the numbers are different. Mathchops also provides immediate feedback after every retrieval attempt in the form of brief explanations, so that students understand (and don’t repeat) errors.
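As a rough illustration of three of these design choices (typed answers instead of multiple choice, fresh numbers on every repeat, and immediate feedback), here is a minimal sketch. It is not Mathchops code. The question type, the number ranges, and the function names are all made up for the example.

```python
import random

def make_question(rng: random.Random):
    """Build one 'solve for x' question, a*x + b = c, with fresh numbers each time."""
    a = rng.randint(2, 9)
    x = rng.randint(-10, 10)   # the answer the student has to retrieve
    b = rng.randint(1, 20)
    prompt = f"Solve for x: {a}x + {b} = {a * x + b}"
    explanation = f"Subtract {b} from both sides, then divide by {a}: x = {x}"
    return prompt, x, explanation

def check(typed_answer: str, correct: int, explanation: str) -> str:
    """Grade a typed (not multiple-choice) answer and return immediate feedback."""
    try:
        is_right = int(typed_answer.strip()) == correct
    except ValueError:
        is_right = False  # anything that is not a whole number counts as a miss
    return "Correct!" if is_right else f"Not quite. {explanation}"
```

Because the numbers are drawn each time the template runs, a repeated question tests the skill rather than the student's memory of a specific answer.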

I’m skipping over lots of nuances here. Discussions of the ZPD usually include scaffolding. I didn’t describe (nor do we rigorously implement) any of the math behind Item Response Theory. And I didn’t get into the specifics of interleaving – exactly how long do you spend on each topic? How large is each topic? How many do you tackle at the same time? People have written hundreds of pages on each of these concepts! But if you’re at all interested, I’m hoping this post will entice you to dive into the details. And if you already implement some of these concepts, please share what you do in the comments.

Mathchops’s Substack

Further Reading:
