The first time you got lost in a supermarket as a kid, the world probably turned huge and fluorescent. One minute you were beside the cereal, the next you were in a fog of paper towels, deciding whether to retrace your steps or keep walking until destiny intervened. That tiny drama is close to a problem your brain faces all day: use the map, or repeat what worked last time?
That is the split behind model-based and model-free learning. Model-based control is the map-reader: "If I do this, what happens next?" Model-free control is the diner regular: "I ordered this before, it was good, let us not make breakfast into a constitutional crisis."
The Two-Step Shuffle
In a new Cell Reports study, Weilun Ding and colleagues scanned 179 people with fMRI while they played a version of the classic two-step decision task. Participants make a choice, land in a second state, then get or miss a reward. Sometimes the path follows the usual route; sometimes it takes a weird side street.
A model-free learner mostly remembers whether the first choice paid off. Reward? Do it again. No reward? Maybe do something else. Simple, sturdy, emotionally satisfying.
A model-based learner watches the hidden structure. If a reward happened after an unusual transition, that person may realize the first choice was not actually the smart bet. The brain says, "Yes, I found my shoe in the freezer, but that does not make the freezer a shoe rack."
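For readers who like to see the bookkeeping, here is a minimal sketch of how the two learners might score the same odd trial: you pick A, take the rare route, and get rewarded anyway. Everything in it is an assumption made for illustration, not the study's actual model: the 0.7 chance of the usual route, the 0.5 learning rate, the labels A, B, X, and Y, and the function names.

```python
# Illustrative sketch only. All numbers and names are assumptions, not taken from the study.
COMMON = 0.7   # assumed chance that a first choice takes its usual route
ALPHA = 0.5    # assumed learning rate

# Assumed map: choice "A" usually leads to state "X", choice "B" usually leads to "Y".
TRANSITIONS = {
    "A": {"X": COMMON, "Y": 1 - COMMON},
    "B": {"X": 1 - COMMON, "Y": COMMON},
}


def model_free_update(q_mf, choice, reward):
    """Credit the first choice directly with whatever reward followed."""
    updated = dict(q_mf)
    updated[choice] += ALPHA * (reward - updated[choice])
    return updated


def second_stage_update(state_values, state, reward):
    """Learn how rewarding the second state turned out to be."""
    updated = dict(state_values)
    updated[state] += ALPHA * (reward - updated[state])
    return updated


def model_based_values(state_values):
    """Value each first choice by the states it is expected to reach."""
    return {
        choice: sum(prob * state_values[state] for state, prob in routes.items())
        for choice, routes in TRANSITIONS.items()
    }


def simulate_rare_rewarded_trial():
    """Chose A, landed in Y via the rare route, and received a reward of 1."""
    q_mf = model_free_update({"A": 0.0, "B": 0.0}, "A", 1.0)
    state_values = second_stage_update({"X": 0.0, "Y": 0.0}, "Y", 1.0)
    return q_mf, model_based_values(state_values)


if __name__ == "__main__":
    q_mf, q_mb = simulate_rare_rewarded_trial()
    print("model-free values: ", q_mf)
    print("model-based values:", q_mb)
```

Under these assumptions the model-free learner now prefers A, because A is what it did before the reward. The model-based learner prefers B, because B is the choice that usually leads to the state that just paid off.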
The Brain Region With Opinions
The ventromedial prefrontal cortex, or vmPFC, sits near the lower middle part of the frontal lobes and often shows up when researchers study subjective value. It is not a tiny accountant with a visor, but it gives that energy. It helps represent how much an option is worth right now.
Ding and colleagues found an asymmetry. Model-based value signals in the vmPFC tracked how much a person actually relied on model-based control in their behavior. People who planned more showed stronger planning-flavored value signals. Meanwhile, model-free value signals appeared broadly across people, even when model-free influence was not especially obvious in behavior.
So habit-like valuation may be a common background hum, while planning-related valuation varies more sharply. The autopilot seems widely installed. The map app, apparently, has settings.
When the Map Fails to Load
The study also points to a possible reason some people show little model-based control: they may struggle to predict where actions lead. Among participants with weak model-based behavior and weak model-based vmPFC signals, the researchers found weaker state prediction error signals in the dorsolateral prefrontal cortex and intraparietal sulcus.
A prediction error is the brain's "wait, that was not what I ordered" signal. For rewards, it updates value. For states, it updates the map: I chose A, but I landed in B, so maybe the route is not what I thought.
If those state-prediction signals are faint, model-based planning has a bad morning. You cannot use a map well if the roads keep appearing through fog.
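One common way modelers write down a state prediction error is as the surprise at where you landed: one minus the probability your map gave that landing spot. The sketch below uses that form to update a tiny two-state map. It is a hedged illustration, not the study's model: the update rule, the starting beliefs, the observation sequence, and the two learning rates are all assumptions. The smaller learning rate is only a loose stand-in for a faint signal.

```python
# Illustrative sketch only. The rule, names, and numbers are assumptions, not the study's model.

def state_prediction_error(transition_probs, observed_state):
    """Surprise at the landing spot: 1 minus the probability the map gave it."""
    return 1.0 - transition_probs[observed_state]


def update_transition_map(transition_probs, observed_state, eta):
    """Shift belief toward the observed state; the beliefs still sum to 1."""
    spe = state_prediction_error(transition_probs, observed_state)
    updated = {}
    for state, prob in transition_probs.items():
        if state == observed_state:
            updated[state] = prob + eta * spe
        else:
            updated[state] = prob * (1.0 - eta)
    return updated


def learn_map(observations, eta):
    """Start from a 50/50 guess and update after each observed landing."""
    beliefs = {"X": 0.5, "Y": 0.5}
    for observed_state in observations:
        beliefs = update_transition_map(beliefs, observed_state, eta)
    return beliefs


# Hypothetical run: choosing A lands in X eight times out of ten.
OBSERVATIONS = ["X", "X", "Y", "X", "X", "X", "Y", "X", "X", "X"]

if __name__ == "__main__":
    print("larger learning rate: ", learn_map(OBSERVATIONS, eta=0.3))
    print("smaller learning rate:", learn_map(OBSERVATIONS, eta=0.02))
```

With the larger learning rate, the belief that A leads to X climbs well above chance within ten trials. With the smaller one, it barely moves off 50/50, which is the foggy-road situation in miniature.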
Why This Is More Than Lab Gymnastics
Many real-life problems involve getting stuck between habit and flexible control. Compulsive behavior, addiction, anxiety-driven avoidance, impulsivity, and some features of OCD all involve decisions that feel sensible in the moment but turn rigid over time. A 2023 Biological Psychiatry review argued that computational models may help target compulsive behavior by identifying which hidden decision process has gone sideways.
Recent research keeps pushing in that direction. A 2025 Trends in Cognitive Sciences review links impulsivity driven by negative affect to overgeneralized model-based predictions, like treating yogurt as dangerous because milk once betrayed you. A 2025 Nature Communications study points to thalamic regulation of learning strategies across prefrontal-striatal networks. A 2026 Neuron paper shows that subgoals also carry neural value signals during model-based behavior.
Together, these papers make a larger point: "habit versus planning" is not a cartoon boxing match in the skull. It is a negotiation among brain systems, context, memory, and effort. Some days the planner has coffee. Some days habit grabs the wheel.
The Cautious Forecast
This study does not mean fMRI can diagnose your decision-making personality, and it definitely does not mean scientists can inspect your vmPFC and know why you bought that ridiculous kitchen gadget at 1 a.m. fMRI measures blood-oxygen signals, not tiny glowing thoughts.
But if these findings replicate and expand, they could help researchers understand why some people revise their behavior while others get pulled back toward old strategies. That could shape work on cognitive training, neuromodulation, and computational psychiatry. The goal would not be to delete habits. Without them, brushing your teeth would require a committee meeting.
The better goal is balance: knowing when to trust the old route, and when to pause, lift your head from the cereal aisle, and rebuild the map.
References
- Ding W, Cockburn J, Simon JP, et al. Model-based and model-free valuation signals in the human brain vary markedly in relation to individual differences in behavioral control. Cell Reports. 2026;45(6):117454. https://doi.org/10.1016/j.celrep.2026.117454
- Okan A, Hallquist MN. Negative affect-driven impulsivity as hierarchical model-based overgeneralization. Trends in Cognitive Sciences. 2025;29(5):407-420. https://doi.org/10.1016/j.tics.2025.01.002 PMCID: PMC12058388
- Grossman CD, Man V, O'Doherty JP. The representation and valuation of subgoals in the human brain during model-based hierarchical behavior. Neuron. 2026;114(7):1306-1320.e7. https://doi.org/10.1016/j.neuron.2025.12.023
- Wang BA, Wang MB, Lam NH, et al. Thalamic regulation of reinforcement learning strategies across prefrontal-striatal networks. Nature Communications. 2025;16:9095. https://doi.org/10.1038/s41467-025-63995-x PMCID: PMC12531340
- Kahnt T. Computationally informed interventions for targeting compulsive behaviors. Biological Psychiatry. 2023;93(8):729-738. https://doi.org/10.1016/j.biopsych.2022.08.028 PMCID: PMC9989040