Here is the hard part about writing this one. The study is built on conditional value at risk, Bayes-adaptive Markov decision processes, and hazard functions, which is roughly the vocabulary you would use to scare someone away from a dinner party. None of those phrases want to be friendly. But underneath the machinery is a question so plain it almost hurts: why is one creature cautious and another one bold when they are looking at the exact same strange thing? That question is worth a little jargon. So let's keep the math and lose the dread.
The Setup: A Mouse Walks Into a Room
A mouse is dropped into an open arena. In the middle sits a Lego brick, or something equally novel and equally suspicious. The mouse has a decision to make, and it is the oldest decision in biology.
Approach the new thing. It might be food. It might be a friend. It might be a snake.
This is the explore-exploit tradeoff, except with the volume turned up, because novelty is not just uncertain but potentially lethal. A wealth of behavioral work has documented how rodents thread this needle, often with a tactic called intermittent approach: dart in, retreat, wait, dart in again (Bhattacharya et al., 2020). It is the animal version of texting someone back three days later to seem chill.
What nobody had cleanly explained is the variation. Some mice commit. Some mice never do. Tingke Shen and Peter Dayan, writing in eLife, decided to model exactly that.
Three Knobs in the Mouse's Head
The authors built a computational mouse and gave it three dials to turn.
The first is a hazard function, which is the mouse's running estimate of how likely the object is to hurt it. Crucially, this estimate updates. Every second the Lego brick fails to eat the mouse is evidence, slowly logged, that the Lego brick is probably fine.
The second is an intrinsic reward, the itch to explore for its own sake. Curiosity, rendered as a number. The brain pays you a small bonus just for looking.
The third is the interesting one: conditional value at risk, or CVaR. Borrowed from finance, where it measures how much money you could lose on your worst days, CVaR here measures how heavily an animal weights its worst-case outcomes. A risk-neutral mouse plans around the average. A risk-averse mouse plans around the disaster (Wang et al., 2023). Same arena, same brick, completely different inner accountant.
The word "tail" in the paper's title is doing quiet work. It refers to the tail of the probability distribution, the slim unlikely edge where the bad thing lives. Some animals can't stop staring at the tail.
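To make the "inner accountant" concrete, here is a minimal sketch of CVaR computed from a list of outcomes. It is an illustration, not the paper's method: the study plans with CVaR inside a Bayes-adaptive decision process, whereas this only averages the worst slice of a fixed sample. The function name `cvar`, the payoff numbers, and the choice of `alpha` are all assumptions made up for the example.

```python
import math

def cvar(outcomes, alpha):
    """Average of the worst alpha-fraction of outcomes (the lower tail).

    alpha = 1.0 is the plain mean; smaller alpha stares harder at the tail.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(outcomes)  # worst outcomes first
    # Number of outcomes in the tail; the small epsilon guards against float round-up.
    k = max(1, math.ceil(alpha * len(ordered) - 1e-9))
    tail = ordered[:k]
    return sum(tail) / len(tail)

# Hypothetical payoffs for 100 approaches: 95 small rewards, 5 disasters.
payoffs = [1.0] * 95 + [-20.0] * 5

risk_neutral = cvar(payoffs, alpha=1.0)   # plans around the average
risk_averse = cvar(payoffs, alpha=0.05)   # plans around the worst 5%
print(f"risk-neutral value: {risk_neutral:.2f}")
print(f"risk-averse value:  {risk_averse:.2f}")
```

Same payoffs, two very different verdicts: the risk-neutral reading sees a nearly break-even bet, while the risk-averse reading sees only the disasters.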
Two Kinds of Mouse
Fit the model to 26 real, freely exploring animals, and the population splits into two recognizable characters.
One group starts cautious, gathers a little evidence that nothing terrible is happening, and then upgrades to confident, full approach. They want the reward and they go get it. In the model's terms, they are closer to risk-neutral, and their hazard prior is flexible. They can be talked out of their fear by data.
The other group approaches carefully and stays that way. Forever. They show what the authors call self-censoring, a cautious tail-behind posture that never relaxes into commitment. Their hazard prior is high and stubborn. No amount of uneventful sniffing convinces them the brick is safe. They have decided, and they are not taking questions.
The model reproduced not just these broad strokes but the fine-grained idiosyncrasies, the frequency and duration of each nervous little bout. The point is that you do not need a different theory for the brave mouse and the nervous mouse. You need the same theory with the dials set differently.
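The difference between a flexible and a stubborn hazard prior can be sketched with a toy Beta-Bernoulli update. This is a stand-in, assumed purely for illustration, and not the hazard model used in the paper: each mouse holds pseudo-counts of "harmful" and "safe" encounters, and every uneventful visit adds one to the safe count. The function names and the prior values are hypothetical.

```python
def posterior_hazard(prior_harm, prior_safe, safe_visits):
    """Posterior mean probability of harm after some uneventful visits.

    prior_harm and prior_safe are Beta pseudo-counts; larger totals mean a
    prior that takes more evidence to move.
    """
    return prior_harm / (prior_harm + prior_safe + safe_visits)

def hazard_trajectory(prior_harm, prior_safe, n_visits):
    """Hazard estimate after 0, 1, ..., n_visits uneventful visits."""
    return [posterior_hazard(prior_harm, prior_safe, n) for n in range(n_visits + 1)]

# Hypothetical dial settings, not fitted values from the study.
flexible = hazard_trajectory(prior_harm=1.0, prior_safe=1.0, n_visits=30)
stubborn = hazard_trajectory(prior_harm=80.0, prior_safe=20.0, n_visits=30)

for n in (0, 10, 30):
    print(f"after {n:2d} safe visits: flexible={flexible[n]:.3f}, stubborn={stubborn[n]:.3f}")
```

Both mice run the same update rule. The flexible one starts unsure and is quickly talked out of its fear, while the stubborn one has so much prior weight on danger that thirty quiet visits barely register.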
Why This Is More Than Mouse Trivia
A mouse that cannot update its threat estimate, that plans its whole life around the worst case and never lets evidence move it, looks uncomfortably like a model of an anxiety disorder. The clinical version of "the brick is probably fine but I will be giving it a wide berth indefinitely" has a name, and a lot of people live inside it.
What this work offers is a way to take a fuzzy clinical impression and break it into separate, measurable parts (Shen and Dayan, 2025). Is the problem an inflated sense of danger, a refusal to update that sense, or a basic over-weighting of catastrophe? Those are three different knobs, and in principle three different things to treat. A diagnosis stops being one heavy word and becomes a small set of numbers you could actually aim at.
The mice, for their part, have no idea they were running Bayesian inference the whole time. They just wanted to know about the brick. Some of them still do.
References
- Shen T, Dayan P. (2025). Individual differences in tail risk sensitive exploration using Bayes-adaptive Markov decision processes. eLife. DOI: 10.7554/eLife.100366. PMID: 41324354.
- Bhattacharya A, et al. (2020). To Approach or Avoid: An Introductory Overview of the Study of Anxiety Using Rodent Assays. Frontiers in Behavioral Neuroscience. PMCID: PMC7479238.
- Wang K, et al. (2023). Near-Minimax-Optimal Risk-Sensitive Reinforcement Learning with CVaR. Proceedings of the 40th International Conference on Machine Learning. arXiv: 2302.03201.
- Ying C, et al. (2022). Towards Safe Reinforcement Learning via Constraining Conditional Value-at-Risk. arXiv: 2206.04436.