Try this: keep your eyes on one spot and still notice the thing creeping in from the side. You do it all the time. Driving, reading a room, pretending not to care who just walked in. Your brain runs this little stunt constantly, and for years the standard story was that some dedicated attention machinery must be pulling levers behind the wall. Reasonable theory. Very tidy. The brain, naturally, chose chaos instead.
The paper by Sudhanshu Srivastava, William Yang Wang, and Miguel P. Eckstein asks a sneaky question: what if covert attention - paying attention without moving your eyes - does not need a special built-in attention module at all? In their 2025 PNAS study, the authors trained ordinary feedforward convolutional neural networks on a visual cueing task and then cracked the models open unit by unit, like suspicious plumbing under a sink, to see what was going on inside (Srivastava et al., 2025).
The headline is simple and a little rude to decades of neat theories: these CNNs showed classic behavioral signatures of covert attention even though nobody installed an explicit attention mechanism. No spotlight. No magical control knob. No tiny foreman in a hard hat yelling "boost the left side."
Instead, attention-like behavior appeared to emerge as a side effect of training the network to do the task well.
Not a spotlight - more like a patch job that works
Covert attention usually gets explained with metaphors like spotlights, zoom lenses, or resource allocation. Those metaphors are useful, but also suspiciously clean for something built out of wet tissue and electrical gossip. This new paper suggests that some of the benefits of covert attention may come from distributed interactions across many units, not from a single dedicated "attention circuit" barking orders.
Inside the networks, early layers contained units tuned separately to the cue or the target. Later layers developed joint tuning, where the cue increasingly shaped how the target got represented. That part already smells a lot like biology. Then the authors identified four mechanisms by which the cue improved target sensitivity. One resembled a Bayesian ideal observer: it weights the evidence at each location by how likely the cue says the target is to be there. The other three were messier and more interesting: opponency across locations, mixed summation-opponency, and interactions with the humble ReLU threshold. Yes, even the glorified on-off switch got a role. Networks, like brains, apparently love hiring weird subcontractors.
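For the curious, here is roughly what "weighting information by the cue" means in the textbook Bayesian ideal observer. This is a toy sketch, not the paper's networks or code: it assumes a two-location detection task, unit-variance Gaussian noise, a made-up signal strength `d_prime`, and a cue that is right 80% of the time. Every name and number in it is an illustrative assumption.

```python
import numpy as np


def weighted_likelihood_ratio(x_cued, x_uncued, d_prime, cue_validity):
    """Evidence for 'target present', with each location weighted by the cue.

    Assumes unit-variance Gaussian noise, so the likelihood ratio at one
    location is exp(d' * x - d'^2 / 2).
    """
    lr_cued = np.exp(d_prime * x_cued - 0.5 * d_prime**2)
    lr_uncued = np.exp(d_prime * x_uncued - 0.5 * d_prime**2)
    # The cue acts only as a prior over locations; there is no gain knob here.
    return cue_validity * lr_cued + (1.0 - cue_validity) * lr_uncued


def simulate_hit_rates(d_prime=1.5, cue_validity=0.8, n_trials=100_000, seed=0):
    """Monte Carlo hit and false-alarm rates for the toy observer."""
    rng = np.random.default_rng(seed)
    # Column 0 is the cued location, column 1 the uncued one.
    noise = rng.standard_normal((n_trials, 2))
    cued, uncued = noise[:, 0], noise[:, 1]

    # Valid trial: the target appears where the cue pointed.
    valid = weighted_likelihood_ratio(cued + d_prime, uncued, d_prime, cue_validity)
    # Invalid trial: the target appears at the other location.
    invalid = weighted_likelihood_ratio(cued, uncued + d_prime, d_prime, cue_validity)
    # Target-absent trial: noise at both locations.
    absent = weighted_likelihood_ratio(cued, uncued, d_prime, cue_validity)

    # Say "present" when the weighted evidence exceeds 1, which assumes the
    # target is present on half of all trials.
    return {
        "hit_valid": float(np.mean(valid > 1.0)),
        "hit_invalid": float(np.mean(invalid > 1.0)),
        "false_alarm": float(np.mean(absent > 1.0)),
    }


if __name__ == "__main__":
    print(simulate_hit_rates())
```

Run it and the observer catches the target more often on valid trials than on invalid ones, even though the only thing the cue does is change a prior. Set `cue_validity=0.5` and the gap should all but vanish, which is the classic cueing effect with no spotlight in sight.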
The wild bit is that the team then reanalyzed mouse superior colliculus recordings and found neuron types the CNN had predicted but earlier analyses had not emphasized, including cue-inhibitory and location-opponent cells. That is the part where this stops being "AI does a neat trick" and starts becoming "AI may have handed neuroscience a flashlight and a crowbar."
Why this matters outside a conference room
If these results hold up, they push against the idea that covert attention always requires a bespoke top-down controller. That matters because attention is one of those brain functions everybody talks about like it is a single pipe, when it is probably more like a leaking basement renovation from 1938. Different species show attention-like behavior. Different tasks recruit different circuitry. The new work fits with a growing view that attention may rely on both shared and distinct mechanisms depending on the system and job at hand (Xia et al., 2024; DeYoe et al., 2024).
It also connects nicely to earlier work from the same group showing that plain feedforward CNNs can reproduce major covert attention effects in classic cueing and search tasks without any explicit attention block bolted on (Srivastava et al., 2024). In other words, this paper is the sequel where they stop admiring the magic trick and finally frisk the magician.
There is a practical angle too. If you want machine vision systems that handle clutter, uncertainty, or peripheral information more like animals do, this kind of result says you might not always need increasingly baroque attention gadgets. Sometimes task optimization plus the right architecture already grows useful selection strategies on its own. That could matter for robotics, autonomous driving, and any system that has to make decent decisions in messy scenes where everything is competing to be the main character.
The catch, because there is always a catch
Nobody should now sprint into the street yelling that CNNs have solved attention. These are still simplified models, trained on narrow tasks, and biological brains have feedback loops, neuromodulators, body states, and enough recurrent circuitry to make a clean engineer cry into their coffee. Other recent work on covert attention points to rich dynamics across cortical hierarchies and distinctions between covert and presaccadic attention that simple feedforward models do not capture well (Bartsch et al., 2023; Li et al., 2021).
Still, the paper lands a sharp point: maybe attention-like behavior is not always a deluxe add-on. Maybe it can emerge when a system gets very, very good at using information that predicts where something important is likely to appear. That idea is less glamorous than a mystical spotlight in the skull, but honestly it sounds more like real biology - improvised, layered, and held together with evolutionary duct tape.
Which is comforting, in a way. Your brain may not be a pristine command center. It may just be a deeply overworked construction site that learned, against all odds, how to notice trouble before you look straight at it.
References
Srivastava S, Wang WY, Eckstein MP. Emergent neuronal mechanisms mediating covert attention in convolutional neural networks. Proc Natl Acad Sci U S A. 2025;122(46):e2411909122. DOI: https://doi.org/10.1073/pnas.2411909122
Srivastava S, Wang WY, Eckstein MP. Emergent human-like covert attention in feedforward convolutional neural networks. Current Biology. 2024;34(3):579-593.e12. DOI: https://doi.org/10.1016/j.cub.2023.12.058
Xia R, Chen X, Engel TA, Moore T. Common and distinct neural mechanisms of attention. Trends Cogn Sci. 2024;28(6):554-567. DOI: https://doi.org/10.1016/j.tics.2024.01.005. PMCID: https://pmc.ncbi.nlm.nih.gov/articles/PMC11153008/
DeYoe EA, Huddleston W, Greenberg AS. Are neuronal mechanisms of attention universal across human sensory and motor brain maps? Psychon Bull Rev. 2024;31(6):2371-2389. DOI: https://doi.org/10.3758/s13423-024-02495-3. PMCID: https://pmc.ncbi.nlm.nih.gov/articles/PMC11680640/
Bartsch MV, Merkel C, Strumpf H, Schoenfeld MA, Tsotsos JK, Hopf JM. A cortical zoom-in operation underlies covert shifts of visual spatial attention. Sci Adv. 2023;9(10):eade7996. DOI: https://doi.org/10.1126/sciadv.ade7996. PMCID: https://pmc.ncbi.nlm.nih.gov/articles/PMC9995033/
Li HH, Hanning NM, Carrasco M. To look or not to look: dissociating presaccadic and covert spatial attention. Trends Neurosci. 2021;44(8):653-664. DOI: https://doi.org/10.1016/j.tins.2021.05.002. PMCID: https://pmc.ncbi.nlm.nih.gov/articles/PMC8552810/