By tracking feedback during tasks, the anterior cingulate cortex notices when a new step has become necessary and signals the motor cortex to adjust.
Life is full of processes to learn and then relearn when they become more elaborate. One day you log in to an app with just a password, the next day you also need a code texted to you. One day you can just pop your favorite microwavable lunch into the oven for six straight minutes, but then the packaging changes and you have to cook it for three minutes, stir, and then heat it for three more.
Our brains need a way to keep up. A new study by neuroscientists at The Picower Institute for Learning and Memory at MIT reveals some of the circuitry that helps a mammalian brain learn to add steps.
In Nature Communications, the scientists report that when they changed the rules of a task, requiring rats to adjust from performing just one step to performing two, a pair of regions on the brain’s surface, or cortex, collaborated to update that understanding and change the rats’ behavior to fit the new regime. The anterior cingulate cortex (ACC) appeared to recognize when the rats weren’t doing enough and updated cells in the secondary motor cortex (M2) to adjust the task behavior.
“I started this project about seven or eight years ago when I wanted to study decision-making,” says Daigo Takeuchi, a researcher at the University of Tokyo who led the work as a postdoc at the RIKEN-MIT Laboratory for Neural Circuit Genetics at the Picower Institute, directed by senior author and Picower professor Susumu Tonegawa. “New studies were finding a role for M2. I wanted to study what upstream circuits were influencing this.”
Tripping up the second step
Takeuchi and Tonegawa traced neural circuit connections that led into M2 and found that many originated in the ACC. They began to see the ACC’s role in guiding M2’s sequential decisions when they introduced a genetic manipulation into ACC cells that allowed the researchers to suppress the cells’ activity. This “chemogenetic” disabling of the ACC had a particular effect. When the task rules changed so that rats had to poke their snouts into a sequence of two holes, rather than just one, to earn a small reward, the rodents with silenced ACCs took much longer than rats with normal ACC activity to realize that the second poke had become necessary. Rats had no trouble, however, going from two steps back to just one, regardless of whether their ACC was silenced.
When the scientists chemogenetically silenced the ACC cells’ terminals in M2, they got the same results as silencing the ACC overall. They also silenced other areas of the cortex, but doing that didn’t affect the ability of the rats to notice and adjust to the rule switch. Together these manipulations confirmed that it was specifically the ACC’s connections with M2 that help the rats notice and adjust to the one-step-to-two-step change.
But what effect does the ACC have in M2? Takeuchi and his co-authors measured the electrical activity of cells in M2 as the rats played their nose-poking, rule-changing game. They found that many cells were particularly activated by one task rule or the other (i.e., one-step or two-step). Silencing the ACC, though, suppressed this rule selectivity.
Within M2, Takeuchi and the team also noticed populations of neurons that responded preferentially either to positive outcomes (reward for doing the task right) or to negative outcomes (not getting a reward for doing the task wrong). Silencing the ACC actually increased the activity of the negative-outcome encoding neurons during negative feedback, particularly for the first 10 to 20 rounds after the rules changed from one step to two. This correlated strongly with the timing, or “epoch,” of the rats’ worst performance.
“It seems likely the epoch-specific disruption of animals’ second-choice performance is associated with the excessive enhancement of the activity of negative outcome activated neurons caused by the ACC silencing,” they wrote in the study.
The team further confirmed that the feedback, or outcomes, stage mattered by using a different technique to silence the ACC. By engineering ACC neurons to be suppressed by flashes of light (a technique called “optogenetics”) they could precisely control when the ACC went offline. They found that if they did so after the rats made an incorrect choice when the rules switched from one poke to two, they could cause the rats to continue to err. Optogenetic silencing of the ACC after rats made a correct choice didn’t undermine their subsequent behavior.
“These results indicate that ACC neurons process error feedback information following an erroneous second response and use this information to adjust the animal’s sequential choice responses in subsequent trials,” they wrote.
Too high a threshold
The evidence painted a clear picture: When the rats needed to notice that an extra step was now required, the ACC’s job was to learn from negative feedback and signal M2 to take the second step. If the ACC wasn’t available when feedback was provided, then M2 cells that emphasize negative outcomes apparently would become especially active and the rats would fail to do the required second step for a time before finally catching on.
Why would less ACC activity somehow increase the negative outcome encoding cells’ activity in M2? Takeuchi hypothesizes that what the ACC is actually doing is stimulating inhibitory cells in M2 that normally modulate the activity of those cells. With ACC activity reduced, the negative outcome encoding M2 cells experience less inhibition. The behavioral result, he theorizes, is that the rats therefore require more evidence than they should of the rule change. The mechanism isn’t completely clear, Takeuchi acknowledged, but the rats apparently need more time to experience outcome feedback from making the right decision of taking a second step before they’ll become convinced that they are on the right track doing so.
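One way to picture the threshold idea is as a toy evidence counter. The sketch below is purely illustrative and is not the study's model: the function name `trials_to_adapt`, the fixed evidence increment, and both threshold values are assumptions, chosen only to show how a higher evidence threshold would delay the point at which an animal commits to the second poke.

```python
def trials_to_adapt(threshold, evidence_per_trial=1.0):
    """Count feedback trials before a toy agent commits to the second poke.

    Hypothetical illustration only: every feedback trial adds a fixed
    amount of evidence that the rule has changed, and the agent commits
    once the accumulated evidence reaches `threshold`.
    """
    if threshold <= 0 or evidence_per_trial <= 0:
        raise ValueError("threshold and evidence_per_trial must be positive")
    evidence = 0.0
    trials = 0
    while evidence < threshold:
        trials += 1
        evidence += evidence_per_trial
    return trials


# Assumed values for illustration, not measurements from the study.
NORMAL_THRESHOLD = 3.0
SILENCED_ACC_THRESHOLD = 12.0

normal_trials = trials_to_adapt(NORMAL_THRESHOLD)
silenced_trials = trials_to_adapt(SILENCED_ACC_THRESHOLD)
print(f"normal ACC: {normal_trials} feedback trials before committing")
print(f"silenced ACC: {silenced_trials} feedback trials before committing")
```

In this toy version the only difference between the two conditions is the threshold, so the silenced-ACC agent simply needs more rounds of feedback before it changes its behavior. Whether the real circuit works this way is exactly the open question the hypothesis raises.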
Takeuchi says that while the results demonstrate the circuit necessary for adapting to a rule change requiring more steps in a process, they also raise some interesting new questions. Is there another circuit for noticing when a multi-step process has become a one-step process? If so, is that circuit integrated with the one discussed in this study? And if the threshold model is the right one, how exactly is it working?
The implications matter not only for understanding the neural basis of natural sequential decisions, but possibly also for AI applications ranging from game playing to industrial work, each of which can involve tasks with multiple steps.
Written by David Orenstein