Scientists have introduced a new AI approach in which habitual and goal-directed behaviors learn to support each other. In computer simulations that imitated exploring a maze, the method quickly adjusted to changing environments and reproduced the behavior of humans and animals that have become familiar with a specific environment. The study not only sets the stage for AI systems that adapt rapidly and reliably, but also offers insight into decision-making in neuroscience and psychology.
Both living creatures and AI systems could benefit from this innovative method.
Machines powered by algorithms and AI need to be able to react quickly and adapt to different situations, according to researchers from the Okinawa Institute of Science and Technology (OIST) and Microsoft Research Asia in Shanghai. In psychology and neuroscience, behavior is often divided into two categories: habitual (fast and simple, but rigid) and goal-directed (flexible, but complex and slower). Daniel Kahneman, winner of the Nobel Prize in Economic Sciences, refers to these as System 1 and System 2. However, there is still debate over whether the two are independent and conflicting or mutually supportive components.
The researchers have proposed a new AI method in which systems of habitual and goal-directed behaviors learn to support each other. In computer simulations that mimic the exploration of a maze, the method quickly adapts to changing environments and also reproduces the behavior of humans and animals after they have been accustomed to a certain environment for a long time.
The study, published in Nature Communications, not only paves the way for systems that adapt quickly and reliably in the growing field of AI, but also provides clues to how we make decisions, a question of interest in neuroscience and psychology.
The researchers developed a model that combines the habitual and goal-directed systems for learning behavior in AI agents that use reinforcement learning, a learning method that relies on rewards and punishments. The model is based on the theory of “active inference,” which has gained a lot of attention recently. They created a computer simulation that imitates a task in which mice navigate a maze using visual cues and are rewarded with food when they reach the goal.
They studied how the two systems adjust and combine while interacting with the environment, and found that together they can quickly achieve adaptive behavior. The AI agent used reinforcement learning to gather data and improve its own performance.
What our brains prefer
Once the workday is over, we often go home on autopilot, relying on habitual behavior. However, if you’ve recently moved and aren’t paying attention, you might unintentionally drive to your old house out of habit. When you realize this, you switch to goal-directed behavior and reroute to your new home. Traditionally, these two types of behavior were thought to operate independently, with habitual behavior being fast but inflexible, and goal-directed behavior being flexible but slow.
“The shift from goal-directed to habitual behavior in learning is a well-known discovery in psychology. Our model and simulations provide an explanation for this phenomenon: the brain favors behaviors that offer greater certainty. As learning advances, habitual behavior becomes less unpredictable, increasing certainty. As a result, the brain tends to rely on habitual behavior after extensive training,” said Dr. Dongqi Han, a former PhD student at OIST’s Cognitive Neurorobotics Research Unit and the primary author of the paper.
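The idea that control shifts toward the more certain behavior can be illustrated with a toy rule. The sketch below is not the model from the paper: it simply measures how spread out a habitual policy's action probabilities are (Shannon entropy) and hands control to the habit once that spread falls below a cutoff. The function names, the example probabilities, and the `threshold` value are all hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def choose_controller(habit_probs, threshold=0.5):
    """Return 'habitual' when the habit policy is certain enough.

    `threshold` is a hypothetical cutoff chosen for illustration,
    not a value taken from the study.
    """
    if entropy(habit_probs) < threshold:
        return "habitual"
    return "goal-directed"

# Early in learning: the habit has no preferred action (high entropy).
early = [0.25, 0.25, 0.25, 0.25]
# After long training: one action dominates (low entropy).
late = [0.94, 0.02, 0.02, 0.02]

print(choose_controller(early))  # goal-directed
print(choose_controller(late))   # habitual
```

With four equally likely actions the entropy is ln 4, about 1.39 nats, so the slower goal-directed system stays in charge. Once one action dominates, the entropy drops to roughly 0.29 nats and the habit takes over.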
When given a new goal that it has not been trained for, the AI relies on an internal process to plan its actions. The model plans by combining habitual behaviors, which makes planning more efficient because it does not need to consider every possible action. This contrasts with traditional AI approaches, which require all possible goals to be explicitly included in training. In this model, each desired goal can be achieved without explicit training, by flexibly combining learned knowledge.
“It’s important to achieve a kind of balance or trade-off between flexible and habitual behavior,” said Prof. Jun Tani, head of the Cognitive Neurorobotics Research Unit. There may be many possible ways to achieve a goal, but considering all possible actions is costly, so goal-directed behavior is constrained by habitual behavior to narrow down the options.
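One way to picture how habits can narrow down a planner's options is the toy sketch below. It is an illustration under stated assumptions, not the authors' algorithm: the planner scores only the `k` actions the habitual policy rates most likely, using a made-up table of predicted distances to the goal as a stand-in for an internal model. All names and numbers are hypothetical.

```python
def plan_with_habit_prior(habit_probs, predicted_distance, k=2):
    """Choose an action by evaluating only the k most habitual actions.

    habit_probs: dict of action -> probability under the habitual policy.
    predicted_distance: dict of action -> predicted distance to the goal
        after taking that action (hypothetical internal-model output).
    Returns the chosen action and the list of actions that were evaluated.
    """
    # Habit narrows the search: keep only the k most probable actions.
    candidates = sorted(habit_probs, key=habit_probs.get, reverse=True)[:k]
    # Goal-directed step: among those, pick the one predicted to end closest.
    best = min(candidates, key=lambda a: predicted_distance[a])
    return best, candidates

habit_probs = {"up": 0.5, "right": 0.3, "down": 0.15, "left": 0.05}
predicted_distance = {"up": 4, "right": 2, "down": 5, "left": 6}

action, considered = plan_with_habit_prior(habit_probs, predicted_distance)
print(action, considered)  # right ['up', 'right']
```

Only two of the four actions are evaluated, which is where the savings come from. The trade-off is that a good action the habit rates as unlikely is never considered, which is one reading of the balance between flexible and habitual behavior described above.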
Improving AI Technology
Dr. Han became interested in neuroscience and the disparity between artificial and human intelligence while working on AI algorithms. “I began contemplating how AI can exhibit more efficient and adaptable behavior, similar to humans. I wanted to comprehend the underlying mathematical principles and how we can utilize them to enhance AI. That was the driving force behind my PhD research.”
Understanding the contrast between habitual and goal-directed behavior is particularly important in neuroscience, as it can provide insight into neurological disorders such as ADHD, OCD, and Parkinson’s disease. “We are investigating the computational principles that govern the collaboration of multiple brain systems. We have also observed that neuromodulators like dopamine and serotonin are essential in this process,” said Prof. Kenji Doya, head of the Neural Computation Unit. “AI systems inspired by the brain and capable of solving real-world problems can be valuable tools in understanding brain activity.”
Dr. Han is interested in creating AI that behaves more like humans in order to achieve complex goals, with abilities similar to ours when completing everyday tasks. He believes that our brains have two learning mechanisms that work together, and he wants to better understand how they cooperate in order to reach that goal.