Revolutionary Training Method Enhances AI Performance in Unpredictable Environments

AI agents trained in simulated environments that differ from their eventual deployment settings can sometimes perform better than agents trained directly in those deployment conditions, according to recent research.

For instance, a home robot trained to handle domestic chores in a factory setting might struggle to scrub the sink or take out the trash when deployed in a real household kitchen, since it is operating in a new and unfamiliar environment.

To mitigate such issues, engineers typically strive to align the simulated training settings as closely as possible to the real-world locations where the agents will function.

However, a team of researchers from MIT and other institutions recently discovered that, contrary to common belief, training in a wholly different environment can sometimes result in a more capable artificial intelligence agent.

Their findings suggest that, in some cases, training an AI agent in a less noisy, more predictable environment enables it to outperform a competing agent trained in the same noisy environment in which both are later tested.

This unexpected occurrence has been labeled the indoor training effect by the researchers.

“If we learn tennis in an indoor setting devoid of disruptions, we might master various strokes more easily. When we transition to a noisier setting, like an outdoor court with wind, we could perform better than if we had trained in that challenging environment from the start,” explains Serena Bono, a research assistant at the MIT Media Lab and lead author of a paper discussing this effect.

The team investigated this phenomenon by having AI agents play modified Atari games that included an element of randomness. They were surprised to notice that the indoor training effect was consistently observable across different Atari games and variations.

The researchers aim for these findings to spark further exploration into developing improved training strategies for AI agents.

“This presents an entirely new perspective. Instead of striving to conform the training and testing environments, there may be potential to create simulation environments where AI agents learn even more effectively,” adds co-author Spandan Madan, a graduate student at Harvard University.

Bono and Madan collaborated with Ishaan Grover, an MIT graduate student; Mao Yasueda from Yale University; Cynthia Breazeal, a professor of media arts and sciences and head of the Personal Robotics Group at MIT; Hanspeter Pfister, the An Wang Professor of Computer Science at Harvard; and Gabriel Kreiman, a professor at Harvard Medical School. Their research will be presented at the Association for the Advancement of Artificial Intelligence Conference.

Challenges in Training

The researchers aimed to understand why reinforcement learning agents often show poor performance when assessed in environments that differ from their training scenarios.

Reinforcement learning is a trial-and-error method in which an agent explores a training environment and learns to take actions that maximize its reward.
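As a rough illustration of that loop, here is a minimal tabular Q-learning sketch written against a Gymnasium-style environment interface. The API, integer state encoding, and hyperparameters are assumptions for illustration, not the agents or setup used in the study.

```python
import numpy as np

# Minimal tabular Q-learning sketch of the trial-and-error loop described
# above. The reset/step interface mirrors Gymnasium's API; states are
# assumed to be small integers, and the hyperparameters are illustrative.
def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    q = np.zeros((n_states, n_actions))  # action-value estimates
    for _ in range(episodes):
        state, _ = env.reset()
        done = False
        while not done:
            # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
            if np.random.rand() < epsilon:
                action = np.random.randint(n_actions)
            else:
                action = int(q[state].argmax())
            next_state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            # Nudge the estimate toward reward plus discounted future value.
            q[state, action] += alpha * (
                reward + gamma * q[next_state].max() - q[state, action]
            )
            state = next_state
    return q
```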

The team introduced a method that specifically added a certain degree of variability, or noise, to an element of the reinforcement learning framework known as the transition function. This function represents the likelihood that an agent transitions from one state to another based on its chosen action.

For example, while playing Pac-Man, the transition function might dictate the probability of the game’s ghosts moving in various directions. In standard reinforcement learning, the AI would be trained and evaluated using the same transition function.
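One simple way to realize this kind of perturbation is sketched below: with some probability, a ghost ignores its usual behavior and steps in a uniformly random direction. The name `scripted_move` is a hypothetical stand-in for the game's deterministic ghost logic, and the paper's actual noise model may differ.

```python
import random

DIRECTIONS = ["up", "down", "left", "right"]

def noisy_ghost_transition(ghost_state, scripted_move, noise=0.1):
    """With probability `noise`, override the ghost's scripted move
    with a uniformly random direction; otherwise behave as usual."""
    if random.random() < noise:
        return random.choice(DIRECTIONS)  # perturbed transition
    return scripted_move(ghost_state)     # unperturbed game behavior
```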

When the researchers injected noise into the transition function while following this traditional method, as anticipated, it negatively impacted the agent’s performance in Pac-Man.

However, they found that if the agent was trained using a noise-free version of Pac-Man and then tested in a noisy environment, it outperformed an agent that had been trained in the noisy setup.
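The comparison can be sketched end to end in a toy world. The corridor environment, hyperparameters, and helper names below are illustrative assumptions; the snippet demonstrates the train-clean/test-noisy protocol rather than reproducing the paper's Atari results, and in a world this trivial both agents will usually learn the same policy.

```python
import numpy as np

# Toy run of the protocol: train one agent with a noise-free transition
# function and another with the noisy one, then evaluate both under the
# same noisy transitions.
rng = np.random.default_rng(0)
LENGTH = 8  # corridor cells; reaching the rightmost cell pays reward 1

def step(pos, action, noise):
    if rng.random() < noise:            # noise injected into the transition
        action = int(rng.integers(2))
    pos = min(max(pos + (1 if action == 1 else -1), 0), LENGTH - 1)
    return pos, float(pos == LENGTH - 1)

def train(noise, episodes=2000, alpha=0.1, gamma=0.95, eps=0.2):
    q = np.zeros((LENGTH, 2))           # states x {left, right}
    for _ in range(episodes):
        pos = 0
        for _ in range(50):
            a = int(rng.integers(2)) if rng.random() < eps else int(q[pos].argmax())
            nxt, r = step(pos, a, noise)
            q[pos, a] += alpha * (r + gamma * q[nxt].max() - q[pos, a])
            pos = nxt
            if r:
                break
    return q

def evaluate(q, noise, episodes=500):
    wins = 0
    for _ in range(episodes):
        pos = 0
        for _ in range(50):
            pos, r = step(pos, int(q[pos].argmax()), noise)
            if r:
                wins += 1
                break
    return wins / episodes

# Both agents are evaluated in the same noisy environment.
print("trained noise-free:", evaluate(train(noise=0.0), noise=0.3))
print("trained with noise:", evaluate(train(noise=0.3), noise=0.3))
```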

“Typically, it is advised to match the transition function to the deployment conditions during training for optimal results. We rigorously tested this belief because we found it hard to accept,” Madan states.

By varying the amounts of noise in the transition function, the researchers were able to explore numerous environments, but this method did not yield realistic gameplay. Increased noise meant the ghosts would unpredictably teleport across the game grid.

To confirm that the indoor training effect was not an artifact of their noise injection, the researchers also modified the probabilities so that ghosts moved more realistically, with a higher likelihood of moving vertically than horizontally. AI agents trained in noise-free settings still showed superior performance in these realistic scenarios.
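A sketch of that more realistic perturbation appears below; the direction weights are illustrative assumptions, not values from the paper.

```python
import random

# Reweighted ghost transition probabilities: vertical moves are more
# likely than horizontal ones, so the ghosts still move plausibly.
VERTICAL_BIAS = {"up": 0.35, "down": 0.35, "left": 0.15, "right": 0.15}

def biased_ghost_move(weights=None):
    weights = weights or VERTICAL_BIAS
    moves = list(weights)
    return random.choices(moves, weights=[weights[m] for m in moves])[0]
```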

“This wasn’t solely because of the noise introduced to create makeshift environments. This seems to be an inherent characteristic of the reinforcement learning challenge, which was even more astonishing to observe,” Bono notes.

Understanding Exploration

Upon further investigation into the reasons behind their findings, the researchers discovered correlations in the exploration patterns of the AI agents within the training environment.

When both AI agents generally explored similar areas, the agent trained in the non-noisy environment showcased superior performance, potentially due to its ability to grasp the game’s mechanics without the interference of disruptive elements.

Conversely, when their exploration patterns diverged significantly, the agent trained in the noisy environment often excelled, possibly because it had to decipher patterns unattainable in the noise-free setting.
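One plausible way to quantify how much two agents' exploration patterns overlap is to compare their state-visitation distributions, as in the sketch below. This metric is an assumption made for illustration; the paper's exact analysis may differ.

```python
from collections import Counter

def visitation_overlap(states_a, states_b):
    """Overlap between two empirical state-visitation distributions:
    1.0 means identical visitation, near 0.0 means the agents explored
    largely disjoint parts of the state space."""
    pa = {s: c / len(states_a) for s, c in Counter(states_a).items()}
    pb = {s: c / len(states_b) for s, c in Counter(states_b).items()}
    return sum(min(pa.get(s, 0.0), pb.get(s, 0.0))
               for s in set(pa) | set(pb))

print(visitation_overlap(["s0", "s1", "s1"], ["s0", "s1", "s2"]))  # 0.667
```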

“If I learn to play tennis using only my forehand in a non-noisy setting, but the noisy scenario also requires my backhand, I won’t perform as well in the noisy environment,” Bono clarifies.

Looking ahead, the researchers want to investigate how the indoor training effect applies in more complex reinforcement learning settings, as well as in other areas such as computer vision and natural language processing. They also aim to build training environments designed to exploit the indoor training effect, which could help AI agents perform better in uncertain conditions.