SURD is an algorithm designed to uncover causal connections within intricate systems. Its potential uses stretch from predicting climate changes to forecasting population trends and optimizing aircraft design.
Understanding causality is essential for grasping how the world functions. Knowing what influences changes in variables—whether they be species populations, voting behavior, stock market prices, or local climates—can guide us in shaping those variables moving forward.
However, pinpointing the origin of an effect can quickly become complicated in real-life scenarios, where various factors can overlap and obscure any causal relationships.
A group of engineers from MIT aims to simplify the quest for understanding causality. They have developed a method applicable to diverse situations that identifies which variables are likely to impact others within a complex system.
This method, functioning as an algorithm, analyzes historical data—like the changing numbers of various species in aquatic environments. It examines the interactions among all variables in a system and estimates how much a change in one variable (for instance, sardine populations over time) can forecast the condition of another (such as anchovy populations in the same area).
The engineers then draw up a “causality map” that links the variables likely to have a cause-and-effect relationship. The algorithm also classifies each relationship: it may be synergistic, meaning one variable affects another only when paired with a second variable, or redundant, meaning a change in one variable has exactly the same effect as a change in another.
Additionally, the new algorithm can estimate “causal leakage,” referring to the proportion of a system’s behavior that cannot be explained by the known variables, indicating that other unknown factors may be influencing the system.
“The strength of our method is its adaptability across fields,” states Álvaro Martínez-Sánchez, a graduate student in MIT’s Department of Aeronautics and Astronautics (AeroAstro). “It can help us comprehend species evolution in ecosystems, neural communication in the brain, and the relationship between climate factors across regions, among other applications.”
The engineers also intend to use their algorithm to tackle challenges in aerospace engineering, aiming to identify design aspects of aircraft that could lower fuel consumption.
“By incorporating causality into our models, we expect to gain insights into how different aircraft design elements relate to efficiency,” explains Adrián Lozano-Durán, an associate professor in AeroAstro.
The engineers, together with MIT postdoc Gonzalo Arranz, have published their findings in a study featured in Nature Communications.
Identifying Connections
Recently, several computational techniques have emerged to analyze data from complex systems and identify causal relationships. These methods apply various mathematical frameworks intended to represent causality.
“Different methods utilize distinct mathematical definitions for causality,” remarks Lozano-Durán. “Although many definitions seem valid, they may not succeed under certain conditions.”
He adds that current methods often struggle to differentiate between types of causality. Specifically, they may not distinguish “unique” causality, where one variable affects another independently of every other variable, from “synergistic” or “redundant” links. A synergistic relationship occurs, for instance, when one variable (like drug A) has no effect on another (like blood pressure) unless paired with a second variable (like drug B).
An example of redundant causality would be a student’s study habits influencing their chance of achieving good grades, when that effect is equivalent to the one produced by another variable (like the amount of sleep they get).
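The three kinds of link can be told apart on a toy system by asking how much information each candidate cause carries about the effect, alone and jointly. The sketch below is not the SURD algorithm itself; it is a minimal illustration using plain mutual information, and the function names and the three toy rules (an XOR rule for synergy, a copied variable for redundancy, an ignored variable for uniqueness) are assumptions chosen for clarity.

```python
import itertools
import math
from collections import Counter


def mutual_information(pairs):
    """I(X;Y) in bits, treating each (x, y) pair in the list as equally likely."""
    n = len(pairs)
    p_xy = Counter(pairs)
    p_x = Counter(x for x, _ in pairs)
    p_y = Counter(y for _, y in pairs)
    return sum(
        (c / n) * math.log2((c / n) / ((p_x[x] / n) * (p_y[y] / n)))
        for (x, y), c in p_xy.items()
    )


def information_profile(effect, states):
    """Information about the effect carried by A alone, B alone, and A with B."""
    return {
        "A": mutual_information([(a, effect(a, b)) for a, b in states]),
        "B": mutual_information([(b, effect(a, b)) for a, b in states]),
        "A+B": mutual_information([((a, b), effect(a, b)) for a, b in states]),
    }


# All four combinations of two independent binary variables.
independent = list(itertools.product([0, 1], repeat=2))
# B is an exact copy of A, so only two combinations ever occur.
copied = [(0, 0), (1, 1)]

# Synergy (toy XOR rule): neither variable alone says anything about the effect.
synergistic = information_profile(lambda a, b: a ^ b, independent)
# Redundancy: A and B each carry the same information about the effect.
redundant = information_profile(lambda a, b: a, copied)
# Unique: only A carries information; B is irrelevant.
unique = information_profile(lambda a, b: a, independent)

for name, profile in [("synergistic", synergistic),
                      ("redundant", redundant),
                      ("unique", unique)]:
    print(name, profile)
```

In the synergistic case the individual scores are zero while the joint score is one bit, which is the pattern a method that inspects one pair of variables at a time would miss.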
“Some methods depend on the strength of variables to establish causality,” Arranz points out. “As a result, they might overlook connections between variables that are not strongly correlated yet are still significant.”
Communicating Information
For their new approach, the engineers adapted concepts from information theory, the study of how messages are transmitted over networks, which is rooted in ideas from the late MIT professor Claude Shannon. They designed an algorithm that treats any complex system of variables as a messaging network.
“We view the system as a network where variables share information, which can be measured,” explains Lozano-Durán. “When one variable transmits messages to another, it suggests a level of influence. This concept of information flow aids in assessing causality.”
The algorithm evaluates multiple variables at once, in contrast to other methods that typically analyze pairs of variables sequentially. It defines information based on the likelihood that a change in one variable corresponds to a change in another. This likelihood—and the amount of communication occurring between variables—may strengthen or weaken as more data from the system are evaluated over time.
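To make the idea of measurable information flow concrete, the following sketch estimates how much the past of one variable tells us about the next value of another in a synthetic time series. It is a simplified stand-in and not the published SURD code: the two-state `driver` and `follower` series, the 10 percent noise level, and the helper `mutual_information_bits` are all assumptions made for illustration.

```python
import numpy as np


def mutual_information_bits(source, target):
    """Plug-in estimate of I(source; target) in bits for integer-coded samples."""
    source = np.asarray(source)
    target = np.asarray(target)
    # Build the empirical joint distribution by counting co-occurrences.
    joint = np.zeros((source.max() + 1, target.max() + 1))
    np.add.at(joint, (source, target), 1)
    joint /= joint.sum()
    p_source = joint.sum(axis=1, keepdims=True)
    p_target = joint.sum(axis=0, keepdims=True)
    independent = p_source @ p_target  # what the joint would be with no link
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / independent[mask])))


rng = np.random.default_rng(0)
n = 5000

# Hypothetical system: the follower copies the driver's previous state,
# except that about 10 percent of the copies are flipped by noise.
driver = rng.integers(0, 2, size=n)
flip = (rng.random(n) < 0.1).astype(int)
follower = np.zeros(n, dtype=int)
follower[1:] = driver[:-1] ^ flip[1:]

# Information from the past of one variable to the future of the other.
driver_to_follower = mutual_information_bits(driver[:-1], follower[1:])
follower_to_driver = mutual_information_bits(follower[:-1], driver[1:])

# Considering both past variables at once, coded as a single symbol 0..3.
joint_past = 2 * driver[:-1] + follower[:-1]
both_to_follower = mutual_information_bits(joint_past, follower[1:])

print(f"driver -> follower: {driver_to_follower:.3f} bits")
print(f"follower -> driver: {follower_to_driver:.3f} bits")
print(f"(driver, follower) -> follower: {both_to_follower:.3f} bits")
```

The asymmetry between the two directions is what lets an information-based approach suggest which variable influences which. The last estimate treats the whole past state as one source, which is closer in spirit to evaluating several variables at once than the pairwise scores are.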
This results in a causality map that indicates strong connections between network variables. From the nature and frequency of these connections, researchers can identify whether the variables have unique, synergistic, or redundant relationships. Similarly, the algorithm can estimate the extent of “causal leakage,” indicating how much of a system’s behavior remains unpredictable based on known information.
“Part of our method identifies whether there is an omission in our data,” states Lozano-Durán. “While we may not know precisely what’s absent, we recognize the necessity of incorporating additional variables to clarify what’s happening.”
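One natural information-theoretic reading of this leakage is the share of a target's uncertainty that remains after all observed variables are taken into account. The sketch below assumes that reading, which the article does not spell out, and uses a toy rule in which the target depends on one observed and one hidden variable; the names `entropy_bits` and `leak_fraction` are hypothetical.

```python
import itertools
import math
from collections import Counter


def entropy_bits(samples):
    """Shannon entropy in bits, treating each listed sample as equally likely."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())


def leak_fraction(observed, target):
    """Share of the target's entropy left unexplained: H(T | O) / H(T)."""
    # Chain rule: H(T | O) = H(O, T) - H(O).
    h_conditional = entropy_bits(list(zip(observed, target))) - entropy_bits(observed)
    return h_conditional / entropy_bits(target)


# Toy system (an assumption): the target is 1 only when both a measured
# variable and an unmeasured one are 1.
states = list(itertools.product([0, 1], repeat=2))  # (measured, hidden)
target = [measured & hidden for measured, hidden in states]

# Only the measured variable is available to the analyst.
leak_with_gap = leak_fraction([measured for measured, _ in states], target)
# Both variables are available, so nothing is missing.
leak_complete = leak_fraction(states, target)

print(f"leak with hidden variable unobserved: {leak_with_gap:.3f}")
print(f"leak with every variable observed:    {leak_complete:.3f}")
```

A large fraction in the first case signals that something is missing from the data without saying what it is, which matches the behavior described in the quote above.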
The team tested the algorithm on various benchmark scenarios commonly used to evaluate causal inference, including predator-prey dynamics over time, temperature and pressure readings across different regions, and the co-evolution of multiple marine species. The algorithm successfully identified causal connections in every case, whereas most existing methods can handle only some of them.
The technique, named SURD (Synergistic-Unique-Redundant Decomposition of causality), is accessible online for others to apply to their own systems.
“SURD has significant potential to advance multiple scientific and engineering disciplines such as climate studies, neuroscience, economics, epidemiology, social science, and fluid dynamics among others,” Martínez-Sánchez adds.
This research was partially funded by the National Science Foundation.