SURD is an algorithm that identifies causal relationships in intricate systems. Its potential applications range from predicting climate changes to estimating population growth and creating efficient aircraft designs.
Understanding causality is vital for grasping the dynamics of our world. Knowing what triggers a change in a variable—whether it’s a species in nature, a voting demographic, company shares, or local weather—can guide us in how to influence that variable in the future.
However, pinpointing the root cause of an effect can be intractable in real-world systems, where many interacting variables obscure causal relationships.
A team of engineers from MIT is working on offering clarity in these complex scenarios. They have created a technique applicable across various fields to identify which variables are likely to affect others in a multifaceted system.
This method, implemented as an algorithm, analyzes time-series data, such as the fluctuating populations of marine species. It evaluates the interactions among all variables in a system and estimates how likely a change in one variable (for instance, the population of sardines over time) is to predict the state of another (such as the anchovy population in the same area).
The engineers produce a “causality map” linking variables that appear to exhibit a cause-and-effect relationship. The algorithm categorizes these interactions, classifying them as either synergistic—where one variable influences another only when paired with a second variable—or redundant, where a change in one variable has the same effect as a change in another variable.
Additionally, the algorithm can assess “causal leakage,” indicating to what extent a system’s behavior cannot be solely explained by the available variables, suggesting that unknown factors are also influencing outcomes.
According to Álvaro Martínez-Sánchez, a graduate student in MIT’s Department of Aeronautics and Astronautics (AeroAstro), “Our method’s significance lies in its broad applicability across various fields. It can enhance our understanding of species evolution, neuronal communication, and the interactions of climatological factors across different regions.”
In aerospace, the engineers plan to use the algorithm on problems such as identifying design features that improve aircraft fuel efficiency.
Adrián Lozano-Durán, an associate professor in AeroAstro, says, “By incorporating causality into our models, we hope to gain better insights into how aircraft design variables relate to overall efficiency.”
The engineers, along with MIT postdoc Gonzalo Arranz, have published their findings in Nature Communications.
Understanding Connections
In recent years, a number of computational methods have been developed that take data from complex systems and infer causal links between variables, each based on a particular mathematical definition of causality.
Lozano-Durán comments, “Different methods rely on unique mathematical definitions to determine causality, and while they may all seem appropriate, some can falter under certain situations.”
He adds that current techniques do not distinguish between types of causality. A “unique” causality is one in which a single variable, on its own, affects another. A “synergistic” causality arises only from a combination: for example, drug A has no effect on blood pressure unless it is taken together with drug B.
A “redundant” causality, on the other hand, is one in which two variables carry the same effect: for example, a student’s study habits affect their grades, but in the same way that the amount of sleep they get does.
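The three types can be made concrete with a toy calculation. The sketch below is an illustration only, not the team's SURD code: it uses plain Shannon mutual information on tiny hand-built truth tables, and the variable names and the XOR stand-in for the synergistic case are assumptions chosen for clarity. (The two-drug example is closer to a logical AND, where each drug alone still carries a little information, so XOR is used here as the cleanest synergistic case.)

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Shannon entropy, in bits, of a list of hashable outcomes."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Two independent binary sources; every combination is equally likely.
a, b = (list(col) for col in zip(*product([0, 1], repeat=2)))
a_copy = list(a)  # a second source that duplicates the first

# Synergistic toy: the target depends only on the combination (XOR).
synergy_target = [x ^ y for x, y in zip(a, b)]
# Unique and redundant toys: the target simply follows source `a`.
follow_target = list(a)

results = {
    # Neither source alone is informative, but the pair is.
    "synergy_a": mutual_information(a, synergy_target),
    "synergy_b": mutual_information(b, synergy_target),
    "synergy_pair": mutual_information(list(zip(a, b)), synergy_target),
    # Only `a` is informative; `b` adds nothing.
    "unique_a": mutual_information(a, follow_target),
    "unique_b": mutual_information(b, follow_target),
    # `a` and its copy each carry the same one bit; the pair carries no more.
    "redundant_a": mutual_information(a, follow_target),
    "redundant_copy": mutual_information(a_copy, follow_target),
    "redundant_pair": mutual_information(list(zip(a, a_copy)), follow_target),
}

for name, bits in results.items():
    print(f"{name}: {bits:.2f} bits")
```

In the synergistic case each source alone shares zero bits with the target while the pair shares one full bit. In the redundant case each source shares one bit and the pair still shares only one.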
Arranz explains, “Other methods usually measure causality based on variable intensity, which can overlook significant links among variables with less pronounced effects.”
Information Transfer
For their approach, the team drew on principles from information theory, the field founded by the late MIT professor Claude Shannon. They devised an algorithm that treats any complex system of variables as a messaging network.
“We view the system as a network, where the exchange of information among variables can be quantified,” Lozano-Durán clarifies. “If one variable conveys messages to another, it suggests an influence. Thus, we employ information flow to gauge causality.”
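The idea of gauging influence by information flow can be illustrated with a simple pairwise calculation. The sketch below is a simplified stand-in, not the SURD algorithm: it measures time-lagged mutual information between two synthetic binary series. The series, the names `driver` and `follower`, and the 90 percent copy rate are all invented for the example.

```python
import random
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        c / n * log2(c * n / (px[x] * py[y])) for (x, y), c in joint.items()
    )


rng = random.Random(0)  # fixed seed so the run is repeatable
steps = 5000

# Synthetic data (an assumption for illustration): `driver` flips a fair
# coin each step; `follower` copies the driver's previous value 90% of the
# time and takes the opposite value otherwise.
driver = [rng.randint(0, 1) for _ in range(steps)]
follower = [0] + [
    s if rng.random() < 0.9 else 1 - s for s in driver[:-1]
]

# Information the present of one series carries about the next step of the other.
forward = mutual_information(driver[:-1], follower[1:])
backward = mutual_information(follower[:-1], driver[1:])

print(f"driver -> follower: {forward:.3f} bits")
print(f"follower -> driver: {backward:.3f} bits")
```

The forward direction carries substantial information and the backward direction carries almost none, which is the asymmetry an information-flow view reads as influence. A pairwise measure like this cannot separate unique, synergistic, and redundant contributions.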
The new algorithm evaluates many variables simultaneously, whereas conventional methods examine one pair of variables at a time. It defines information as the likelihood that a change in one variable is accompanied by a change in another. That likelihood can strengthen or weaken as the algorithm evaluates more of the system’s data over time.
The result is a causality map showing which variables in the network are strongly linked. From the rate and pattern of these links, the researchers can then determine whether each relationship is unique, synergistic, or redundant. By the same approach, the algorithm can also estimate the amount of “causal leakage” in the system, meaning the degree to which the system’s behavior cannot be predicted from the available information.
“Part of our methodology helps identify missing elements,” Lozano-Durán adds. “While we can’t specify what’s absent, we understand more variables need to be included to clarify the situation.”
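One way to picture causal leakage is as the share of a target variable's uncertainty that remains after accounting for everything observed. The sketch below computes that share as a conditional entropy divided by the target's entropy. Treating this ratio as the leakage measure is an assumption made for illustration, as are the function name `leakage_fraction` and the toy variables; the exact definition SURD uses may differ.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Shannon entropy, in bits, of a list of hashable outcomes."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def leakage_fraction(target, observed):
    """H(target | observed) / H(target): share of uncertainty left unexplained."""
    conditional = entropy(list(zip(target, observed))) - entropy(observed)
    return conditional / entropy(target)


# Toy system (hypothetical): the target is set by an observed variable `a`
# and by a second variable `hidden` that the analyst may not have measured.
a, hidden = (list(col) for col in zip(*product([0, 1], repeat=2)))
target = [2 * x + h for x, h in zip(a, hidden)]

leak_with_a_only = leakage_fraction(target, a)
leak_with_both = leakage_fraction(target, list(zip(a, hidden)))

print(f"observing a only:       {leak_with_a_only:.2f}")
print(f"observing a and hidden: {leak_with_both:.2f}")
```

With only `a` observed, half of the target's uncertainty is unexplained. The ratio shows that something is missing but not what it is. Once `hidden` is included, the ratio falls to zero.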
The team tested the algorithm on several benchmark cases for causal inference, including predator-prey interactions, measurements of air temperature and pressure, and the co-evolution of multiple marine species. The algorithm identified the causal links in every case, whereas most methods handle only some of them.
The team named the method SURD, for Synergistic-Unique-Redundant Decomposition of causality, and has made it available online for others to test on their own systems.
“SURD has the potential to advance multiple scientific and engineering disciplines, including climate science, neuroscience, economics, epidemiology, social sciences, and fluid dynamics,” concludes Martínez-Sánchez.
This research received partial support from the National Science Foundation.