Measuring the True Success of Artificial Intelligence: Moving Beyond Hype

A recent study finds that many journal articles reporting on how well machine learning models solve a particular class of equations present an overly positive picture. The study’s authors recommend two guidelines for reporting outcomes and advocate for broader changes within the research community to foster transparency and rigor in findings.

The excitement around machine learning, a subset of artificial intelligence, can give the impression that these tools will soon solve every scientific problem. Remarkable claims are made, but they do not always hold up under close scrutiny: machine learning shows promise in some areas and falls short in others.

In a recent study published in Nature Machine Intelligence, researchers from the U.S. Department of Energy’s Princeton Plasma Physics Laboratory (PPPL) and Princeton University conducted a comprehensive analysis of studies comparing machine learning approaches with conventional methods for solving fluid-related partial differential equations (PDEs). These equations are crucial across a range of scientific disciplines, including the plasma research needed to advance fusion power as a source of electricity.

The researchers observed that comparisons between machine learning techniques and traditional techniques for solving fluid-related PDEs were frequently skewed to favor machine learning, and that negative results were usually underreported. In response, they proposed guidelines for fair comparisons, but they contend that cultural shifts are needed to address what appear to be systemic issues.

“Our findings suggest that while machine learning possesses significant potential, the current literature presents an overly rosy view of its effectiveness in solving these specific equations,” commented Ammar Hakim, PPPL’s deputy head of computational science and principal investigator of the study.

Evaluating results against weak benchmarks

PDEs are ubiquitous in physics and play a vital role in describing natural phenomena such as heat transfer, fluid flow, and wave propagation. For instance, a PDE can be used to calculate the temperature distribution along a spoon submerged in hot soup. Given the initial temperatures of the soup and the spoon, along with the spoon’s material, the equation yields the temperature at any point on the spoon at any time after it is placed in the soup. These equations are also central to plasma physics, since many of the governing equations for plasmas are mathematically similar to fluid equations.
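As a concrete illustration (ours, not one taken from the paper), the one-dimensional heat equation is exactly this kind of PDE:

\[
\frac{\partial T}{\partial t} = \alpha \, \frac{\partial^2 T}{\partial x^2}
\]

Here T(x, t) is the temperature at position x along the spoon at time t, and α is the thermal diffusivity of the spoon’s material. Given the spoon’s starting temperature (the initial condition) and the soup’s temperature at the submerged end (a boundary condition), solving the equation gives the temperature everywhere along the spoon at any later time.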

Scientists and engineers have developed various strategies for solving PDEs. One of these is numerical methods, which compute approximate solutions to problems that are too difficult or impossible to solve exactly, using numerical computation rather than symbolic analysis. More recently, researchers have begun investigating whether machine learning can solve these PDEs faster than traditional methods.
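To make the idea concrete, here is a minimal sketch (ours, not code from the study) of one such numerical method: an explicit finite-difference scheme for the heat equation above. The grid size, time step, material properties, and boundary conditions are illustrative choices.

```python
import numpy as np

# Illustrative parameters (not values from the study)
alpha = 1e-4                 # thermal diffusivity of the spoon (m^2/s)
length = 0.2                 # spoon length (m)
nx = 101                     # number of grid points along the spoon
dx = length / (nx - 1)
dt = 0.4 * dx**2 / alpha     # time step chosen to satisfy the explicit stability limit

# Initial condition: spoon at 20 C, with the submerged end held at the soup's 90 C
T = np.full(nx, 20.0)
T[0] = 90.0

def step(T):
    """Advance the temperature profile by one time step using central differences."""
    r = alpha * dt / dx**2
    T_new = T.copy()
    T_new[1:-1] = T[1:-1] + r * (T[2:] - 2 * T[1:-1] + T[:-2])
    T_new[0] = 90.0                               # soup end stays at the soup temperature
    T_new[-1] = T[-1] + 2 * r * (T[-2] - T[-1])   # insulated handle end (zero heat flux)
    return T_new

for _ in range(5000):                             # march forward in time
    T = step(T)

print(f"midpoint temperature: {T[nx // 2]:.1f} C")
```

Refining the grid makes the approximation more faithful to the underlying PDE, but it also increases the amount of arithmetic the computer must perform, a trade-off that comes up again below.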

The review indicated that the success of machine learning reported in journal articles is often overstated. “Our findings reveal that machine learning may occasionally offer slight speed advantages in solving fluid-related PDEs, but in general, numerical methods are more efficient,” explained Nick McGreivy, lead author of the paper and a recent doctoral graduate of the Princeton Program in Plasma Physics.

Numerical methods involve a fundamental trade-off between accuracy and the time it takes to compute a solution. “Investing additional time usually leads to improved accuracy,” noted McGreivy. “Many articles failed to account for this in their assessments.”
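A small experiment (again ours, not the paper’s) makes this trade-off visible: for a test problem with a known exact solution, refining the grid steadily reduces the error of the finite-difference scheme while increasing the runtime.

```python
import time
import numpy as np

# Illustrative accuracy-versus-runtime measurement (not data from the study), using
# the explicit finite-difference scheme and an initial sine profile whose exact
# solution is known analytically.
alpha, length = 1e-4, 0.2

def solve(nx, t_final=10.0):
    """Solve the heat equation on nx grid points and return the worst-case error."""
    dx = length / (nx - 1)
    dt = 0.4 * dx**2 / alpha
    x = np.linspace(0.0, length, nx)
    T = np.sin(np.pi * x / length)        # initial condition with a known exact solution
    steps = int(t_final / dt)
    r = alpha * dt / dx**2
    for _ in range(steps):
        T[1:-1] += r * (T[2:] - 2 * T[1:-1] + T[:-2])
        T[0] = T[-1] = 0.0                # both ends held at zero
    exact = np.exp(-alpha * (np.pi / length)**2 * steps * dt) * np.sin(np.pi * x / length)
    return np.max(np.abs(T - exact))

for nx in (26, 51, 101, 201):
    start = time.perf_counter()
    err = solve(nx)
    elapsed = time.perf_counter() - start
    print(f"nx={nx:4d}  max error={err:.2e}  runtime={elapsed:.2e} s")
```

Each doubling of the grid resolution shrinks the error but multiplies the number of arithmetic operations, so a solver’s speed is only meaningful when quoted alongside the accuracy it achieves.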

Moreover, speed can vary dramatically among numerical methods. For machine learning techniques to be valuable, they must outperform the most efficient numerical methods, McGreivy emphasized. However, his study found that comparisons were frequently made with the slowest numerical methods.

Two principles for fair comparisons

In response to these issues, the paper proposes two rules for making comparisons fair. The first is to compare machine learning methods against numerical methods at either equal accuracy or equal runtime. The second is to compare machine learning methods against an efficient numerical method, not a weak one.
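In code, such a protocol might look roughly like the sketch below; the solvers, the test problem, and the error function are hypothetical placeholders standing in for whatever methods a study actually evaluates, not interfaces from the paper.

```python
import time

# Hypothetical interfaces (not from the paper): each solver maps a problem description
# to a solution, and error_fn measures a solution's error against a trusted reference.
def measure(solver, problem, error_fn):
    """Run one solver on a test problem, returning (runtime, error)."""
    start = time.perf_counter()
    solution = solver(problem)
    return time.perf_counter() - start, error_fn(solution)

def fair_speedup(ml_model, numerical_solvers, problem, error_fn):
    """Report an ML speedup only against an efficient baseline at matched accuracy."""
    ml_time, ml_error = measure(ml_model, problem, error_fn)

    # Rule 1: compare at equivalent accuracy -- keep only numerical methods that are
    # at least as accurate as the ML model.
    # Rule 2: among those, use the fastest one (an efficient baseline), not an
    # arbitrarily slow method.
    candidates = []
    for solver in numerical_solvers:
        solver_time, solver_error = measure(solver, problem, error_fn)
        if solver_error <= ml_error:
            candidates.append((solver_time, getattr(solver, "__name__", repr(solver))))

    if not candidates:
        return "no numerical baseline reached the ML model's accuracy"
    baseline_time, baseline_name = min(candidates)
    return f"speedup over {baseline_name}: {baseline_time / ml_time:.1f}x"
```

The point of the sketch is that any reported speedup is computed against the fastest baseline that matches the machine learning model’s accuracy, rather than against whichever baseline happens to be convenient.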

Of the 82 journal articles reviewed, 76 reported that machine learning outperformed numerical methods. The researchers found that 79% of those papers relied on weak baselines, violating at least one of the proposed rules. By contrast, four articles reported worse performance than numerical methods, and two reported mixed or comparable results.

“Very few articles mentioned machine learning performing poorly, not because it consistently excels, but because researchers seldom publish findings where it underperforms,” McGreivy remarked.

He believes that low standards for benchmarks are often fueled by misguided incentives within academia. “For article acceptance, presenting impressive results is advantageous. This leads researchers to strive for optimal results in their machine learning models, which is beneficial. However, it can also yield favorable findings if the baseline method being used isn’t particularly effective. So, researchers aren’t incentivized to enhance their baseline methods, which is detrimental,” he explained. As a result, scholars may focus intensely on their models while neglecting to optimize the comparison baseline.

The study also found signs of reporting biases, including publication bias, in which researchers refrain from publishing results that show their machine learning models performing less effectively than numerical methods. Outcome reporting bias can include omitting negative results or using non-standard measures of success that portray machine learning models more favorably. Together, these biases downplay negative findings and create the impression that machine learning is better at solving fluid-related PDEs than it actually is.

“There is considerable hype in this field. We hope our research establishes guidelines for principled approaches in leveraging machine learning to push forward the field’s standards,” Hakim stated.

To address these ingrained cultural challenges, Hakim argues that funding agencies and major conferences should adopt rules prohibiting weak comparisons, or should require thorough explanations of the baseline methods used and the rationale for their selection. “They should promote a skeptical attitude among researchers regarding their own findings,” Hakim advised. “If the results seem too good to be true, they probably are.”

This research was supported by DOE grant DE-AC02-09CH11466.