Researchers have discovered that large language models (LLMs) can identify anomalies in time-series data effectively, without the expensive and complicated training setups typically required. This technique might eventually help technicians recognize potential issues in equipment such as wind turbines or satellites.
Detecting a single malfunctioning turbine in a wind farm, which may involve analyzing countless signals and millions of data points, is similar to searching for a needle in a haystack.
Engineers commonly tackle this intricate issue with deep learning models designed to spot anomalies in the time-series data collected by each turbine over time.
However, with numerous wind turbines each generating multiple signals hourly, training a deep learning model to analyze this time-series data can be both expensive and labor-intensive. Additionally, operators of wind farms may lack the necessary machine-learning knowledge, and the models may require retraining after being deployed.
A recent study from MIT researchers indicates that large language models could serve as more effective and efficient anomaly detectors for time-series data. Notably, these pretrained models can be deployed out of the box, without extensive additional training.
The researchers created a system called SigLLM, which includes a feature that transforms time-series data into text formats that LLMs can process. Users can provide this processed data to the model, asking it to identify any anomalies. The LLM can also forecast future time-series data points as part of its anomaly detection process.
Although LLMs did not surpass cutting-edge deep learning models in anomaly detection, they performed comparably to some alternative AI techniques. With further enhancements, this approach could enable technicians to identify potential failures in equipment such as heavy machinery or satellites before they happen, all without having to train an expensive deep-learning model.
“This is just the initial step; we didn’t anticipate confirming everything right away, but the results suggest there’s a significant opportunity to utilize LLMs for complex anomaly detection tasks,” shares Sarah Alnegheimish, a graduate student in electrical engineering and computer science (EECS) and lead author of the study on SigLLM.
Co-authors of the study include Linh Nguyen, another EECS graduate student; Laure Berti-Equille, a research director at the French National Research Institute for Sustainable Development; and Kalyan Veeramachaneni, a senior research scientist at the Laboratory for Information and Decision Systems. The findings will be presented at the IEEE Conference on Data Science and Advanced Analytics.
An off-the-shelf solution
Large language models operate in an auto-regressive manner, meaning they understand that the latest values in sequential data depend on previous ones. For example, models like GPT-4 can predict a subsequent word in a sentence based on the words that came before it.
Given that time-series data is sequential, the researchers believed that the auto-regressive feature of LLMs might make them suitable for identifying anomalies in this type of data.
However, their goal was to devise a method that dodges the need for fine-tuning, a process where engineers retrain a versatile LLM on a small set of specific task data to enhance its performance for that task. Instead, they aimed to implement an LLM directly without any additional training steps.
To deploy the model, they first needed to convert time-series data into text-based formats that the language model could recognize.
This was achieved through a series of transformations that focused on retaining the most crucial aspects of the time series while minimizing the number of tokens used. Tokens serve as the basic inputs for an LLM; thus, using more tokens demands more computational resources.
“If these transformations aren’t managed carefully, crucial data might be lost, which could affect outcomes,” Alnegheimish explains.
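The conversion described above can be pictured with a short sketch. The exact SigLLM transformations are not reproduced here, so the function below is an illustrative assumption: it shifts values to be non-negative, rounds away precision that would cost extra tokens, and serializes the result as one compact comma-separated string.

```python
# Hypothetical sketch of turning a numeric time series into LLM-friendly text.
# The specific steps (shift, round, join) are assumptions for illustration,
# not the published SigLLM pipeline.

def series_to_text(values, decimals=0):
    """Convert a list of floats into a compact text representation."""
    shifted = [v - min(values) for v in values]          # make all values non-negative
    scaled = [round(v * 10**decimals) for v in shifted]  # drop precision that costs tokens
    return ",".join(str(v) for v in scaled)              # one token-cheap string

print(series_to_text([10.13, 10.21, 10.18, 42.5, 10.2]))  # → 0,0,0,32,0
```

Note how the rounding step trades numeric precision for a shorter token sequence, which is exactly the tension Alnegheimish describes: discard too much and the anomaly signal itself can be lost.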
Once the conversion process for the time-series data was established, the researchers introduced two methods for detecting anomalies.
Methods for anomaly detection
In the first method, dubbed Prompter, the prepared data is fed into the model along with prompts guiding it to identify anomalies.
“We had to refine our approach multiple times to determine the right prompts for a specific time series. Comprehending how these LLMs intake and process data is challenging,” Alnegheimish notes.
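A Prompter-style query might look something like the following. The actual SigLLM prompt wording is not published in this article, so the template here is purely an assumption meant to show the shape of the approach: encoded series in, requested anomaly indices out.

```python
# Hypothetical Prompter-style template; the real SigLLM prompts were refined
# through many iterations and may differ substantially from this sketch.

def build_prompt(series_text):
    """Wrap an encoded time series in an anomaly-detection instruction."""
    return (
        "Below is a time series encoded as comma-separated integers.\n"
        f"Series: {series_text}\n"
        "List the zero-based indices of any anomalous values, "
        "or answer 'none' if the series looks normal."
    )

print(build_prompt("0,0,0,32,0"))
```

The hard part, as Alnegheimish notes, is not assembling such a template but discovering phrasing that a given LLM reliably follows for a given kind of signal.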
For the second method, known as Detector, the LLM is utilized as a forecaster to estimate the next value in a time series. Researchers then compare this predicted value with the actual one, where a significant variation indicates a potential anomaly.
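The Detector idea can be sketched in a few lines. In the sketch below, a rolling median stands in for the LLM forecaster (an assumption for the sake of a self-contained example); the structure is the same: predict each next value, then flag points where the prediction error is large.

```python
# Hedged sketch of the Detector pattern: forecast, compare, flag.
# A rolling median is a stand-in for the LLM forecaster; the window size
# and threshold are illustrative, not values from the study.
import statistics

def detect_anomalies(values, window=3, threshold=5.0):
    """Return indices where |actual - forecast| exceeds the threshold."""
    flagged = []
    for i in range(window, len(values)):
        forecast = statistics.median(values[i - window:i])  # predict next value
        if abs(values[i] - forecast) > threshold:           # large error => anomaly
            flagged.append(i)
    return flagged

print(detect_anomalies([10, 10, 11, 10, 42, 10, 11]))  # → [4]
```

Swapping the stand-in forecaster for an LLM's next-value prediction gives the Detector pipeline its key property: the anomaly signal comes from the size of the forecast error, not from asking the model to reason about anomalies directly.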
The two methods work differently: Detector plugs into a broader anomaly detection pipeline, while Prompter performs the task on its own. In practice, Detector produced better results than Prompter, which tended to generate numerous false positives.
“With the Prompter method, we may have made the task too complex for the LLM, presenting it with a tougher issue to solve,” observes Veeramachaneni.
When both methods were compared against existing techniques, Detector outperformed transformer-based AI models on seven of the eleven datasets evaluated, all without any training or fine-tuning of the LLM.
Looking ahead, LLMs might also provide straightforward explanations alongside their predictions, allowing operators to grasp why a particular data point was marked as anomalous.
Nonetheless, current state-of-the-art deep learning models significantly outperformed LLMs, highlighting the need for further advancements before LLMs can effectively serve in anomaly detection.
“What will be required to reach a level of performance comparable to these leading models? That remains a pressing question. An anomaly detection system based on LLMs must prove transformative for us to warrant further investment,” Veeramachaneni indicates.
In the future, researchers hope to explore whether fine-tuning can enhance performance, though such an approach would necessitate additional time, costs, and specialized training. The LLM methods currently take between 30 minutes and two hours to generate results, so improving efficiency will be a significant focus going forward. The team also aims to delve deeper into understanding how LLMs manage anomaly detection to potentially enhance their effectiveness.
“In the realm of intricate tasks like anomaly detection in time series, LLMs have the potential to be strong contenders. Perhaps LLMs can also address other complex challenges?” Alnegheimish muses.
This research received support from SES S.A., Iberdrola and ScottishPower Renewables, along with Hyundai Motor Company.