Researchers have developed a sophisticated language model capable of analyzing medical charts to check if children with ADHD are receiving appropriate follow-up care after starting new medications.
At Stanford Medicine, scientists have created an AI tool that reviews extensive doctors’ notes found in electronic medical records, helping to identify trends aimed at enhancing patient care.
Healthcare experts typically must pore over large numbers of medical charts by hand to glean insights about patient care. Recent findings suggest, however, that large language models, AI systems adept at recognizing patterns in complex text, can automate this labor-intensive process and produce practically useful results. Such tools could, for example, monitor patient records for hazardous drug interactions or help physicians identify which patients are likely to respond well or poorly to particular treatments.
The AI tool, detailed in a study published online on December 19 in Pediatrics, was created to analyze medical records and ascertain whether children diagnosed with attention deficit hyperactivity disorder (ADHD) received the necessary follow-up care after beginning new medications.
“This model allows us to identify some shortcomings in ADHD management,” said Dr. Yair Bannett, the lead author of the study and assistant professor of pediatrics.
The senior author of the study is Dr. Heidi Feldman, the Ballinger-Swindells Endowed Professor in Developmental and Behavioral Pediatrics.
The research team utilized the AI tool’s insights to identify strategies that could enhance how physicians follow up with ADHD patients and their families. Bannett noted that the capabilities of such AI technology could be leveraged in various areas of medical care.
A laborious task for humans, a simple one for AI
Electronic medical records provide data like lab results or blood pressure readings in formats that are easily analyzable by computers. However, around 80% of the data in these records come from the freeform notes that physicians write about their patients’ care.
While these notes are useful to the next healthcare professional who reviews a patient’s chart, their informal phrasing makes large-scale analysis challenging. To use this unstructured data for research, someone usually has to sift through the notes by hand for the relevant details. This study explored whether artificial intelligence could take on that task.
Researchers analyzed the medical records of 1,201 children aged 6 to 11 from 11 pediatric primary care practices in the same healthcare network, all of whom had been prescribed at least one ADHD medication. Because these medications can cause disruptive side effects, such as reduced appetite, it’s important for doctors to ask about side effects soon after a medication is prescribed and to adjust the dose as needed.
The team trained an established large language model to sift through doctors’ notes and determine whether children or their parents were asked about side effects within the first three months of taking a new medication. To build the training data, human experts reviewed 501 notes: any mention of side effects (e.g., “reduced appetite” or “no weight loss”) indicated that follow-up communication had occurred, while notes lacking any mention of side effects were taken as a sign that follow-up didn’t happen.
These reviewed notes served as “ground truth” for the model. The researchers used 411 of them to train the model to identify inquiries about side effects and reserved the remaining 90 for validation. When they then checked the trained model against human review of an additional 363 notes, it categorized about 90% of them correctly.
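The study doesn’t publish its training code, so the following is only a minimal sketch of the kind of workflow described above: fine-tuning a pretrained text classifier on labeled notes, with a held-out validation split. The base model (DistilBERT via Hugging Face), the toy example notes, and the output directory name are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of the workflow described above; NOT the authors' code.
# The base model, example notes, and file paths are illustrative assumptions.
import numpy as np
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)

# Stand-in for the 501 human-reviewed notes:
# label 1 = side effects mentioned (follow-up occurred), 0 = no mention.
notes = [
    "Mother reports reduced appetite since starting medication; dose reviewed.",
    "Well-child visit; discussed school progress, vision screening normal.",
]
labels = [1, 0]

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

dataset = Dataset.from_dict({"text": notes, "label": labels}).map(
    lambda batch: tokenizer(batch["text"], truncation=True), batched=True)
split = dataset.train_test_split(test_size=0.18)  # ~90 of 501 held out

def accuracy(eval_pred):
    # Fraction of held-out notes the model classifies correctly.
    logits, gold = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == gold).mean())}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="adhd-note-classifier",
                           num_train_epochs=3),
    train_dataset=split["train"],
    eval_dataset=split["test"],
    tokenizer=tokenizer,       # enables dynamic padding of batches
    compute_metrics=accuracy)
trainer.train()
print(trainer.evaluate())      # reports validation accuracy
trainer.save_model("adhd-note-classifier")  # reused in the scoring sketch below
```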
Once the large language model was performing reliably, the researchers used it to analyze all 15,628 notes across the patients’ charts, a task that would otherwise have required more than seven months of full-time human effort.
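Applying a trained classifier to the full corpus is mechanically simple. Here is a sketch under the same assumptions as above (the saved model directory from the training sketch, and a toy list standing in for the 15,628 note texts, which are not public):

```python
from transformers import pipeline

# Load the fine-tuned classifier saved by the training sketch above.
classifier = pipeline("text-classification", model="adhd-note-classifier")

# Stand-in for the 15,628 note texts analyzed in the study.
all_notes = [
    "Phone call with parent: no appetite change on current dose.",
    "Asthma follow-up; inhaler technique reviewed.",
]

results = classifier(all_notes, batch_size=32, truncation=True)

# Notes with no recorded side-effect inquiry, i.e. possible missed follow-up.
# LABEL_0 / LABEL_1 are the default names unless the model config renames them.
missed = [note for note, r in zip(all_notes, results)
          if r["label"] == "LABEL_0"]
print(f"{len(missed)} of {len(all_notes)} notes show no side-effect inquiry")
```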
Turning analysis into improved care
The AI’s evaluation revealed insights that might have otherwise gone unnoticed. For instance, it identified that some pediatric practices often inquired about medication side effects during phone calls with parents, while others did not.
“This is something you would miss without deploying this model to analyze 16,000 notes, as no individual would undertake such a task,” Bannett remarked.
The AI also revealed that pediatricians followed up on some medications less often than others. Children with ADHD may be prescribed stimulants or, less commonly, non-stimulant alternatives such as certain anti-anxiety medications. The data showed that doctors were less likely to ask about side effects of the non-stimulant drugs.
This discovery highlights the limitations of AI, according to Bannett, who emphasized that while AI could track trends in patient records, it couldn’t explain the reasons behind those trends.
“We needed to consult pediatricians for context,” he added. Many explained that they had far more experience managing the side effects of stimulants than those of non-stimulants.
The researchers acknowledged that the AI tool may have missed some inquiries about medication side effects: not every conversation about side effects is documented in the electronic medical record, and some patients received specialized care, such as from a psychiatrist, that the records used in the study did not capture. The AI also misclassified some notes that actually referred to side effects of treatments for other conditions, such as acne medication.
Directing AI’s capabilities
As researchers continue to develop AI tools for medical investigations, it’s crucial to recognize both their strengths and weaknesses, Bannett advised. Some responsibilities, like filtering through extensive medical records, are ideal for trained AI systems.
However, weighing the ethical dimensions of healthcare requires careful human judgment. Bannett pointed to an editorial he and colleagues published in Hospital Pediatrics that discusses potential pitfalls and possible solutions.
“These AI models analyze existing healthcare data, and numerous studies have shown that disparities exist within this healthcare data,” Bannett said. Researchers, he noted, must consider how to reduce such biases both when building AI tools and when using them. Still, he expressed optimism about AI’s potential to extend doctors’ capabilities.
“Every patient is unique, and each clinician has a wealth of expertise, but AI can provide access to comprehensive data from larger populations,” he stated. Eventually, AI could help doctors anticipate the likelihood of negative side effects specific to a drug based on a patient’s age, race or ethnicity, genetic makeup, and combination of diagnoses, allowing for more tailored medical decisions.
This research was funded by the Stanford Maternal and Child Health Research Institute and the National Institute of Mental Health (grant K23MH128455).