During the pandemic, more patients at NYU Langone Health turned to electronic health record (EHR) tools to communicate with their doctors. This shift drove increased use of the In Basket messaging tool built into NYU Langone's EHR system, Epic.
Physicians at NYU Langone have seen a significant rise in daily message volume, with some receiving more than 150 In Basket messages per day. This surge has placed a burden on healthcare professionals and contributed to physician burnout.
A recent study by researchers at NYU Grossman School of Medicine demonstrates that an AI tool can draft responses to patients' EHR queries that are as accurate as those written by human healthcare providers, and that are perceived as more empathetic. The findings suggest such tools could reduce the In Basket workload on physicians and improve communication with patients, provided that human providers review the AI-generated drafts before they are sent.
NYU Langone Health has been exploring generative artificial intelligence (genAI), in which computer algorithms predict the next word in a sentence based on language patterns learned from internet text. This technology enables "chatbots" that respond to queries in human-like language. In 2023, NYU Langone licensed a private instance of GPT-4, a successor to the model behind ChatGPT, so that physicians could experiment with the technology while safeguarding patient data privacy.
In the study, published in JAMA Network Open on July 16, primary care physicians compared AI-generated responses to patient queries against responses written by human providers. The AI responses matched human responses in accuracy, relevance, and completeness, and outperformed human providers by 9.5% in tone and understandability. They were also more empathetic and positive in their language, fostering a sense of partnership with patients.
Despite these advantages, AI responses were longer and used more complex language than those of human providers, and the researchers acknowledge that additional training is needed to improve the tool. AI responses were rated at an eighth-grade reading level, harder to read than the sixth-grade level of human responses, leaving room to improve the tool's readability.
The researchers emphasized the importance of training chatbots on private patient data rather than general internet data, as it more accurately reflects real-world clinical scenarios. Future studies will be needed to evaluate how private data affects the tool's performance.
Dr. Devin Mann, the senior director of Informatics Innovation in NYU Langone Medical Center Information Technology (MCIT) and corresponding author of the study, expressed confidence that with physician oversight, AI-generated responses will soon match human responses in quality, communication style, and usability.
The study was conducted by a team of authors from NYU Langone Health, including Drs. Small and Mann, along with researchers from other departments and institutions. Funding was provided by grants from the National Science Foundation and the Swiss National Science Foundation.