The doctor is out, but it's OK. ChatGPT can answer your questions

A new study finds ChatGPT outperforms human physicians in quality and empathy of responses to patient concerns

Johns Hopkins Media Relations

Although artificial intelligence won't replace your doctor any time soon, a new study has found that technologies such as ChatGPT could improve patients' experience by answering their health care questions more accurately, and in a manner perceived as more empathetic, than human doctors do.

The study, appearing in JAMA Internal Medicine, compared written responses from human physicians and those from ChatGPT to real-world health questions. A panel of licensed health care professionals evaluating the responses preferred ChatGPT's answers 79% of the time, and found them more empathetic and of higher quality.

"The demand for doctors to answer questions via electronic patient messaging these days is overwhelming, so it is no surprise that physicians not only are experiencing burnout, but also that the quality of those answers sometimes suffers. This study is evidence that AI tools can make doctors more efficient and accurate, and patients happier and healthier," said study co-author Mark Dredze, an associate professor of computer science at Johns Hopkins University's Whiting School of Engineering, who advised the research team on the capabilities of large language models. Dredze is also director of research (foundations of AI) at Johns Hopkins AI-X Foundry, which drives AI research and its applications in health, medicine, and safety with the goal of understanding and improving the human condition.

Study leader John W. Ayers, of the Qualcomm Institute at the University of California San Diego, says the results provide an early glimpse into the important role that artificial intelligence assistants could play in health care.

"The opportunities for improving health care with AI are massive," said Ayers, who is also vice chief of innovation in the UC San Diego School of Medicine Division of Infectious Disease and Global Public Health. "AI-augmented care is the future of medicine."

The research team behind the study set out to answer the question: Can ChatGPT respond accurately to the types of questions patients send to their doctors?

To obtain a large and diverse sample of health care questions and physician answers that did not include identifiable personal information, the team turned to Reddit's AskDocs, a social media forum where patients publicly post medical questions to which doctors respond.

r/AskDocs is a subreddit with approximately 452,000 members, where users post medical questions and verified health care professionals submit answers. While anyone can respond to a question, moderators verify health care professionals' credentials, and each response displays the respondent's level of credentials. The result is a large and diverse set of patient medical questions and accompanying answers from licensed medical professionals.

While some may wonder whether question-answer exchanges posted on social media are a fair test of how well ChatGPT answers patient questions, clinical team members noted that the exchanges reflected their clinical experience.

The team randomly sampled 195 exchanges from AskDocs where a verified physician responded to a public question. The team provided the original question to ChatGPT and asked it to author a response. A panel of three licensed health care professionals assessed each question and the corresponding responses and were blinded to whether the response originated from a physician or ChatGPT. They compared responses based on information quality and empathy, noting which one they preferred.

The result? The panel of health care professional evaluators preferred ChatGPT responses to physician responses almost 80% of the time.

"ChatGPT messages responded with nuanced and accurate information that often addressed more aspects of the patient's questions than physician responses," said study co-author Jessica Kelley, a nurse practitioner with San Diego firm Human Longevity.

Additionally, ChatGPT responses were rated significantly higher in quality than physician responses: The proportion of responses rated good or very good in quality was 3.6 times higher for ChatGPT than for physicians (physicians 22.1% versus ChatGPT 78.5%). The responses were also more empathetic: The proportion rated empathetic or very empathetic was 9.8 times higher for ChatGPT than for physicians (physicians 4.6% versus ChatGPT 45.1%).
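
For readers who want to check the arithmetic, the "times higher" figures follow from dividing ChatGPT's percentage by the physicians' percentage in each category. The short Python sketch below uses only the four percentages reported above; the function name `prevalence_ratio` is a hypothetical label chosen for illustration, not a term taken from the study.

```python
def prevalence_ratio(chatgpt_pct: float, physician_pct: float) -> float:
    """Return how many times larger the ChatGPT share is than the physician share.

    Hypothetical helper for illustration; both inputs are percentages of
    responses that fell into a given rating category.
    """
    return chatgpt_pct / physician_pct


# Percentages as reported in the article.
quality_ratio = prevalence_ratio(78.5, 22.1)  # rated good or very good quality
empathy_ratio = prevalence_ratio(45.1, 4.6)   # rated empathetic or very empathetic

print(f"Quality: {quality_ratio:.1f} times higher")
print(f"Empathy: {empathy_ratio:.1f} times higher")
```

Rounded to one decimal place, the two ratios match the 3.6 and 9.8 figures cited above.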

"There have been several studies showing that these AI models can pass medical licensing questions, but that doesn't mean they would provide good answers to questions from real people. This study shows that they can," Dredze says. "We aren't proposing that we build AI doctors, but our results suggest that doctors could be more effective when aided by AI."

Aaron Goodman, an associate clinical professor at UC San Diego School of Medicine and study co-author, says, "I never imagined saying this, but ChatGPT is a prescription I'd like to give to my inbox. The tool will transform the way I support my patients."

In addition to improving workflow, investments in AI assistant messaging could affect patient health and physician performance, the study authors say.

"We could use these technologies to train doctors in patient-centered communication, eliminate health disparities suffered by minority populations who often seek health care via messaging, build new medical safety systems, and assist doctors by delivering higher quality and more efficient care," says Dredze. "When doctors are overwhelmed, empathy with their patients can be the first thing to go. But empathy is critical in care: A patient doesn't listen to a doctor if they don't feel heard. This study is evidence that AI could help doctors maintain empathetic and accurate communication with their patients."