San Francisco, California,
10:00 AM

HSS Research Evaluates Whether AI Chatbots Provide Reliable Medical Information

Artificial intelligence (AI) chatbots are more accurate than expected when asked to answer medical questions about spine surgery, but patients still need to use extreme caution when turning to these tools for help with medical decision-making. That’s according to a study from HSS researchers being presented at the American Academy of Orthopaedic Surgeons annual meeting.

“In the past 20 years, the Internet has been probably the number-one place that people go for medical information,” says Sheeraz Qureshi, MD, MBA, co-chief of HSS Spine and co-author of the study. “I haven’t yet heard from any patients who have used ChatGPT or other chatbots in this way, but it’s definitely where we see things going. The same way that people use search engines to look for medical information now, we expect they will use chatbots in the future, including in their decision-making process.”

For this study, a team of investigators identified nine frequently asked questions about cervical spine surgery that they considered to be of particular clinical relevance. Question topics ranged from the benefits and drawbacks of different surgical approaches to side effects and recovery after surgery. The questions were entered one at a time into ChatGPT version 3.5.

Two experts in cervical spine surgery who were not involved in designing the questions rated the chatbot’s responses on accuracy, appropriateness, and readability. On average, the responses received a score of 8.1/10, with a 3.9/5 for accuracy and a 2.2/3 for appropriateness. The main drawback the reviewers noted was that ChatGPT failed to provide comprehensive responses, often omitting important factors. For example, it described a particular procedure as being more challenging without mentioning that the level of challenge depended on patient indications as well as the surgeon's overall practice, training and comfort with the technique. 

The experts noted that the responses were easier for people to understand than research literature, which can be complicated for non-experts. Flesch-Kincaid Grade Level analysis determined that ChatGPT's responses were at the level of a junior in high school, compared with primary literature, which is aimed at scholars working in the field. The reviewers also appreciated that responses from ChatGPT were always prefaced with a statement regarding consulting an expert for medical advice.
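
For readers unfamiliar with the Flesch-Kincaid Grade Level, the short sketch below shows the standard published formula, which estimates a U.S. school grade from average sentence length and average syllables per word. It is only an illustration: the function name and the sample counts are hypothetical and are not taken from the study, and the release does not describe which tool the researchers used to run the analysis.

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Return the Flesch-Kincaid Grade Level for the given text counts.

    Standard formula:
        0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for illustration only (not data from the study):
# 100 words in 5 sentences, with 150 syllables in total.
sample_grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(round(sample_grade, 2))
```

A result near 11 corresponds to the reading level of a high school junior in the U.S. system. Counting syllables in real text requires a pronunciation dictionary or a heuristic, which this sketch leaves out.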

Dr. Qureshi explains that one serious concern with using a chatbot versus a web search is that it’s not clear where the information is being sourced from. “Most people know not to blindly trust everything they find on the Internet,” he says. “If a search takes you to the webpage for HSS or another well-established medical center, you can feel confident that the information has been vetted by experts.”

The researchers plan to continue studying this topic to better understand how patients are using AI tools. Ultimately, they hope to identify ways to ensure that when AI responds to medical queries, it prioritizes the most reliable and credentialed information.

About HSS

HSS is the world’s leading academic medical center focused on musculoskeletal health. At its core is Hospital for Special Surgery, nationally ranked No. 1 in orthopedics (for the 14th consecutive year), No. 2 in rheumatology by U.S. News & World Report (2023-2024), and the best pediatric orthopedic hospital in NY, NJ and CT by U.S. News & World Report “Best Children’s Hospitals” list (2023-2024). In a survey of medical professionals in more than 20 countries by Newsweek, HSS is ranked world #1 in orthopedics for a fourth consecutive year (2023). Founded in 1863, the Hospital has the lowest readmission rates in the nation for orthopedics, and among the lowest infection and complication rates. HSS was the first in New York State to receive Magnet Recognition for Excellence in Nursing Service from the American Nurses Credentialing Center five consecutive times. An affiliate of Weill Cornell Medical College, HSS has a main campus in New York City and facilities in New Jersey, Connecticut and in the Long Island and Westchester County regions of New York State, as well as in Florida. In addition to patient care, HSS leads the field in research, innovation and education. The HSS Research Institute comprises 20 laboratories and 300 staff members focused on leading the advancement of musculoskeletal health through prevention of degeneration, tissue repair and tissue regeneration. In addition, more than 200 HSS clinical investigators are working to improve patient outcomes through better ways to prevent, diagnose, and treat orthopedic, rheumatic and musculoskeletal diseases. The HSS Innovation Institute works to realize the potential of new drugs, therapeutics and devices. The HSS Education Institute is a trusted leader in advancing musculoskeletal knowledge and research for physicians, nurses, allied health professionals, academic trainees, and consumers in more than 165 countries. 
The institution is collaborating with medical centers and other organizations to advance the quality and value of musculoskeletal care and to make world-class HSS care more widely accessible nationally and internationally.