29 May 2023

Study finds ChatGPT fails American College of Gastroenterology self-assessment tests

According to a recent study published in The American Journal of Gastroenterology, both ChatGPT-3 and ChatGPT-4, OpenAI's language processing models, performed poorly on the 2021 and 2022 American College of Gastroenterology Self-Assessment Tests. The study, conducted by researchers at The Feinstein Institutes for Medical Research, aimed to evaluate the models' abilities and accuracy by asking them to answer multiple-choice questions from the tests.


Each test comprised 300 questions. The researchers entered the questions into the AI-powered platform, excluding those that required images, which left 455 questions across the two tests. ChatGPT-3 answered 296 of the 455 questions correctly (65.1%), while ChatGPT-4 answered 284 correctly (62.4%). Both results fall short of the 70% score needed to pass the test.
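The percentages above follow directly from the reported counts. The short Python sketch below reproduces that arithmetic and, as an illustration only, works out how many of the 455 questions would have to be answered correctly to reach 70%. The variable and function names are hypothetical and are not taken from the study.

```python
# Counts reported in the article; names below are illustrative only.
TOTAL_QUESTIONS = 455      # questions remaining after image-based items were excluded
PASS_THRESHOLD_PCT = 70    # passing score, in percent

correct_answers = {"ChatGPT-3": 296, "ChatGPT-4": 284}


def percent_score(correct: int, total: int) -> float:
    """Return the share of correct answers as a percentage, rounded to one decimal."""
    return round(100 * correct / total, 1)


# Smallest whole number of correct answers that reaches the threshold
# (integer ceiling division avoids floating-point rounding issues).
needed_to_pass = (PASS_THRESHOLD_PCT * TOTAL_QUESTIONS + 99) // 100

scores = {model: percent_score(n, TOTAL_QUESTIONS) for model, n in correct_answers.items()}
shortfall = {model: needed_to_pass - n for model, n in correct_answers.items()}

for model in correct_answers:
    verdict = "pass" if scores[model] >= PASS_THRESHOLD_PCT else "fail"
    print(f"{model}: {scores[model]}% ({verdict}), {shortfall[model]} short of {needed_to_pass}")
```

Under these figures, a passing result would have required at least 319 correct answers out of 455.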


The self-assessment test is used to gauge how an individual would perform on the American Board of Internal Medicine Gastroenterology board exam. The study's researchers identified possible reasons for ChatGPT's low performance, such as a lack of access to paid medical journals or outdated information within the system. They concluded that more research is necessary to establish the models' reliability and suitability for medical applications.


By contrast, a separate study published in PLOS Digital Health earlier this year found that ChatGPT passed or nearly passed the United States Medical Licensing Exam and showed a high level of insight in its explanations. Additionally, in a research letter published in JAMA, ChatGPT provided "largely appropriate" responses to questions about cardiovascular disease prevention, with clinicians rating the responses to 21 of 25 questions as appropriate.


Overall, while ChatGPT has shown promise in certain medical applications, further research and improvement are needed before it can be considered a reliable tool for medical education and practice in gastroenterology and other areas of healthcare.

