GPT-4 outperforms 99.98% of simulated human readers in complex clinical diagnoses

In a recent study published in the New England Journal of Medicine, OpenAI's GPT-4 correctly diagnosed 52.7% of complex clinical cases. That rate exceeded the 36% achieved by medical journal readers and surpassed 99.98% of simulated human readers. In the evaluation, conducted by Danish researchers, 38 cases were presented to GPT-4 and its responses were compared with those of 248,614 online medical journal readers.

The most common diagnostic categories among the cases were infectious diseases (39.5%), endocrinology (13.1%), and rheumatology (10.5%). Patients ranged in age from newborns to 89-year-olds, and 37% were female.

A temporal analysis of GPT-4's performance found 52.7% accuracy for cases published up to September 2021, the cutoff of GPT-4's training data, and a higher 75% accuracy for cases published after that date. Despite these promising results, the study noted a slight decrease in performance in the newest version of GPT-4.

While emphasising GPT-4's high reproducibility and clinical promise, the researchers urged caution, stressing the need for proper clinical trials to ensure safety and efficacy. The study also underscored the importance of ethical considerations, transparency, and regulatory adherence, and raised concerns about data protection and privacy. To promote global applicability, the authors called for future AI models to include training data from developing countries.

The study envisions a future in which AI tools such as GPT-4 become a valuable aid to healthcare decision-making, working under human oversight and complementing medical professionals rather than replacing them.

