Human-AI Teams Outperform Both Humans and AI Alone at Finding Cancer Trial Patients

An Emory study found that pairing clinical staff with AI tools improved accuracy in identifying eligible cancer patients without adding to workload.

A randomized trial at Emory’s Winship Cancer Institute found that pairing human reviewers with AI tools improved the accuracy of identifying cancer patients for clinical trials - without adding to staff workload. Neither humans nor machines alone achieved the same results.

The study, published February 10 in Nature Communications, tested three approaches to eligibility prescreening: human review alone, AI alone, and human-AI collaboration. The combined approach outperformed both alternatives, particularly for complex eligibility criteria involving tumor biomarkers and cancer staging.

The Problem Being Solved

Clinical trial prescreening is labor-intensive and error-prone. Research staff must review voluminous electronic health records to determine whether patients meet a trial’s eligibility criteria - specific tumor types, biomarker status, prior treatments, organ function, and dozens of other factors. This manual process introduces inconsistencies and often misses eligible patients.

The errors aren’t random. Patients from underrepresented groups are disproportionately affected. Their medical records are more likely to be fragmented, to span care delivered across multiple health systems, or to lack standardized documentation of key clinical details. A screening system that struggles with complex or incomplete records will systematically exclude these patients.

What the Study Found

The key finding defied a common assumption: the best approach wasn’t a choice between humans and machines. The human-AI team significantly outperformed manual prescreening alone, identifying eligible patients more effectively while reducing both false positives and false negatives.

AI proved particularly valuable in specific areas:

  • Tumor biomarkers: Identifying specific genetic mutations or protein expressions that determine trial eligibility
  • Cancer staging: Accurately classifying disease extent based on imaging reports, pathology notes, and clinical assessments
  • Complex eligibility criteria: Processing the multifaceted requirements that span dozens of pages in modern trial protocols

The collaboration worked because each component brought distinct strengths. The AI system - built on transformer-based deep learning models fine-tuned on clinical corpora - could process large volumes of unstructured clinical notes quickly. Human reviewers brought clinical judgment, contextual understanding, and the ability to interpret ambiguous or contradictory information.
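To make the AI side of that division of labor concrete, here is a minimal sketch of per-criterion scoring. The study’s actual system is not public, so everything here is an assumption: the generic facebook/bart-large-mnli checkpoint stands in for a model fine-tuned on clinical corpora, and the criteria and note text are invented examples.

```python
# Illustrative sketch only; the study's actual system is not public.
# Assumes a transformer NLI model used zero-shot to judge whether a clinical
# note supports each eligibility criterion. Model, criteria, and note are
# placeholders, not taken from the paper.
from transformers import pipeline

scorer = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

note = (
    "62-year-old with metastatic lung adenocarcinoma. Molecular testing shows "
    "EGFR L858R. No systemic therapy given to date."
)

criteria = [
    "documented EGFR exon 19 deletion or L858R mutation",
    "stage IIIB or IV non-small cell lung cancer",
    "no prior systemic therapy for metastatic disease",
]

for criterion in criteria:
    # Each criterion becomes an entailment-style query against the note.
    result = scorer(
        note,
        candidate_labels=["met", "not met"],
        hypothesis_template=f"The criterion '{criterion}' is {{}}.",
    )
    label, score = result["labels"][0], result["scores"][0]
    print(f"{criterion}: {label} (confidence {score:.2f})")
```

In a real deployment the scores would come from a clinically fine-tuned and validated model; the sketch is meant only to show the shape of the per-criterion scoring step that the human reviewer then checks.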

Importantly, AI did not significantly reduce the time required for chart review. The gain was in accuracy, not speed: reviewers supported by AI made fewer errors without spending more time per case.

Practical Implications

Dr. Ravi B. Parikh, a medical oncologist and researcher at Winship Cancer Institute who co-led the study, estimated the real-world impact: “At a high-volume cancer center, that improvement could translate to 10 to 20 additional patients screened each week.” In practical terms, that works out to two to four more patients screened each working day, each a potential candidate for an investigational treatment.

The study’s design matters for generalization. Researchers used a randomized controlled trial methodology with retrospectively curated electronic health records, which provides stronger evidence than typical AI proof-of-concept studies. The system also emphasized interpretable outputs - transparent decision rationale that clinicians could evaluate and trust.

What This Means

The findings support a specific model of AI deployment: augmentation rather than replacement. The best results came when AI handled what it does well (processing structured criteria across large datasets) while humans handled what they do well (judgment calls, context, and ambiguity).
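One way to read that division of labor as a workflow is sketched below. None of the names, labels, or thresholds come from the study; the point is only the routing pattern, in which the model pre-scores every criterion, confident calls are accepted, and anything uncertain is queued for a human reviewer who makes the final determination.

```python
# Hypothetical augmentation workflow, not the study's code: the AI pre-fills
# each criterion with a confidence score, and uncertain calls go to a human.
from dataclasses import dataclass

@dataclass
class CriterionCall:
    criterion: str
    model_label: str   # "met" or "not met"
    confidence: float  # model score in [0, 1]

def route(calls: list[CriterionCall], threshold: float = 0.85) -> dict:
    """Split criteria into AI-resolved and human-review queues (threshold is illustrative)."""
    return {
        "auto_resolved": [c for c in calls if c.confidence >= threshold],
        "needs_human_review": [c for c in calls if c.confidence < threshold],
    }

calls = [
    CriterionCall("EGFR L858R mutation documented", "met", 0.97),
    CriterionCall("no prior systemic therapy", "met", 0.62),    # ambiguous note wording
    CriterionCall("adequate organ function", "not met", 0.55),  # conflicting lab values
]

queues = route(calls)
print(f"{len(queues['auto_resolved'])} criteria auto-resolved, "
      f"{len(queues['needs_human_review'])} queued for a reviewer")
```

The final eligibility call stays with the reviewer; the model only narrows where their attention goes, which is consistent with the study’s result of better accuracy without extra time per case.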

This has implications beyond oncology. Any clinical workflow that involves reviewing complex records against detailed criteria - prior authorization, quality measurement, care coordination - faces similar challenges. The human-AI collaboration model tested here could apply broadly.

The Fine Print

The study has important limitations. The research used retrospective data with known outcomes, which differs from prospective screening where the correct answer isn’t known in advance. The specific AI system tested was designed for oncology eligibility criteria and may not generalize to other therapeutic areas.

The study also didn’t address how to scale this approach. Training clinical staff to work effectively with AI tools, maintaining the AI system as trial protocols evolve, and integrating these workflows into existing processes all present implementation challenges the paper doesn’t resolve.

Perhaps most importantly, improved prescreening is only one step in trial access. Patients must still be invited to participate, must choose to enroll, and must be able to manage the logistics of trial participation. More accurate screening helps, but it doesn’t solve the broader challenge of trial access.

Still, the core finding is clear: when it comes to complex clinical tasks, the question shouldn’t be “AI or humans?” It should be “how do we combine them effectively?”