A patient arrives at the emergency room with a sudden, crushing headache. A brain MRI is ordered. The images go into a queue. Somewhere in that queue - behind routine follow-ups, chronic condition check-ins, and post-surgical assessments - the scan waits for a radiologist who is almost certainly overworked.
For strokes and brain hemorrhages, every minute of delay costs neurons. Yet the average wait time for MRI interpretation varies wildly depending on the hospital, the hour, and how many radiologists are on staff. In many institutions, emergency reads still depend on whoever happens to be available.
Researchers at the University of Michigan have built something that could change this. Their system, called Prima, is a vision language model that reads brain MRI scans in seconds, identifies neurological conditions across 52 diagnostic categories, and - critically - knows when to sound the alarm. The research, led by neurosurgeon Dr. Todd Hollon, was published this month in Nature Biomedical Engineering.
What Prima Actually Does
Prima isn’t another narrow AI tool designed to spot one type of lesion or flag a single condition. It’s a foundation model for neuroimaging - a system that processes full MRI studies the way they arrive in clinical practice, not the way they’re prepared for research benchmarks.
As a vision language model, Prima can process video, images, and text together in real time. Feed it an MRI study along with the patient’s clinical history and the reason the scan was ordered, and it returns a diagnosis, a confidence score, and a priority recommendation. If it detects something like a stroke or brain hemorrhage, it can automatically alert the appropriate subspecialist - a stroke neurologist, a neurosurgeon - before a human has even opened the file.
That last part matters. Most AI medical imaging tools produce a finding. Prima produces an action: this patient needs a specific type of specialist, and they need one now.
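The finding-to-action step can be pictured as a small triage layer on top of the model's output. This is a hypothetical sketch, not the published system: the diagnosis labels, the 0.9 alert threshold, and the routing table are all illustrative assumptions, stand-ins for whatever Prima actually uses.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative mapping from urgent findings to the on-call subspecialist
# (assumed labels; the real system's categories and routes may differ).
URGENT_ROUTES = {
    "acute_ischemic_stroke": "stroke neurologist",
    "intracranial_hemorrhage": "neurosurgeon",
}

@dataclass
class TriageResult:
    diagnosis: str
    confidence: float        # model confidence in [0, 1]
    priority: str            # "stat" or "routine"
    notify: Optional[str]    # specialist to page, if any

def triage(diagnosis: str, confidence: float,
           alert_threshold: float = 0.9) -> TriageResult:
    """Turn a model finding into an action: a priority plus an optional page."""
    urgent = diagnosis in URGENT_ROUTES and confidence >= alert_threshold
    return TriageResult(
        diagnosis=diagnosis,
        confidence=confidence,
        priority="stat" if urgent else "routine",
        notify=URGENT_ROUTES[diagnosis] if urgent else None,
    )
```

With this shape, a high-confidence hemorrhage read becomes a stat study plus a page to neurosurgery, while a routine finding simply joins the ordinary worklist.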
The Numbers
Hollon’s team trained Prima on the full digital radiology archive at University of Michigan Health: more than 200,000 MRI studies comprising 5.6 million imaging sequences. Unlike previous approaches that relied on small, hand-curated datasets, this training data included the full messy reality of clinical practice - every scan collected since the system went digital, along with the patients’ clinical histories and the physicians’ reasons for ordering each study.
They then tested it over a full year in a health system-wide study covering 29,431 MRI studies. Across 52 radiologic diagnoses spanning major neurological disorders, Prima achieved a mean diagnostic area under the curve (AUC) of 92.0%, with accuracy reaching 97.5% on certain conditions. It outperformed other state-of-the-art AI models, including general-purpose models, in head-to-head comparisons.
Those numbers are impressive, but the testing conditions deserve as much attention as the results. This wasn’t a curated benchmark set. It was a year of real clinical scans - the scans that actually arrive at a hospital, with all the variation in imaging quality, patient movement, and clinical complexity that entails.
Why This Is Different
The medical AI field has no shortage of tools that perform well on clean datasets and then struggle in practice. Just last week, an Oxford study found that AI chatbots scoring 94.9% on structured medical scenarios dropped to 34.5% when interacting with real patients describing their symptoms in plain language.
Prima was designed to avoid that trap. By training on health-system-scale data rather than curated subsets, the team forced the model to learn from the same distribution of cases it would encounter in deployment. The model didn’t get clean textbook examples. It got the full spectrum: common conditions alongside rare ones, clear-cut cases alongside ambiguous ones, high-quality scans alongside technically difficult ones.
The team also built in clinical context. Prior MRI AI tools typically receive an image and nothing else. Prima receives the clinical note - the reason the doctor ordered the scan in the first place. A 35-year-old with a new seizure gets a different diagnostic framework than a 70-year-old being monitored for a known brain tumor, even if the scans look similar. This contextual understanding is something radiologists do automatically. Most AI tools don’t.
Hollon’s team describes Prima as “a co-pilot for interpreting medical imaging studies” - not a replacement for radiologists, but a system that handles the initial read, flags urgencies, and routes patients to the right specialist faster than a queue-based workflow allows.
The Workforce Problem
This matters because the workforce problem in radiology isn’t getting better. The United States faces a growing shortage of radiologists, particularly neuroradiologists - the subspecialists qualified to interpret brain MRIs. Scan volumes keep increasing as imaging technology improves and clinical indications expand. The bottleneck is human reading capacity.
After hours, when many strokes and hemorrhages present, the situation is worse. Emergency departments often rely on teleradiology services or on-call radiologists reading from home, adding latency to time-sensitive cases. A system that can identify the urgent cases and push them to the front of the line - in seconds rather than minutes or hours - addresses one of the most dangerous failure points in acute neurological care.
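Pushing urgent cases to the front of the line is, mechanically, a priority queue. A minimal sketch, assuming a two-level stat/routine scheme with FIFO ordering within each level (the class and level names are hypothetical, not from the paper):

```python
import heapq
import itertools

class ReadingWorklist:
    """Illustrative radiology worklist: AI-flagged stat studies are
    read before routine ones; within a level, first-in, first-out."""
    PRIORITY = {"stat": 0, "routine": 1}  # lower value pops first

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: arrival order

    def add(self, study_id, priority="routine"):
        entry = (self.PRIORITY[priority], next(self._arrival), study_id)
        heapq.heappush(self._heap, entry)

    def next_study(self):
        """Pop the study the radiologist should read next."""
        return heapq.heappop(self._heap)[2]
```

The effect is exactly the scenario the article describes: a hemorrhage case that arrives last still gets read first, while routine follow-ups keep their original order behind it.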
The Caveats
The researchers are clear that Prima is in its “initial stage of evaluation.” The year-long study was conducted at a single institution using that institution’s own imaging data. Whether the model generalizes to other hospitals with different MRI equipment, imaging protocols, and patient populations remains to be tested.
Future work will focus on incorporating more detailed electronic medical records to improve accuracy further. The model also hasn’t been tested in a prospective clinical trial - a study where it actively influences patient care in real time, rather than being evaluated retrospectively on existing scans.
And there’s the broader question of what happens when AI triage systems make mistakes. A false negative - a hemorrhage the model doesn’t flag - could mean a delay in care that might not have occurred under the existing workflow. The bar for deployment isn’t just “better than nothing.” It’s “better than the current system, including its human fallibility.”
What This Means
Prima represents a shift in medical AI from narrow, task-specific tools to foundation models that approach clinical problems the way clinicians do - by integrating multiple data types, considering context, and making decisions about urgency.
The fact that it was trained on a health system’s entire digital archive rather than a curated dataset is significant. It means the model was built to handle the actual distribution of clinical cases, not an idealized version of them. Whether that translates to real-world performance at other institutions is the key question still unanswered.
The broader potential is substantial. The team has noted that Prima’s approach could extend to other imaging modalities - mammograms, chest X-rays, ultrasounds - using the same training philosophy of health-system-scale data combined with clinical context. If the architecture works across modalities, it could fundamentally change how imaging backlogs are managed.
For now, though, the most immediate impact would be the simplest one: an AI that reads a brain scan before a human does, and calls the neurosurgeon when seconds matter.