AutoDiscovery: AI That Asks Its Own Scientific Questions

Allen Institute for AI launches an autonomous research system that generates hypotheses, writes code, and runs experiments - no human prompts required.

Most AI tools wait for you to ask a question. AutoDiscovery asks its own.

The Allen Institute for AI (Ai2) launched AutoDiscovery on February 12, releasing an AI system that autonomously explores scientific datasets - generating hypotheses, writing Python code, running statistical experiments, and feeding the results back into the next round of hypotheses. No human prompting is required beyond uploading the data.

The tool is now available in AstaLabs, Ai2’s research platform that provides access to over 108 million academic abstracts and 12 million full-text papers.

How It Works

AutoDiscovery operates in a continuous loop. You give it a dataset. It identifies patterns worth investigating, formulates hypotheses in natural language, designs experiments to test them, writes and executes the code, interprets the statistical results, and feeds those findings back into hypothesis generation.
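The loop above can be sketched in miniature. Everything here is illustrative - toy data, invented names, and a plain correlation test standing in for the system's real experiments; this is not Ai2's actual pipeline:

```python
import itertools
import statistics

# Toy sketch of the hypothesize -> experiment -> interpret loop.
# The dataset and the correlation "experiment" are invented for
# illustration; this is not Ai2's actual pipeline.

data = {
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 4, 6, 8, 10, 12],   # perfectly correlated with x
    "z": [5, 1, 4, 2, 6, 3],     # unrelated noise
}

def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    denom = (sum((u - ma) ** 2 for u in a) *
             sum((v - mb) ** 2 for v in b)) ** 0.5
    return cov / denom

findings = []
# 1. Hypothesize: any pair of columns might be related.
for col_a, col_b in itertools.combinations(data, 2):
    # 2. Experiment: compute the test statistic.
    r = correlation(data[col_a], data[col_b])
    # 3. Interpret: keep only strong effects as findings.
    if abs(r) > 0.9:
        findings.append((col_a, col_b, round(r, 2)))

# 4. In the real system, findings feed back into step 1.
print(findings)  # [('x', 'y', 1.0)]
```

The real system replaces each numbered step with an LLM call or generated code, but the control flow - a closed loop rather than a single question-answer exchange - is the same.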

Two technical innovations power the system:

Bayesian Surprise measures how much evidence changes the AI’s beliefs about a hypothesis. The system starts with prior beliefs extracted from its language model knowledge, then updates them based on data. Large belief shifts in either direction - confirmation or disconfirmation - signal discoveries worth attention. Unexpected disconfirmations can be as valuable as confirmations, revealing genuine findings rather than obvious patterns.
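One common formalization of Bayesian surprise is the KL divergence between posterior and prior beliefs - Ai2 has not published its exact formulation, so treat the Beta-Bernoulli setup below as an assumption for illustration:

```python
import math

# Illustrative "Bayesian surprise" as KL(posterior || prior) over a
# hypothesis's success probability. The Beta-Bernoulli model here is an
# assumption for demonstration, not Ai2's published method.

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1)."""
    ln_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - ln_beta)

def surprise(prior, posterior, n=1000):
    """Discretized KL(posterior || prior) over (0, 1), in nats."""
    total = 0.0
    for i in range(1, n):
        x = i / n
        p = beta_pdf(x, *posterior)
        q = beta_pdf(x, *prior)
        total += p * math.log(p / q) / n
    return total

# Prior: the model expects the hypothesis to hold about half the time.
prior = (5, 5)
# Data: 18 confirmations, 2 failures -> posterior Beta(23, 7).
posterior = (5 + 18, 5 + 2)
print(round(surprise(prior, posterior), 3))
```

Note the symmetry this captures: with the same prior, 2 confirmations and 18 failures shift beliefs exactly as far as 18 and 2 - a strong disconfirmation scores the same surprise as a strong confirmation, matching the point above that disconfirmations can be equally valuable.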

Monte Carlo Tree Search (MCTS) navigates the vast space of possible hypotheses. The algorithm balances exploring new ideas against investigating promising leads more deeply. This prevents the system from either getting stuck in one area or wandering aimlessly.
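The explore/exploit balance in MCTS is classically driven by the UCB1 selection rule. A toy single-level sketch - the hypothesis "branches" and their hidden payoffs are invented, and a full MCTS would add expansion and backup through a tree:

```python
import math
import random

# Toy sketch of MCTS-style selection via UCB1. The three hypothesis
# "branches" and their hidden promise values are invented; real MCTS
# also expands and backs up through a full tree of hypotheses.

def ucb1(total_reward, visits, parent_visits, c=1.4):
    """Average reward plus an exploration bonus that shrinks with visits."""
    if visits == 0:
        return float("inf")  # always try an unvisited branch first
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)

random.seed(0)
promise = {"branch_a": 0.2, "branch_b": 0.8, "branch_c": 0.4}
stats = {b: [0.0, 0] for b in promise}  # [total reward, visits]

for t in range(1, 501):
    # Select the branch with the best upper confidence bound...
    best = max(stats, key=lambda b: ucb1(*stats[b], parent_visits=t))
    # ...simulate an "experiment" on it, and record the outcome.
    reward = 1.0 if random.random() < promise[best] else 0.0
    stats[best][0] += reward
    stats[best][1] += 1

visits = {b: n for b, (_, n) in stats.items()}
print(visits)  # branch_b accumulates by far the most visits
```

The exploration bonus guarantees every branch keeps getting occasional visits (no tunnel vision on one area), while the average-reward term concentrates effort on the most promising leads (no aimless wandering).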

Real Applications

Cancer researchers at the Swedish Cancer Institute used AutoDiscovery to analyze breast cancer mutation datasets. The system identified a potential mutual-exclusivity pattern: PIK3CA mutations correlated with a lower TP53 mutation frequency than expected by chance - a finding with potential implications for targeted therapies.
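A mutual-exclusivity signal of this kind is typically checked with a contingency-table test. A minimal sketch using a one-sided Fisher's exact test - the counts below are invented for illustration and are not the study's data:

```python
from math import comb

# Sketch of a mutual-exclusivity check: a one-sided Fisher's exact test
# on a 2x2 table of mutation counts. The counts are invented for
# illustration, not taken from the actual study.

def hypergeom_pmf(k, row1, row2, col1):
    """P(top-left cell == k) under fixed table margins."""
    return comb(row1, k) * comb(row2, col1 - k) / comb(row1 + row2, col1)

def fisher_left_tail(a, b, c, d):
    """P-value for observing <= a co-occurrences (mutual exclusivity)."""
    row1, row2, col1 = a + b, c + d, a + c
    return sum(hypergeom_pmf(k, row1, row2, col1) for k in range(a + 1))

# Rows: gene-1 mutant / wild-type; columns: gene-2 mutant / wild-type.
# Invented scenario: only 2 of 40 gene-1-mutant tumors also carry a
# gene-2 mutation, far fewer than independence would predict (~12.8).
p = fisher_left_tail(a=2, b=38, c=30, d=30)
print(f"one-sided p = {p:.2e}")
```

A tiny p-value here means co-occurrence is rarer than chance predicts - the statistical shape of the PIK3CA/TP53 pattern described above.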

Marine ecologists fed it 20 years of ecosystem data. AutoDiscovery uncovered trophic relationships - predator-prey dynamics and food web connections - that would have taken human researchers months to extract manually.

In both cases, the AI-generated findings were independently verified and published in peer-reviewed literature by November 2025, before the public release.

Part of a Larger Shift

AutoDiscovery builds on Theorizer, which Ai2 released in January. While Theorizer synthesizes theories from scientific literature (reading up to 100 papers per query and extracting patterns), AutoDiscovery works with raw data. Together, they represent a shift from AI as research assistant to AI as research participant.

The contrast with current AI tools is stark. ChatGPT and Claude answer your questions. AutoDiscovery generates the questions themselves - then answers them, then asks follow-up questions based on what it learned.

Limitations and Access

AutoDiscovery requires structured datasets. It won’t parse your scribbled lab notebooks or unorganized spreadsheets. The system also needs computational resources - extended discovery runs can take hours.

Ai2 is offering 1,000 free “Hypothesis Credits” to early users through February 28, 2026. After that, the service will require a paid subscription through AstaLabs.

The system runs on Google Cloud Platform infrastructure, with Ai2 positioning it as a way to democratize large-scale scientific inquiry. Smaller research teams without extensive computational resources or data science expertise can now run discovery pipelines that would previously require dedicated staff.

What This Means

AutoDiscovery represents a philosophical shift in how AI might participate in science. The standard model - human asks question, AI provides answer - assumes humans know what to ask. But in complex datasets, the most valuable questions are often the ones researchers didn’t think to pose.

The risk, of course, is p-hacking at scale. An AI that generates thousands of hypotheses and tests them all will inevitably find spurious correlations. Ai2 addressed this by building skepticism into the system: Bayesian Surprise penalizes findings that merely confirm what the model already expected and rewards results that genuinely shift its beliefs - in either direction.
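The scale of that risk is easy to demonstrate: under a true null hypothesis, p-values are uniform, so screening thousands of hypotheses at p < 0.05 manufactures false positives by construction. A toy simulation (invented numbers, no real data):

```python
import random

# Toy illustration of p-hacking at scale: 1,000 hypotheses that are all
# false by construction still yield ~50 nominal "discoveries" at 0.05.

random.seed(42)
alpha = 0.05
hits = 0
for _ in range(1000):
    # Under a true null hypothesis, the p-value is uniform on [0, 1].
    p = random.random()
    if p < alpha:
        hits += 1

print(hits)  # close to 1000 * alpha = 50
```

Any system that runs hypothesis tests at this volume needs some guard against exactly this - which is the role the surprise-based filtering is meant to play.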

Whether this approach produces genuine scientific insight or sophisticated-looking noise will depend on how researchers use it. The tool is powerful. The question is whether scientists will treat its outputs as starting points for investigation or conclusions to be published.

Ai2 is betting on the former. The peer-reviewed publications from early tests suggest they might be right.