
Inside the $50M mission to fix clinical evidence
with Brigham Hyde, Atropos Health
Show Notes
Only 14% of daily medical decisions are backed by high-quality evidence. Brigham Hyde will tell you that is actually one of the higher estimates. He has spent his career at the intersection of clinical pharmacology, data science, and healthcare technology, and what he found when he looked at the evidence gap was not a data problem - it was an automation problem. The data exists. Hundreds of millions of patient timelines have been accumulating in electronic health records for 20 years. The bottleneck is the conversion step: turning raw data into peer-reviewed-quality evidence quickly enough to matter at the point of care.
Atropos Health was born out of Stanford, where Brigham's co-founder Dr. Haw built the original "green button" - press it and run a real-time study on patient data. What used to take a research team two to five months now takes minutes. The company has 894 million patient timelines across its evidence network, has produced over 100,000 novel studies, and is now deploying Chat RWD - a generative AI interface that lets clinicians ask questions and get evidence-backed answers in the time it takes to type a text message. This is not AI replacing doctors. It is AI giving doctors the evidence they have never had before.
The Evidence Gap and Why It Exists
Clinical trials are expensive, and most are funded by pharmaceutical companies with specific commercial interests - which means they focus on narrow populations and the healthiest patients to avoid confounding effects. The result: 70% of existing trials exclude patients with comorbidities like diabetes, obesity, and heart disease, conditions that describe roughly 60-70% of actual US patients. The evidence base that physicians rely on was largely generated on what Brigham calls "white male triathletes at Memorial Sloan Kettering" - not the patients in most exam rooms.
Atropos addresses this by using observational research - the real-world data in EHRs - to generate evidence at scale. The hard part is not the data; it is doing the analysis correctly. Causal inference methods, propensity score matching, and other statistical techniques are required to isolate the effect of a treatment from the confounding variables in messy real-world data. Do it wrong and you publish findings that do not replicate or, worse, lead clinicians to the wrong conclusion. Atropos automates the rigorous methodology so that high-quality evidence can be produced in minutes rather than months, at a fraction of the cost of a traditional clinical trial.
5 Frameworks from This Episode
1. The Data-to-Evidence Conversion Model
- Having data is not the same as having evidence - the gap between them is the correct application of statistical methodology (causal inference, propensity matching, bias controls)
- A study done incorrectly on the right data will produce the wrong conclusion and may not pass peer review - which means it cannot inform clinical decisions
- Atropos automates the methodology, not just the analysis - it produces studies that meet the evidentiary standards physicians are trained to use
- The history of medicine moved from having no data (pre-1970s) → having bad data → having good data → now needing to convert good data to evidence at scale: that is the current step
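To make the gap between data and evidence concrete, here is a minimal Python sketch of propensity score matching on a synthetic toy cohort. Everything in it is an assumption for illustration: the data, the single `risk_group` confounder, and the function names are hypothetical, and it is not a description of how Atropos implements its methodology. The propensity score is estimated as the share of patients treated within each risk group, and each treated patient is compared with the untreated patients whose score is closest (matching with replacement).

```python
def make_patients():
    """Synthetic toy cohort of (risk_group, treated, bad_outcome). Not real data."""
    rows = []
    rows += [("high", 1, y) for y in (1, 1, 1, 1, 1, 0, 0, 0)]  # sicker, mostly treated
    rows += [("high", 0, y) for y in (1, 1)]
    rows += [("low", 1, y) for y in (0, 0)]                      # healthier, rarely treated
    rows += [("low", 0, y) for y in (1, 0, 0, 0, 0, 0, 0, 0)]
    return rows


def naive_difference(rows):
    """Bad-outcome rate in treated minus untreated, ignoring confounding."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def propensity_by_group(rows):
    """Estimate P(treated | risk_group) as the observed treated share per group."""
    counts = {}
    for group, t, _ in rows:
        treated_n, total_n = counts.get(group, (0, 0))
        counts[group] = (treated_n + t, total_n + 1)
    return {g: treated_n / total_n for g, (treated_n, total_n) in counts.items()}


def matched_difference(rows):
    """Average effect on the treated, matching each treated patient to the
    untreated patients with the nearest propensity score (with replacement)."""
    ps = propensity_by_group(rows)
    controls = [(ps[g], y) for g, t, y in rows if t == 0]
    diffs = []
    for group, t, y in rows:
        if t != 1:
            continue
        best = min(abs(p - ps[group]) for p, _ in controls)
        matches = [cy for p, cy in controls if abs(p - ps[group]) == best]
        diffs.append(y - sum(matches) / len(matches))
    return sum(diffs) / len(diffs)


rows = make_patients()
print(f"naive difference:   {naive_difference(rows):+.3f}")
print(f"matched difference: {matched_difference(rows):+.3f}")
```

On this toy cohort the naive comparison makes the treatment look harmful (+0.200, because sicker patients are the ones who get treated), while the matched comparison shows fewer bad outcomes among the treated (-0.325). Same data, opposite conclusion: the difference is the methodology.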
2. The Federated Data Model
- Atropos never moves patient data - the technology is brought to the data at the health system where it is stored and appropriately stewarded
- This solves the silo problem without creating new privacy or liability risks: the data never leaves the institution, but the analysis can cross 20+ data sets and 894 million patient timelines
- Cloud computing made this possible - health data is no longer locked in hospital basement servers; it is on private and hybrid clouds where computation can happen
- For any startup building on sensitive data (health, financial, legal), the federated model is worth examining: the value is in the analysis layer, not in owning the data
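The idea of bringing computation to the data can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Atropos's architecture: the `Site` class, its `run_local_query` method, and the records are all hypothetical. The point it shows is that each site answers a query with aggregate counts only, and the coordinator pools those counts without ever seeing a row of patient data.

```python
class Site:
    """Hypothetical stand-in for one health system. Records stay inside it."""

    def __init__(self, name, records):
        self.name = name
        self._records = records  # list of (took_drug, had_event); never returned

    def run_local_query(self, took_drug):
        """Run the analysis where the data lives and return aggregates only."""
        cohort = [event for drug, event in self._records if drug == took_drug]
        return {"site": self.name, "n": len(cohort), "events": sum(cohort)}


def pooled_event_rate(sites, took_drug):
    """Coordinator: combine per-site counts into one network-wide rate."""
    summaries = [site.run_local_query(took_drug) for site in sites]
    total_n = sum(s["n"] for s in summaries)
    total_events = sum(s["events"] for s in summaries)
    return total_events / total_n if total_n else None


# Synthetic records for two made-up sites.
sites = [
    Site("site_a", [(True, 1), (True, 0), (True, 0), (False, 1), (False, 1)]),
    Site("site_b", [(True, 0), (True, 0), (True, 1),
                    (False, 0), (False, 1), (False, 1)]),
]

print("event rate, took drug:   ", pooled_event_rate(sites, took_drug=True))
print("event rate, did not take:", pooled_event_rate(sites, took_drug=False))
```

A production system would need much more (authentication, query review, small-cell suppression so that tiny cohorts cannot identify individuals), but the shape is the same: the query travels, the records do not.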
3. The Shift in Glass Theory
- Each major technology platform shift is a "shift in glass" - where your eyeballs go: laptops to phones (iPhone), browsers to apps (App Store), and now apps to agents
- If an agent can go hit all your apps in the background and return the answer, why would you ever open a browser or app again?
- In healthcare: doctors today log into Epic, look up papers, type notes. The agentic shift means they talk to an agent that handles documentation (ambient AI) and calls specialized evidence agents in the background
- Stanford's agentic tumor board is an early real deployment: clinicians talk in a Teams chat, invite specialized agents to interpret images or pull evidence, without leaving their workflow
4. The Jury Approach to AI Consensus
- No single LLM is right about everything - the emerging best practice is to ask multiple models the same question and measure consensus (or divergence) across answers
- Atropos published a benchmark of 3,000 clinical questions run through ChatGPT, Claude, Gemini, Perplexity, and their own service - measuring "answered with evidence" as the quality metric
- The startup opportunity: build the orchestration layer that knows which model or agent to call for which part of a domain-specific workflow, and how to synthesize consensus back to the user
- Big tech cannot easily dominate this layer because it requires vertical domain knowledge - this is where niche expertise becomes a durable competitive advantage
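A rough Python sketch of the jury idea follows. The juror functions are hypothetical stubs standing in for calls to real models, and the `min_agreement` threshold is an arbitrary assumption. Comparing normalized strings is also a simplification: free-text clinical answers would need a semantic comparison, which this sketch does not attempt.

```python
from collections import Counter


def normalize(answer):
    """Crude normalization so 'Yes.' and ' YES ' count as the same vote."""
    return " ".join(answer.lower().split()).rstrip(".")


def jury_verdict(question, jurors, min_agreement=0.6):
    """Ask every juror the same question and measure consensus.

    jurors maps a name to a callable taking the question and returning text.
    """
    answers = {name: normalize(ask(question)) for name, ask in jurors.items()}
    top_answer, votes = Counter(answers.values()).most_common(1)[0]
    agreement = votes / len(answers)
    return {
        # Withhold an answer when the jury is too divided.
        "answer": top_answer if agreement >= min_agreement else None,
        "agreement": agreement,
        "dissenters": sorted(n for n, a in answers.items() if a != top_answer),
    }


# Hypothetical stubs; in practice each would call a different model or agent.
jurors = {
    "model_a": lambda q: "Yes.",
    "model_b": lambda q: "yes",
    "model_c": lambda q: "No",
    "model_d": lambda q: " YES ",
}

print(jury_verdict("Is drug X associated with lower risk of condition B?", jurors))
```

The useful output is not only the majority answer but the divergence: a low agreement score or a named dissenter is a signal to route the question to a specialist agent or back to a human.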
5. Drug Repositioning - Finding New Uses for Proven Drugs
- Many drugs that are already approved and proven safe in humans may work for diseases they were never tested against - but identifying those matches historically required a 15-year drug development process
- With real-world data at scale, you can run a study asking: in patients who took this drug for one condition, did the rate of a second unrelated condition go up or down?
- Brigham's gout example: Stanford researcher Dylan Dodd found two antibiotics had opposite effects on gout via gut microbiome; Atropos confirmed it in millions of patients in a single day
- The FDA's fast-track pathway for orphan drugs points toward where this could go: if a drug is proven safe and we can generate strong real-world evidence of efficacy quickly, why wait 15 years?
- Every Cure (founded by David Fajgenbaum, who used this approach to find a cure for his own rare disease) is advocating for this model at the policy level
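The "did the rate go up or down?" question can be written as a risk ratio. The Python sketch below uses made-up counts and a hypothetical function name; the confidence interval uses the standard log-scale approximation for a risk ratio. It is a crude, unadjusted comparison: a real study would first control for confounding (for example with matching) before trusting the number.

```python
import math


def risk_ratio(exposed_events, exposed_total,
               unexposed_events, unexposed_total, z=1.96):
    """Risk of condition B in patients exposed to the drug versus unexposed.

    Returns (ratio, ci_low, ci_high). All event counts must be above zero.
    """
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    ratio = risk_exposed / risk_unexposed
    # Standard error of log(ratio) for two independent proportions.
    se_log = math.sqrt(
        1 / exposed_events - 1 / exposed_total
        + 1 / unexposed_events - 1 / unexposed_total
    )
    ci_low = math.exp(math.log(ratio) - z * se_log)
    ci_high = math.exp(math.log(ratio) + z * se_log)
    return ratio, ci_low, ci_high


# Made-up counts: 30 of 1,000 exposed patients developed condition B,
# versus 60 of 1,000 unexposed patients.
ratio, low, high = risk_ratio(30, 1000, 60, 1000)
print(f"risk ratio {ratio:.2f} (95% CI {low:.2f} to {high:.2f})")
```

A ratio below 1 with an interval that excludes 1 is the kind of signal that would justify a closer look at the drug for the second condition; a ratio above 1 would flag a possible harm instead.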
Founder Experiment: Map Your Domain's Evidence Gap
Step 1 - Identify the decisions in your domain that are made on instinct rather than data. In healthcare it is 86% of clinical decisions. In your vertical - legal, finance, operations, education - what percentage of important decisions are backed by rigorous evidence vs. experience and convention? The gap is the market.
Step 2 - Find where the data already exists but is not being converted into evidence. Atropos did not create new patient data; it automated the methodology to convert existing EHR data into actionable studies. In your domain, what data is being collected but never analyzed in a way that informs decisions?
Step 3 - Benchmark current AI tools against your domain's quality standard. Run 20-30 representative questions from your vertical through ChatGPT, Claude, Gemini, and any specialist tools. Score each answer against your domain's quality criteria (not just "does it sound right?"). This is your competitive landscape map and your product gap in one exercise.
Step 4 - Design the federated version of your data model. Before building a data aggregation business, ask: can you bring computation to the data instead of moving the data? Federated models reduce regulatory friction, accelerate enterprise sales, and create network effects without the liability of owning sensitive records.
Step 5 - Identify a "green button" moment in your vertical. At Stanford, the green button was: press here, get a study. What is the single most valuable on-demand output your users would use daily if it were available instantly? Build that one thing first, prove it works, then expand - exactly as Atropos did from Stanford to 894 million patient timelines.
Q&A
Why are only 14% of medical decisions backed by high-quality evidence?
Clinical trials are expensive and mostly funded by pharmaceutical companies with narrow commercial interests, which means they target specific populations and enroll the healthiest patients to avoid confounding. The result: 70% of existing trials exclude patients with comorbidities like diabetes, obesity, and heart disease - the conditions that describe 60-70% of actual US patients. The evidence base was built on an unrepresentative slice of humanity. The rest of medical decision-making falls into what Brigham calls the art of medicine: a physician's training, experience, and judgment, unaided by a relevant study.
What is the core technical insight behind Atropos Health?
The bottleneck was never the data - it was the methodology. Taking EHR data and generating a peer-review-quality study requires causal inference techniques (propensity score matching, confounder control, bias detection) that are time-consuming and technically demanding. Atropos automated that methodology so studies that previously took research teams two to five months to produce can now be generated in minutes. The innovation is not data access; it is the automation of scientific rigor at scale.
What happened when a Stanford neurologist used Atropos on a 12-year-old with distal nerve pain?
The neurology team suspected early-onset MS, which would have required a spinal tap and extensive imaging - a frightening and expensive workup for a child. One team member used Atropos to query what had happened to similar patients in the Stanford data. The study came back in four hours (today it would take minutes): 85% of the roughly 300 matching cases had a latent viral infection within the prior two weeks. The team asked the family - yes, the child had been sick recently. They gave corticosteroids, the nerve pain resolved within 24 hours, the family avoided a $30,000 procedure, and the child went home in two days. It was a win for the patient, the physician, and healthcare costs at the same time.
How does Atropos handle the privacy and silo problems in health data?
Atropos uses a federated model: the technology is brought to the data at each health system rather than the data being moved to a central warehouse. Patient timelines never leave the institution where they are stored. This sidesteps the privacy and regulatory complexity of data aggregation while still allowing analysis across 20+ data sets and 894 million patient timelines. Cloud computing made this feasible - health data is no longer in hospital basement servers but on private and hybrid clouds where computation can happen in place.
What is Chat RWD and how does it work?
Chat RWD is Atropos's generative AI interface, built on a RAG (Retrieval-Augmented Generation) framework over its evidence library. Clinicians type natural-language questions - the same way they would text a colleague - and the system surfaces relevant studies with an 'answered with evidence' badge (green if directly supported, yellow or red if evidence is partial or missing). Atropos tested its system alongside ChatGPT, Claude, Gemini, and Perplexity across 3,000 clinical questions; the benchmark measures whether the answer is backed by a citable study, not just whether it sounds plausible.
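The retrieve-then-badge pattern described above can be sketched in Python. This is a toy under loud assumptions, not Chat RWD's implementation: the library entries are placeholders rather than real studies, retrieval is plain word overlap where a real RAG system would typically use embeddings, the generation step is omitted, and the `green_at` and `yellow_at` thresholds are arbitrary.

```python
import re

# Hypothetical placeholder entries, not real studies.
EVIDENCE_LIBRARY = [
    {"id": "study-001",
     "summary": "corticosteroids for post-viral nerve pain in children"},
    {"id": "study-002",
     "summary": "antibiotic exposure and gout incidence in adults"},
]


def tokens(text):
    """Lowercased set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, library):
    """Return (score, study_id) for the best match. The score is the share
    of the study summary's words that also appear in the question."""
    question_words = tokens(question)
    scored = []
    for study in library:
        summary_words = tokens(study["summary"])
        overlap = len(question_words & summary_words) / len(summary_words)
        scored.append((overlap, study["id"]))
    return max(scored)


def badge(score, green_at=0.6, yellow_at=0.3):
    """Map a retrieval score to an 'answered with evidence' color."""
    if score >= green_at:
        return "green"
    if score >= yellow_at:
        return "yellow"
    return "red"


def answer_with_badge(question, library=EVIDENCE_LIBRARY):
    score, study_id = retrieve(question, library)
    color = badge(score)
    # Only cite a study when the evidence is at least partially relevant.
    return {"badge": color, "citation": study_id if color != "red" else None}


print(answer_with_badge("Does antibiotic exposure change gout incidence in adults?"))
print(answer_with_badge("What is the best diet for marathon runners?"))
```

The design choice worth noting is that a red badge returns no citation at all: the system says the evidence is missing instead of stretching a weak match into an answer.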
What is the drug repositioning opportunity and why does it matter?
Thousands of drugs that are already approved and proven safe in humans may be effective against diseases they were never studied for - but without a study connecting them, clinicians have no evidence base for prescribing them for those uses and the FDA cannot approve the new indication. With real-world data, you can ask: in patients who took drug X for condition A, did the rate of condition B go up or down? Brigham's example: Stanford researcher Dylan Dodd found two antibiotics had opposite effects on gout via gut microbiome shifts. Atropos confirmed the finding across millions of patients in one day. The 15-year drug approval timeline drops to weeks if the drug is already proven safe.
What does the agentic AI shift look like inside a hospital today?
Stanford is deploying Microsoft's agentic orchestration platform with an early use case: a tumor board that runs in Teams chat. Clinicians discuss cases in the chat and can invite specialized agents - an imaging agent to read scan dimensions, an evidence agent (Atropos) to pull relevant clinical studies for a specific patient profile, a documentation agent to handle note-writing. Nobody has to log into the EHR, look up papers in a separate system, or switch contexts. Brigham also described Atropos running passively in the background during visits: absorbing the doctor-patient conversation, identifying the key clinical decisions, and surfacing relevant evidence before the physician even types a question.
Why is the evidence gap a startup opportunity rather than just a healthcare policy problem?
Because lowering the cost of generating studies creates a business, not just a public good. Life science companies currently devote large teams and two to five months to producing a single real-world evidence study for drug development. Atropos does it in a day. That automation has direct commercial value - faster R&D cycles, cheaper evidence generation, earlier signal on drug efficacy. On the health system side, every avoided unnecessary procedure (like the spinal tap) is a cost saving. The evidence gap is a policy failure, and it is also a massive inefficiency that technology can monetize while fixing.
What is Brigham's view on AI replacing physicians?
He is not building a physician replacement - he is building the evidence layer that physicians have never had. His framing: if you give a well-trained physician a peer-reviewed study, they know exactly what to do with it. The problem is there are not enough studies. Atropos generates those studies on demand. The physician still makes the clinical judgment; the AI provides the evidentiary foundation that was previously missing. The parallel to AI in other professions: the tool does not replace the expert's judgment, it improves the information environment that judgment operates in.