Early access — breast cancer biology

Fewer than half of landmark cancer studies hold up.

We're fixing that.

Norath Bio scores published cancer biology findings by how well they have been independently reproduced. We are starting with breast cancer cell line research, one of the most published domains in cancer biology and a field with a well-documented reproducibility crisis. For the first time, researchers can see which findings are solid and which have never been verified by a second lab. The data exists. Nobody has connected it. Until now.

For researchers, lab directors, and drug development teams.

<50%

of landmark preclinical cancer studies hold up when independently tested

Reproducibility Project: Cancer Biology, eLife 2021

47 of 53

landmark cancer studies could not be confirmed by Amgen's in-house scientists, in replication work spanning roughly a decade

Begley & Ellis, Nature 2012

2%

of cancer experiments make their raw data publicly accessible

Reproducibility Project: Cancer Biology, eLife 2021

The data needed to tell which findings hold up already exists across millions of published papers. Norath connects it.

The platform

A reproducibility score for every finding in breast cancer biology.

Norath ingests published breast cancer papers, extracts each finding as a structured record, and maps the citation relationships between studies to identify which findings have been independently replicated and which haven't. Every finding gets a score. Every score has a reason.
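To make the idea concrete, here is a minimal Python sketch of what a structured finding record and a typed citation link could look like. It is an illustration only, not Norath's production schema: the class names, the field names, and the rule that "independent" means "from a different lab" are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class CitationKind(Enum):
    """Hypothetical labels for how one finding cites another."""
    REPLICATION = "replication"  # citing study re-ran the experiment
    BUILDS_ON = "builds_on"      # citing study assumes the result and extends it
    MENTION = "mention"          # passing reference only


@dataclass(frozen=True)
class Finding:
    """One extracted finding. All field names are illustrative assumptions."""
    finding_id: str
    paper_id: str
    lab: str
    claim: str


@dataclass(frozen=True)
class Citation:
    """A typed link from a citing finding to the finding it cites."""
    citing: Finding
    cited: Finding
    kind: CitationKind
    supports: bool = True  # False when the citing study contradicts the result


def independent_replications(target, citations):
    """Return replication attempts of `target` made by a different lab."""
    return [
        c for c in citations
        if c.cited.finding_id == target.finding_id
        and c.kind is CitationKind.REPLICATION
        and c.citing.lab != target.lab
    ]


# Made-up example data, for illustration only.
original = Finding("f1", "p1", "Lab A", "Gene X knockdown slows growth")
same_lab = Finding("f2", "p2", "Lab A", "Gene X knockdown slows growth")
other_lab = Finding("f3", "p3", "Lab B", "Gene X knockdown slows growth")
passing = Finding("f4", "p4", "Lab C", "Gene Y and drug response")

links = [
    Citation(same_lab, original, CitationKind.REPLICATION),
    Citation(other_lab, original, CitationKind.REPLICATION),
    Citation(passing, original, CitationKind.MENTION),
]

print(len(independent_replications(original, links)))  # prints 1
```

Only the Lab B link counts here: the same-lab repeat is not independent, and the passing mention is not a replication attempt.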

This is not a literature search engine. PubMed already exists. This is a verification layer — the missing infrastructure between published and proven.

01

Ask a question. Get a real answer.

Type a plain English query — 'Show me all BRCA1 findings in breast cancer cell lines' — and get back a ranked list of specific findings with their experimental metadata and replication record, not just a list of papers to go read yourself.

02

Every finding, scored 0–100.

Each finding is scored based on how many independent labs attempted to reproduce it, how many succeeded, how large the original study was, and whether effect sizes were reported. Green means confirmed. Red means one lab found it once and nobody ever checked.
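As a rough illustration of how those four inputs could combine into a single 0–100 number, here is a minimal Python sketch. It is not Norath's actual scoring formula: the function name, the 70/20/10 weighting, the saturation curve for replication attempts, and the log scaling of sample size are all assumptions chosen for this example.

```python
import math


def reproducibility_score(attempts, successes, sample_size, effect_size_reported):
    """Toy 0-100 score. Weights and curves are illustrative assumptions."""
    if attempts < 0 or successes < 0 or successes > attempts:
        raise ValueError("need 0 <= successes <= attempts")
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")

    # Replication record: up to 70 points (assumed weight).
    if attempts == 0:
        replication = 0.0  # nobody ever checked
    else:
        success_rate = successes / attempts
        # More independent attempts means more trust, with diminishing returns.
        coverage = 1 - 0.5 ** attempts
        replication = 70 * success_rate * coverage

    # Original study size: up to 20 points, log-scaled, capped at n = 1000.
    size = 20 * min(1.0, math.log10(sample_size) / 3)

    # Effect sizes reported: flat 10 points.
    reporting = 10 if effect_size_reported else 0

    return round(replication + size + reporting)


# A finding reported once, n=3, never re-tested, no effect size reported.
unchecked = reproducibility_score(0, 0, 3, False)

# A finding with four independent attempts, all successful, n=300.
confirmed = reproducibility_score(4, 4, 300, True)

print(unchecked, confirmed)
```

Under these assumed weights, a finding that nobody has re-tested can never score above 30, however large the original study was.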

03

The contradictions nobody connected.

A 2022 paper that quietly failed to replicate a 2018 finding. A result that has been cited 340 times but contradicted twice. Norath surfaces the relationships the literature has never made visible — because the cost of missing them is enormous.

04

Get notified when the ground shifts.

Set an alert on any gene, drug, or finding. When a new paper affects the reproducibility score of something you care about, you'll know immediately — not months later when the damage is done. This feature is currently in development and will be available to early access users first.

Who uses it

Built for everyone who can't afford to be wrong.

Researchers & lab directors

Know whether the foundational findings your work depends on are solid before you spend three years building on them. Run due diligence before every grant application.

Highest ROI

Drug development teams

Evaluate target validity before committing $50M in development resources. Monitor your foundational assumptions in real time as new papers are published.

Research funders

Identify where the highest-citation, lowest-reproducibility findings are concentrated. Allocate funding toward verification where it matters most.

Current coverage

Starting with breast cancer.

We begin with breast cancer cell line research. Every paper. Every finding. Indexed and ready.

5,663

Papers indexed

5,629

Findings extracted

2010–2024

Years covered

All cancer types

Coming next

Why now

This couldn't have been built two years ago.

Correctly classifying whether a citation represents a genuine replication attempt, rather than a passing mention or a study that simply builds on the result, requires understanding the biology, not just the text. That kind of domain-aware extraction at scale wasn't feasible at reasonable cost until large language models arrived.

Norath was founded by Patrick Callahan, a cell and molecular biology graduate and practicing data engineer. The biological judgment that makes our scores accurate is baked into the platform at the extraction layer: knowing that n=3 in a cell culture study is not the same as n=300, that "HEK293" and "human embryonic kidney 293" name the same cell line, and that a supporting citation and a genuine replication are completely different things.

That's not something you can bolt on. It's the whole thing.
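One small piece of that judgment, recognizing that differently written names refer to the same cell line, can be sketched in a few lines of Python. This is a simplified illustration rather than the platform's actual extraction logic: the alias table is a tiny hand-written sample and the function name is hypothetical.

```python
import re

# Tiny hand-written alias table, for illustration only.
_ALIASES = {
    "HEK293": ["HEK293", "HEK-293", "HEK 293", "human embryonic kidney 293"],
    "MCF-7": ["MCF-7", "MCF7", "MCF 7"],
    "MDA-MB-231": ["MDA-MB-231", "MDA MB 231", "MDAMB231"],
}


def _key(name):
    """Lowercase the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


# Reverse lookup: normalized alias -> canonical name.
_LOOKUP = {
    _key(alias): canonical
    for canonical, aliases in _ALIASES.items()
    for alias in aliases
}


def canonical_cell_line(name):
    """Return the canonical cell line name, or None if it is not in the table."""
    return _LOOKUP.get(_key(name))


print(canonical_cell_line("hek-293"))                     # prints HEK293
print(canonical_cell_line("Human Embryonic Kidney 293"))  # prints HEK293
print(canonical_cell_line("mcf7"))                        # prints MCF-7
```

A lookup table like this only catches spelling variants. Deciding whether two experiments actually test the same claim takes the kind of biological judgment described above.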

Science only moves forward on evidence.

We're building our early access list now. Research labs and drug development teams get priority access when we launch.