Early access — oncology drug development

The live evidence layer
for drug discovery.

Drug programs fail in Phase 2 because they're built on preclinical findings that were never independently verified. Norath automatically synthesizes 97,000+ indexed papers, human genetic association evidence, and clinical trial history into a complete target validation assessment — scoring every foundational finding, surfacing every hidden contradiction, and flagging every high-citation assumption that has never been independently tested. What takes a senior scientist 2–4 weeks, Norath generates in 60 seconds.

97,000+ papers indexed. 15 cancer types. Integrates Open Targets genetic evidence and ClinicalTrials.gov trial data alongside our proprietary preclinical replication database.

For drug development teams, researchers, and lab directors.
Free for individual academic researchers. Paid plans for labs and drug development teams.

<50%

of landmark preclinical cancer studies replicate when independently tested

Reproducibility Project: Cancer Biology, eLife 2021

47 of 53

landmark cancer studies failed Amgen's internal replication attempt — after 10 years and hundreds of millions spent

Begley & Ellis, Nature 2012

$150–300M

average cost of a Phase 2 oncology trial failure. Most are built on foundational findings that were never independently verified before program initiation.

The data to answer this question already exists across millions of published papers. Norath connects it.

The platform

Not a report you run once. The system your target validation workflow runs through.

This is not a literature search engine. PubMed already exists. This is the verification infrastructure that sits inside the most consequential decision in drug development — should we bet $150 million on this target?

Norath is where drug development teams track the evidence quality of every active program — from target identification through IND filing. The evidence base shifts as new papers are published. Programs initiated on solid evidence can develop contradictions. Unreplicated findings can get confirmed. Norath is the live view of where the science stands — updated continuously as new evidence enters the literature.

When you're deciding whether to initiate

Target Premise Report

Enter any gene or drug target. In 60 seconds, Norath generates a structured evidence assessment integrating three data sources simultaneously: human genetic association scores from Open Targets, active and discontinued clinical trials from ClinicalTrials.gov, and the complete preclinical replication record from our database of 97,000+ indexed papers.

Every foundational finding scored. Every contradiction surfaced. Every high-citation unverified assumption flagged. Exportable as a PDF for target validation committees.

This report currently takes 2–4 weeks of senior scientist time to assemble manually. Norath generates it in 60 seconds.

When you're monitoring an active program

Evidence Alerts

A target validation team watches a program for 2–3 years as new evidence accumulates. Every new paper that confirms or contradicts a foundational finding changes the program's risk profile.

Set an alert on any gene, drug, or finding. When new published evidence changes the replication score of something your program depends on, you'll know immediately. Available to early access users first.

When you're doing due diligence

Retrospective Failure Analysis

For any discontinued Phase 2 or Phase 3 oncology program, Norath shows the replication quality of the preclinical evidence that existed before the program was initiated — using only evidence available at the time of the decision.
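The "evidence available at the time of the decision" constraint can be pictured as a simple date cutoff. The sketch below is illustrative only: the `Evidence` record, its fields, and the `evidence_as_of` helper are hypothetical names, not Norath's actual data model or API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Evidence:
    """Hypothetical record: one published piece of evidence about a claim."""
    claim: str
    published: date
    kind: str  # e.g. "replication", "contradiction", "mention" (assumed labels)


def evidence_as_of(records: list[Evidence], decision_date: date) -> list[Evidence]:
    """Keep only records published strictly before the decision date."""
    return [r for r in records if r.published < decision_date]


# Made-up example data for a placeholder target.
records = [
    Evidence("target X drives tumor growth", date(2012, 3, 1), "mention"),
    Evidence("target X drives tumor growth", date(2016, 9, 15), "replication"),
]

# A program initiated in January 2014 could not have seen the 2016 replication.
visible = evidence_as_of(records, date(2014, 1, 1))
print([r.kind for r in visible])
```

Filtering before scoring, rather than after, is what keeps a retrospective analysis from crediting a program with evidence that did not yet exist.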

In program after program, the pattern is the same: the foundational preclinical findings that justified hundreds of millions in clinical investment had never been independently verified. Average replication score on foundational evidence for failed programs: 26 out of 100.

When your scientists need to know what to trust

Finding Intelligence

Query any gene, drug, pathway, or cell line in plain English. Get back every finding in our database — ranked by replication score, with the complete evidence landscape for each claim: which labs confirmed it, which contradicted it, which findings extended it into other cancer types, and which high-citation papers have never been independently tested.

The cost of one wrong bet

$150–300M

The average Phase 2 oncology trial failure. Most programs are initiated on foundational preclinical findings that have never been independently verified. One avoided failure pays for decades of Norath access.

Without Norath

2–4 weeks

A translational scientist manually searches PubMed, reads abstracts and full texts, chases citations through reference lists, consults colleagues, and builds a summary document for the target validation committee. Manual citation chasing alone can take days for a single foundational claim.

Based on 20–50 target evaluations per year, large pharma companies spend $150K–$750K annually in senior scientist time on this workflow alone.
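The range above follows from the figures already quoted: 20–50 evaluations per year at 2–4 weeks each. The weekly cost of $3,750 used below is an assumption, chosen because it is the rate implied by the quoted range; it is not a published figure.

```python
# Back-of-envelope check of the annual cost range quoted above.
# WEEKLY_COST_USD is an assumption: the rate implied by the quoted figures.
WEEKLY_COST_USD = 3_750


def annual_cost(evaluations_per_year: int, weeks_per_evaluation: int,
                weekly_cost: int = WEEKLY_COST_USD) -> int:
    """Scientist time spent on manual target evaluation, in USD per year."""
    return evaluations_per_year * weeks_per_evaluation * weekly_cost


low = annual_cost(20, 2)   # fewest evaluations, fastest turnaround
high = annual_cost(50, 4)  # most evaluations, slowest turnaround
print(f"${low:,} to ${high:,}")
```

Under that assumed rate, the low end is 40 scientist-weeks and the high end is 200 scientist-weeks per year.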

With Norath

60 seconds

Enter a target and Norath generates the complete Target Premise Report — integrating Open Targets genetic evidence, ClinicalTrials.gov trial history, and the full preclinical replication record. Structured, scored, with every contradiction surfaced and every unverified assumption flagged. Exportable as PDF for the validation committee.

Retrospective validation

We ran Norath against discontinued Phase 2 and Phase 3 oncology programs.

Across KRAS, PARP, CDK4/6, BRAF, and EGFR targets, the pattern is the same: the foundational preclinical findings that justified hundreds of millions in clinical investment were unreplicated at the time the programs were initiated. Average foundational evidence score across analyzed programs: 26 out of 100.

Norath would have flagged every one of them.

Run the analysis yourself →

Who uses it

Built for everyone who can't afford to be wrong.

Researchers, PhD programs & lab directors

Know which foundational findings your work depends on are solid before you spend years building on them. Verify the evidence base before committing to a research direction. Run due diligence before every grant application.

Drug development teams

The average Phase 2 oncology trial failure costs $150–300 million. Most programs are initiated on foundational preclinical findings that have never been independently verified. Norath generates the Target Premise Report for every program initiation decision — structured, auditable, and built from the complete replication record of every finding your program depends on. One avoided failure pays for decades of access.

Research funders

Identify where the highest-citation, lowest-reproducibility findings are concentrated. Allocate funding toward verification where it matters most.

Enterprise

Private data integration

Norath's extraction pipeline runs on any structured scientific document — not just published papers. Pharmaceutical and biotech teams can connect their internal preclinical data — electronic lab notebooks, unpublished study reports, IND packages — to generate a unified evidence view that combines proprietary findings with the public replication record.

Example: A biotech has run 40 internal replication attempts on their lead target over three years — none published. That data sits in their ELN system disconnected from the published literature. Norath ingests those internal studies, classifies them using the same pipeline, and overlays them on the public evidence graph. The result: a unified evidence view combining what the field has published with what your labs have actually seen. The most complete target assessment that has ever existed for that program.

Contact for enterprise access and data integration partnerships →

Where we're going

From evidence tool to industry standard.

Each phase builds the foundation the next one requires. The goal is not a useful tool — it's the standard infrastructure the entire industry runs its target decisions through.

Live now

Phase 1

The evidence graph

—97,000+ indexed papers
—Claim-level replication scoring
—Target Premise Reports
—Retrospective failure analysis
—Open Targets + ClinicalTrials.gov integration

Phase 2

Live monitoring

—Evidence alerts on active programs
—Score updates as new papers publish
—Citation graph visualization
—Team collaboration and report history

Phase 3

Predictive intelligence

—Translational prediction models
—Scoring across experimental model hierarchies
—Disease area expansion

Phase 4+

Industry infrastructure

—Enterprise private data overlay
—Regulatory integration

Current coverage

The complete oncology evidence graph.

We index the full preclinical evidence stack across all major solid tumor cancer types — cell line studies, animal models, patient-derived systems, organoids, biomarker studies, and combination therapy research. Every paper. Every finding. Every citation relationship. Indexed, structured, scored, and growing every week.

97,000+

Papers indexed

107,000+

Findings extracted

15

Cancer types

2010–2026

Years covered

270+

Replication signals

60+

Surfaced contradictions

Neuroscience expansion — Alzheimer's and Parkinson's — in development.

Why now

This couldn't have been built two years ago.

Correctly classifying whether a citation represents a genuine replication attempt — versus a passing mention or a building-on — requires understanding the biology, not just the text. That kind of domain-aware extraction at scale wasn't feasible at reasonable cost until large language models arrived.

Norath was built by a cell and molecular biology graduate and data engineer — someone who understood both the science and the infrastructure needed to connect it at scale.

The biological judgment that makes our scores accurate is not bolted on. It is the foundation. Knowing that a replication only counts if another lab ran the same experiment in a comparable system and got the same result. That a supporting citation and a genuine replication are completely different things. That BM2 cells and MDA-MB-231 cells are not the same experimental system even though they share a lineage. That n=3 in a cell culture study means something completely different depending on whether those are biological or technical replicates.
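The replication rule stated above (another lab, the same experiment, a comparable system, the same result) can be written as a small decision function. This is a minimal sketch of that rule only, not Norath's classifier: the `CitingStudy` fields and `CitationKind` labels are hypothetical, and treating a faithful rerun with a different result as a contradiction is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class CitationKind(Enum):
    REPLICATION = "replication"
    CONTRADICTION = "contradiction"      # assumed label for a failed rerun
    NOT_A_REPLICATION = "not a replication"


@dataclass(frozen=True)
class CitingStudy:
    """Hypothetical yes/no judgments about a paper that cites a finding."""
    independent_lab: bool
    same_experiment: bool
    comparable_system: bool
    same_result: bool


def classify(study: CitingStudy) -> CitationKind:
    """Apply the rule: only an independent rerun in a comparable system counts."""
    reran = (study.independent_lab and study.same_experiment
             and study.comparable_system)
    if not reran:
        # Agreement without a rerun is a supporting citation at most.
        return CitationKind.NOT_A_REPLICATION
    if study.same_result:
        return CitationKind.REPLICATION
    return CitationKind.CONTRADICTION


# A study that agrees with the finding but used a different experimental system.
supporting = CitingStudy(independent_lab=True, same_experiment=True,
                         comparable_system=False, same_result=True)
print(classify(supporting).value)
```

The hard part is not this function but producing its four inputs reliably from a paper's text, which is where the biological judgment described above comes in.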

That precision is baked into every extraction and every classification at the model level. A competitor cannot replicate this by pointing a general-purpose AI at PubMed. It required someone with the biology background to know what questions to ask and the engineering background to build the infrastructure to answer them at scale. The corpus, the citation graph, and the private data integrations create compounding data moats. Each new paper ingested, each new citation classified, each new private dataset connected makes the evidence graph richer and harder to replicate from scratch.

The pharmaceutical industry initiates thousands of drug programs every year. Every one of them rests on preclinical findings that were never systematically verified before hundreds of millions were committed. Norath is the infrastructure that changes that — not for one program, not for one company, but as the standard that the entire industry runs its target decisions through.