Inside the $120M Incyte–Genesis deal, the competitive landscape, and where the white space is for a new entrant.
On May 20, 2026, Incyte (Nasdaq: INCY) and Genesis Molecular AI announced a major expansion of their strategic collaboration—one of the first pharma-AI partnerships to feed large-scale foundation model training with a partner’s proprietary experimental data.
Genesis, whose investors include a16z and NVIDIA, builds the GEMS platform (Genesis Exploration of Molecular Space), which includes foundation models for protein-ligand structure and property prediction. The deal creates what Evan Feinberg, Genesis CEO, calls an “industrial-scale flywheel of AI-enabled design-make-test cycles.”
“High-quality proprietary data is among the most valuable inputs for advancing molecular AI. This expanded collaboration will enable both companies and patients to benefit from an industrial-scale flywheel.”
— Evan Feinberg, Ph.D., Founder & CEO, Genesis Molecular AI
The deal builds on an initial collaboration from February 2025. Incyte will share proprietary experimental data with Genesis to enhance the models—a critical differentiator. Incyte retains exclusive rights to develop and commercialize all resulting compounds, plus receives recurring research funding for compute workloads. Additional programs beyond the initial five could yield “several billion dollars” in additional milestone payments.
The AI drug discovery market is at an inflection point. Multiple data sources converge on a consistent story: rapid growth from a relatively small base.
Global pharma R&D deals hit $86.7B in 2025—up 49% YoY—with AI driving a shift toward fewer, larger, more targeted partnerships averaging $1.16B each (IQVIA).
The Incyte–Genesis deal joins a cascade of nine- and ten-figure AI pharma partnerships. Companies are concentrating capital on AI-driven platforms that appear to offer a better probability of clinical success, rather than spreading bets across many smaller collaborations.
| Partnership | Value | Year | Focus |
|---|---|---|---|
| Sanofi × Exscientia | $5.2B | 2022 | Oncology, immunology (15 targets) |
| Isomorphic × Eli Lilly | $1.7B | 2024 | Small-molecule discovery (multi-target) |
| Isomorphic × Novartis | $1.2B | 2024 | Small-molecule discovery (3 targets) |
| NVIDIA × Eli Lilly | $1.0B | 2025 | Next-gen AI research lab |
| AstraZeneca × CSPC Pharma | $5.3B | 2025 | AI platform-driven pipeline |
| Incyte × Genesis | $120M+ ($1B milestones) | 2026 | Foundation model + proprietary data flywheel |
| Almirall × Absci | $650M (milestones) | 2026 | AI-designed biologics for dermatology |
| Company | Approach | Wet Lab? | Key Milestones |
|---|---|---|---|
| Recursion (Public / RXRX) | Phenomics + computer vision foundation models; merged with Exscientia (2024) | Yes: millions of experiments/week, NVIDIA BioHive-2 supercomputer | $1.6B accumulated R&D spend; most comprehensive AI drug discovery stack |
| Isomorphic Labs (Alphabet / Private) | AlphaFold-derived structure prediction; small-molecule design | Minimal: primarily computational, partners provide validation | $2.7B raised (Series B from Thrive Capital); $3B in pharma deals with Lilly + Novartis + J&J |
| Insilico Medicine (Private) | Generative AI (GANs); Pharma.AI platform; end-to-end from target to clinic | Yes: LifeStar 2 automated lab; MMAI Gym for Science | Rentosertib: first AI-designed drug to reach Phase IIa; 30 preclinical candidates at 12-18 month pace |
| Absci (Public / ABSI) | Generative AI + synthetic biology for de novo antibody design | Yes: screens billions of cells/week; AI-to-validated-candidate in 6 weeks | Deals with AstraZeneca, Almirall ($650M), AMD ($20M strategic); ABS-201 in Phase 2 |
| Genesis Molecular AI (Private) | GEMS foundation models for protein-ligand prediction; design-make-test loops | Via partner (Incyte data) | $120M Incyte deal; backed by a16z, NVIDIA |
| ChemLex (Private) | Self-driving chemistry lab; 24/7 autonomous synthesis | Yes: core differentiator is physical automation | $45M raised (Dec 2025); 70+ customers including 6 of top 10 pharma |
| Chan Zuckerberg Biohub (Non-Profit) | ESM protein world models; open-source AI for protein biology | Partner labs provide validation | ESM generation 4 models launched May 2026; validated in cancer + immune targets |
Sanofi, Novartis, Bayer, GSK, Eli Lilly, and AstraZeneca have all committed to deep AI integration. Lilly’s strategy is emblematic: a $1B lab partnership with NVIDIA, a $1.7B Isomorphic deal, and a $350M Innovent collaboration—all within 18 months.
Google/DeepMind (via Isomorphic and AlphaFold), Microsoft (Novartis alliance, BioGPT), NVIDIA (BioNeMo platform, compute partnerships), and AWS (cloud lab infrastructure) are all building the picks-and-shovels layer. The CZ Biohub’s open-source ESM4 models, released in late May 2026, could democratize protein understanding in the same way AlphaFold did for structure prediction.
This is where the real startup opportunity lives. The biggest lesson from the last decade of AI drug discovery is stark: computational predictions without physical validation are insufficient.
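AI models are only as good as the data they train on. The companies winning the biggest deals (Genesis, Recursion, Absci) share one trait: they train on proprietary experimental data at scale, generated in-house or, in Genesis’ case, supplied by a partner. This creates a compounding advantage: each design-make-test cycle yields new data, the data improves the models, and the improved models make the next cycle more productive.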
This is what Genesis calls “an industrial-scale flywheel of AI-enabled design-make-test cycles.” Companies without wet lab capabilities can’t close this loop.
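The loop can be illustrated with a toy simulation. This is a minimal sketch under invented assumptions, not a description of Genesis' GEMS models: the hidden `true_potency` function stands in for a wet lab assay, the "model" is a nearest-neighbour lookup, and every name and number is hypothetical.

```python
import random

def true_potency(x: float) -> float:
    """Hypothetical ground truth that a wet lab assay would measure (peak at x = 0.7)."""
    return 1.0 - (x - 0.7) ** 2

def predict(x: float, data: list[tuple[float, float]]) -> float:
    """Toy 'model': return the measured value of the nearest tested candidate."""
    nearest = min(data, key=lambda d: abs(d[0] - x))
    return nearest[1]

def run_flywheel(cycles: int, batch: int, seed: int = 0) -> list[float]:
    """Run design-make-test cycles; return the best measured potency after each cycle."""
    rng = random.Random(seed)
    # Start from two measured compounds (the initial proprietary data set).
    data = [(x, true_potency(x)) for x in (0.0, 1.0)]
    best_per_cycle = []
    for _ in range(cycles):
        # Design: propose candidates and rank them with the current model.
        candidates = [rng.random() for _ in range(50)]
        ranked = sorted(candidates, key=lambda x: predict(x, data), reverse=True)
        # Make + test: only the top `batch` candidates go to the simulated lab.
        for x in ranked[:batch]:
            data.append((x, true_potency(x)))
        # Learn: the new measurements are now part of the model's data.
        best_per_cycle.append(max(y for _, y in data))
    return best_per_cycle

history = run_flywheel(cycles=5, batch=3)
print(history)
```

Because measurements only accumulate, the best measured value never decreases from one cycle to the next. How quickly it improves depends on the quality of the model and the throughput of the lab, which is why a company that controls neither half of the loop cannot run it.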
Recursion collaborates with HighRes Biosolutions on self-driving, high-throughput labs using robotic perception, digital twins, and natural language-driven lab orchestration. ChemLex runs a 24/7 autonomous chemistry system that compresses months of synthesis into days. Emerald Cloud Lab pioneered the cloud-accessible lab concept with Carnegie Mellon.
The convergence of robotics, AI, and miniaturized biology is pushing down the cost of running a wet lab experiment while raising the value of the data each experiment produces. This is the core startup opportunity.
| Opportunity | Gap | Why Now |
|---|---|---|
| Lab-as-a-Service for AI Biotechs | Most AI-native companies (Genesis, Isomorphic) lack their own wet lab. They depend on partners or CROs that aren’t designed for rapid iteration. | AI-first biotechs need 10x faster turnaround than traditional CROs provide |
| Automated Assay Development | Setting up new biological assays is still artisanal. Each target requires custom protocols, cell lines, and readouts. | LLMs can now read protocols and suggest optimizations; robotics costs have dropped 60% in 5 years |
| Data-Quality-as-a-Service | Pharma companies have petabytes of experimental data but it’s siloed, inconsistently formatted, and hard to use for model training. | Foundation models require standardized, high-quality training data—there is no good middleware for this |
| Niche Therapeutic Area Vertical | Big players target cancer and immunology. Rare diseases, neglected tropical diseases, and agricultural biotech are underserved. | Smaller data requirements for rare diseases; regulatory incentives (orphan drug status); less competition |
| Biologics / ADC Design Platform | Most AI drug discovery focuses on small molecules. Antibody-drug conjugates (ADCs) and biologics are a $300B+ market with limited AI tooling. | Absci proved the model works; the ADC market is exploding (10+ new approvals since 2023) |
| AI-Native Contract Research | Traditional CROs (Covance, Charles River) are slow to adopt AI. A born-digital CRO could compress timelines 5-10x. | The design-make-test cycle is the bottleneck; whoever runs it fastest wins |
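To make the Data-Quality-as-a-Service row concrete, here is a minimal sketch of the kind of normalization such middleware might perform. The record fields (`compound_id`, `value`, `unit`), the output field `ic50_nm`, and the choice of nanomolar as the canonical unit are assumptions for illustration, not any vendor's schema.

```python
# Conversion factors to nanomolar (nM), the assumed canonical unit.
TO_NANOMOLAR = {"pm": 1e-3, "nm": 1.0, "um": 1e3, "mm": 1e6, "m": 1e9}

def normalize_record(record: dict) -> dict:
    """Return a cleaned copy of an assay record with the concentration in nM."""
    # Accept 'uM', 'µM' (micro sign), or 'μM' (Greek mu) for micromolar.
    unit = record["unit"].strip().lower().replace("µ", "u").replace("μ", "u")
    if unit not in TO_NANOMOLAR:
        raise ValueError(f"unknown concentration unit: {record['unit']!r}")
    return {
        "compound_id": record["compound_id"].strip().upper(),
        "ic50_nm": float(record["value"]) * TO_NANOMOLAR[unit],
    }

def normalize_all(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split records into (clean, rejected) instead of silently dropping bad rows."""
    clean, rejected = [], []
    for record in records:
        try:
            clean.append(normalize_record(record))
        except (KeyError, ValueError, TypeError, AttributeError):
            rejected.append(record)
    return clean, rejected

raw = [
    {"compound_id": " abc-1 ", "value": "0.5", "unit": "uM"},
    {"compound_id": "abc-2", "value": "n/a", "unit": "nM"},
]
clean, rejected = normalize_all(raw)
print(len(clean), "clean,", len(rejected), "rejected")
```

Keeping rejected rows rather than discarding them is a deliberate choice in this sketch: for model training, knowing which records failed validation is often as useful as the clean set.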
Model A (Platform Licensing): Build AI models, license them to pharma partners. Revenue = upfront payments + milestones + royalties. Pros: Capital-efficient, scales well, pharma absorbs clinical risk. Cons: You don’t own the drugs; value capture is capped by royalty rates (typically 1-5%).
Model B (AI-first CRO): Run physical experiments on behalf of clients using your AI + automated lab stack. Revenue = fee-for-service + data licensing. Pros: Recurring revenue from day one; you accumulate proprietary data from every experiment. Cons: Capital-intensive (lab buildout); operational complexity.
Model C (Hybrid): Use your platform for partner-funded programs (generates cash) while advancing your own drug candidates (captures upside). Pros: Partners de-risk early operations; internal pipeline captures full drug value. Cons: Requires significantly more capital; execution risk on two fronts.
Model D (Data Flywheel): Build and operate automated wet labs purpose-built to generate high-quality biological data. Sell data and trained models, not drugs. Pros: Avoids clinical trial risk entirely; every customer’s experiments make your models better. Cons: Harder to command premium pricing; depends on network effects materializing.
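The royalty cap in the licensing model (upfront payments + milestones + royalties) can be made concrete with a back-of-the-envelope calculation. The sketch below is a deliberate simplification with hypothetical inputs: it assumes the upfront is certain, treats all milestones and royalties as paid only on success, and ignores discounting.

```python
def licensing_deal_value(upfront, milestones, royalty_rate, cumulative_sales, p_success):
    """Risk-adjusted value of a licensing deal, in the same units as the inputs.

    Simplifying assumptions: the upfront is certain, while all milestones and
    royalties are paid only if the program succeeds (probability p_success).
    No discounting for time.
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be between 0 and 1")
    success_payout = milestones + royalty_rate * cumulative_sales
    return upfront + p_success * success_payout

# Hypothetical inputs in $M; none of these figures come from a real deal.
value = licensing_deal_value(
    upfront=20, milestones=300, royalty_rate=0.03,
    cumulative_sales=5000, p_success=0.10,
)
print(f"Risk-adjusted deal value: ${value:.0f}M")
```

With these made-up inputs, royalties contribute 0.10 × 0.03 × 5000 = $15M of the $65M total, which shows how a low single-digit royalty rate limits the share of a drug's sales that flows back to the platform company.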
Start with Model B or D. The AI-first CRO or Data Flywheel model lets you generate revenue and proprietary data from day one, without requiring $100M+ to fund an internal drug pipeline. Once you have traction, you can selectively advance internal programs (evolving toward Model C) using the data advantage you’ve built.
ChemLex reached 70+ customers including 6 of the top 10 pharma companies in just 3.5 years on $45M. That velocity is instructive.
The clinical valley of death is real. As of late 2024, no end-to-end AI-designed drug has demonstrated clear Phase II efficacy. Insilico’s Rentosertib showed early signs, but the field is still awaiting a definitive proof point. This matters even for non-pipeline companies because pharma’s willingness to pay premium prices depends on demonstrated clinical impact.
Build an automated wet lab optimized for AI-speed iteration. Serve AI-first drug discovery companies (Genesis, Isomorphic, smaller biotechs) that have great models but no physical lab. Also serve mid-size pharma companies that want to experiment with AI-driven discovery without building internal capability.
| Phase | Timeline | Focus | Capital |
|---|---|---|---|
| Seed | Months 0-6 | Pick a niche modality (e.g., peptide therapeutics or ADCs). Build an MVP automated assay pipeline. Sign 2-3 design partners. | $3-5M |
| Series A | Months 6-18 | Scale the lab. Build the data layer (structured experimental results as a product). Hit 10+ paying customers. | $15-25M |
| Series B | Months 18-36 | Launch proprietary AI models trained on your accumulated data. Optionally start 1-2 internal drug programs using your own data advantage. | $40-80M |
| Growth | Year 3+ | Platform licensing + internal pipeline + CRO flywheel = the Absci/Recursion playbook, but starting from revenue instead of from $1.6B in burn. | Revenue-funded + opportunistic raises |
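Summing the ranges in the table gives the cumulative capital the roadmap implies before the revenue-funded Growth phase. The snippet below only adds up the figures already listed; the function name is illustrative.

```python
# Funding ranges in $M, copied from the roadmap table above.
ROUNDS = [("Seed", 3, 5), ("Series A", 15, 25), ("Series B", 40, 80)]

def cumulative_capital(rounds):
    """Return [(round_name, cumulative_low, cumulative_high)] in $M."""
    low = high = 0
    totals = []
    for name, round_low, round_high in rounds:
        low += round_low
        high += round_high
        totals.append((name, low, high))
    return totals

for name, low, high in cumulative_capital(ROUNDS):
    print(f"{name}: ${low}M to ${high}M raised to date")
```

Through Series B, the plan implies roughly $58M to $110M raised in total, spread over three years and staged against customer milestones.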
Candidate locations are San Francisco / South San Francisco (AI talent + biotech ecosystem density), Boston / Cambridge (pharma proximity + Kendall Square network), and Singapore (ChemLex’s playbook: government subsidies, access to Asian pharma, growing hub). Salt Lake City (Recursion’s BioHive ecosystem) is an emerging dark horse with lower costs.
The insight that makes this work: the limiting reagent in AI drug discovery is not compute or algorithms. It’s high-quality, standardized, rapid-turnaround experimental data. If you own the fastest path from AI prediction to physical validation, you own the bottleneck. Every AI company is a potential customer.