Docking

The best docking in the field. Now in plain English, in your IDE, or in your browser.

Pose prediction and virtual screening that beat AutoDock Vina, Glide, DiffDock, and AlphaFold 3 on the benchmarks that matter — and keep working on novel targets where most tools fall apart.

87.5% Hit enrichment in top 1% (DEKOIS 2.0)
>90% PoseBusters success rate
50B Molecules screenable in days
10x Fewer wet-lab experiments vs. Vina

Three ways to use it

Pick the interface that fits your workflow

The same docking engine, three on-ramps. From a chat with Balto, to a Python notebook, to a full visual workbench.

Conversational

Balto AI

Load a protein, find pockets, and dock small molecules in plain English. The world's best docking, finally made easy — no command line, no scripts.

  • 30 dockings/month free
  • $0.10 per additional dock
  • Visual pose & score analysis
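The pricing above is simple arithmetic, sketched here under two assumptions the page does not state: the free allowance resets each month and unused dockings do not roll over. The function name is hypothetical, and amounts are kept in integer cents to avoid floating-point rounding.

```python
FREE_DOCKS_PER_MONTH = 30   # from the pricing list above
CENTS_PER_EXTRA_DOCK = 10   # $0.10 per additional dock


def monthly_cost_cents(docks_run: int) -> int:
    """Return one month's docking cost in cents (hypothetical helper)."""
    if docks_run < 0:
        raise ValueError("docks_run must be non-negative")
    # Only dockings beyond the free allowance are billed.
    billable = max(0, docks_run - FREE_DOCKS_PER_MONTH)
    return billable * CENTS_PER_EXTRA_DOCK


if __name__ == "__main__":
    for n in (10, 30, 130):
        print(f"{n} docks -> ${monthly_cost_cents(n) / 100:.2f}")
```

For example, 130 dockings in a month would mean 100 billable dockings, or $10.00.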
Try Balto

Visual workbench

DO Studio

A unified UI for medicinal chemists. Run docking alongside FEP, ADMET, and structure prep — without stitching together IT, compute, and licensed software.

  • Browser-based, zero install
  • Integrated data & compute
  • Built for medicinal chemists
Request a demo

Code-first

Deep Origin API

A Python client for full programmatic control. Pose prediction and virtual screening at billion-molecule scale, no DevOps required.

  • Python client with 50+ tools
  • Active learning for billion-scale screens
  • Production docs & tutorials
Get API access

The science

Benchmarks where we top the field

Three benchmarks the community treats as the bar — pose accuracy, virtual screen enrichment, and generalization to novel targets.

Pose prediction

PDBbind core set

Percentage of the 285 protein-ligand complexes for which the predicted pose falls within 2 Å RMSD of the experimental pose.

  • Deep Origin: 69%
  • AutoDock Vina: 50%
  • DiffDock: 39%
  • DOCK 6: 38%

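The pose-prediction metric above can be sketched in a few lines. This is a minimal illustration, not the benchmark's actual evaluation code: it assumes both poses list the same ligand heavy atoms in the same order, computes RMSD in the receptor frame without any realignment, and ignores ligand symmetry. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(predicted: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """Root-mean-square deviation in Å between two poses with matched atoms."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (p[axis] - r[axis]) ** 2
        for p, r in zip(predicted, reference)
        for axis in range(3)
    )
    return math.sqrt(squared / len(predicted))


def success_rate(rmsds: Sequence[float], cutoff: float = 2.0) -> float:
    """Percentage of complexes whose RMSD is at or below the cutoff."""
    if not rmsds:
        raise ValueError("need at least one RMSD value")
    return 100.0 * sum(1 for value in rmsds if value <= cutoff) / len(rmsds)


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    predicted = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]  # shifted 1 Å along y
    print(rmsd(predicted, reference))
    print(success_rate([0.5, 1.9, 2.5, 4.0]))
```

On a 285-complex set, a 69% success rate corresponds to roughly 197 complexes under the 2 Å cutoff.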
Virtual screening

DEKOIS 2.0 — top 1% enrichment

Average % of true binders captured in the top 1% of ranked molecules across 81 targets.

  • Deep Origin: 87.5%
  • SCORCH2: 75%
  • IGN: 25%
  • Glide SP: 16.67%
  • RFScore-VS: 12.5%
  • AutoDock Vina: 8.33%

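As a rough sketch of the metric as described above (the percentage of true binders that land in the top 1% of the ranked list), the per-target calculation might look like this. The function name, the rounding of the top-1% cutoff, and the convention that higher scores rank first are assumptions for illustration, not the benchmark's published protocol.

```python
from typing import Sequence


def top_fraction_capture(
    scores: Sequence[float],
    is_binder: Sequence[bool],
    fraction: float = 0.01,
    higher_is_better: bool = True,
) -> float:
    """Percentage of true binders found in the top `fraction` of the ranking."""
    if not scores or len(scores) != len(is_binder):
        raise ValueError("scores and labels must be non-empty and equal length")
    total_binders = sum(1 for flag in is_binder if flag)
    if total_binders == 0:
        raise ValueError("target has no true binders")
    # Assumption: the cutoff is rounded to the nearest whole molecule, minimum 1.
    top_k = max(1, round(fraction * len(scores)))
    ranked = sorted(
        range(len(scores)), key=lambda i: scores[i], reverse=higher_is_better
    )
    captured = sum(1 for i in ranked[:top_k] if is_binder[i])
    return 100.0 * captured / total_binders


if __name__ == "__main__":
    # Toy target: 200 molecules, molecule 0 ranks first, 4 true binders.
    scores = [200.0 - i for i in range(200)]
    binders = [i in {0, 1, 50, 150} for i in range(200)]
    print(top_fraction_capture(scores, binders))
```

A benchmark-level figure would then be the average of this value over all 81 targets.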
Generalization

Runs N' Poses — novel targets (0–20% Tanimoto)

The space where novel programs live. Performance on systems with low chemical similarity to the training set — where most ML docking models collapse.

  • DO Dock: 55%
  • AutoDock Vina: 22%
  • AlphaFold 3: 18%
  • Boltz-1: 9%
  • Chai: 5%

All benchmarks use train/test splits filtered at 30% protein sequence identity and 0.5 Tanimoto similarity (2048-bit RDKit fingerprints). Without these filters, published ML docking numbers tend to inflate dramatically — DiffDock's 38% drops to 15%, for example.
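The ligand half of that filter can be illustrated with a short sketch. Fingerprints are represented here as sets of on-bit indices written by hand; in practice they would come from a fingerprinting toolkit such as RDKit. The function names are hypothetical, the strict "below 0.5" comparison is an assumption, and the 30% protein sequence identity filter is not shown.

```python
from typing import Dict, FrozenSet, Iterable, List

Fingerprint = FrozenSet[int]  # indices of the bits that are set


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared on-bits divided by total distinct on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def filter_test_ligands(
    test_fps: Dict[str, Fingerprint],
    train_fps: Iterable[Fingerprint],
    max_similarity: float = 0.5,
) -> List[str]:
    """Keep test ligands whose nearest training ligand is below the cutoff."""
    train = list(train_fps)
    kept = []
    for name, fp in test_fps.items():
        nearest = max((tanimoto(fp, t) for t in train), default=0.0)
        if nearest < max_similarity:
            kept.append(name)
    return kept


if __name__ == "__main__":
    train = [frozenset({1, 2, 3, 4})]
    test = {
        "close_analog": frozenset({1, 2, 3, 5}),   # Tanimoto 3/5 = 0.6
        "novel_scaffold": frozenset({1, 9, 10, 11}),  # Tanimoto 1/7
    }
    print(filter_test_ligands(test, train))
```

Only the ligand that shares little with the training set survives the filter, which is why unfiltered benchmarks can overstate performance on genuinely new chemistry.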

Prospective validation

Benchmarks are easy. Real targets are the test.

Where docking actually has to work — hard targets, prior screens that failed, and binders that other tools missed.

CD73

Immuno-oncology target where prior screens had failed

~80B compounds screened. Of the 160 compounds tested, 48 (30%) were active below 80 µM and 9 (about 6%) below 10 µM, with broad chemotype diversity.

JAK2 Pseudokinase

Hard target where Glide SP/XP and GOLD ChemScore struggled

Recovered filgotinib (an approved drug known to bind the JAK2 PK domain) and other true binders missed by other tools.

DPP4

2,830 experimentally validated binders hidden in 100,000 molecules

Every molecule in the top 170 was a true binder. Zero false positives. 363-fold enrichment.
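For readers who want to reproduce enrichment arithmetic, here is a minimal sketch of the commonly used enrichment factor: the hit rate in the selected subset divided by the hit rate in the whole library. The function name is hypothetical, and the page does not say which definition or baseline its own figure uses.

```python
def enrichment_factor(
    hits_in_selection: int,
    selection_size: int,
    total_hits: int,
    library_size: int,
) -> float:
    """Hit rate in the selection divided by the hit rate in the full library."""
    if min(selection_size, total_hits, library_size) <= 0:
        raise ValueError("sizes and total_hits must be positive")
    selected_rate = hits_in_selection / selection_size
    base_rate = total_hits / library_size
    return selected_rate / base_rate


if __name__ == "__main__":
    # DPP4 numbers from the text: 170 of the top 170 are binders,
    # with 2,830 binders in a 100,000-molecule library.
    print(f"{enrichment_factor(170, 170, 2830, 100_000):.1f}")
```

With the DPP4 numbers above, this definition gives roughly 35-fold, so the 363-fold figure presumably rests on a different definition or baseline that is not specified here.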

KRAS G12D

Notoriously undruggable target

13 of 16 known binders surfaced in the top 20, with Mirati's MRTX1133 ranked at the top.

How it's built

What makes the docking work

Physics-informed ML, not a pure black box

Our docking blends physics-based scoring with ML where ML genuinely helps. The result is a model that holds up on novel chemistry — not just the chemistry it was trained on.

Rigorously held-out benchmarks

We split train and test sets at 30% protein sequence identity and 0.5 Tanimoto similarity. Many published numbers inflate when these guardrails aren't applied: DiffDock's 38%, for example, drops to 15% under the same filter.

Built for billion-molecule libraries

Active learning lets us screen 50 billion unenumerated Enamine REAL Space molecules in days. Trillion-molecule libraries are next.
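The general idea of active learning for screening can be shown with a toy loop: dock a small batch, fit a cheap surrogate to the results, and use the surrogate to choose which molecules to dock next. This is a generic sketch, not Deep Origin's algorithm. Everything in it is an assumption for illustration: molecules are reduced to a single made-up descriptor, `dock` is a stand-in for an expensive docking call, and the surrogate is a simple distance-weighted nearest-neighbor average.

```python
import random
from typing import Dict, List


def dock(x: float) -> float:
    """Stand-in for an expensive docking call; higher is better (toy function)."""
    return -(x - 0.7) ** 2


def predict(x: float, docked: Dict[float, float]) -> float:
    """Cheap surrogate: distance-weighted mean of the 3 nearest docked scores."""
    nearest = sorted(docked.items(), key=lambda item: abs(item[0] - x))[:3]
    weights = [1.0 / (abs(xi - x) + 1e-9) for xi, _ in nearest]
    return sum(w * s for w, (_, s) in zip(weights, nearest)) / sum(weights)


def active_learning_screen(
    library: List[float],
    seed_size: int = 20,
    batch_size: int = 20,
    rounds: int = 4,
    rng_seed: int = 0,
) -> Dict[float, float]:
    """Dock a random seed batch, then greedily dock the top-predicted molecules."""
    rng = random.Random(rng_seed)
    docked = {x: dock(x) for x in rng.sample(library, seed_size)}
    for _ in range(rounds):
        pool = [x for x in library if x not in docked]
        # Rank the undocked pool with the surrogate instead of docking it all.
        pool.sort(key=lambda x: predict(x, docked), reverse=True)
        for x in pool[:batch_size]:
            docked[x] = dock(x)
    return docked


if __name__ == "__main__":
    library = [i / 1999 for i in range(2000)]
    results = active_learning_screen(library)
    best_x = max(results, key=results.get)
    print(f"docked {len(results)} of {len(library)}; best descriptor {best_x:.3f}")
```

The point of the design is that only a small fraction of the library is ever docked (100 of 2,000 molecules in this toy run), while the surrogate steers each new batch toward the regions that scored well so far.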

End-to-end with FEP

Docking poses feed directly into our ABFE pipeline for binding affinity refinement — no manual handoff, no pose curation, no separate tools to license.

Ready to put it to work?

The fastest way in is Balto — load a protein, dock a ligand, see the score in a single conversation. For programmatic or large-scale work, talk to us about API access or DO Studio.