The best docking in the field. Now in plain English, in your IDE, or in your browser.
Pose prediction and virtual screening that beat AutoDock Vina, Glide, DiffDock, and AlphaFold 3 on the benchmarks that matter — and keep working on novel targets where most tools fall apart.
Pick the interface that fits your workflow
The same docking engine, three on-ramps. From a chat with Balto, to a Python notebook, to a full visual workbench.
Balto AI
Load a protein, find pockets, and dock small molecules in plain English. The world's best docking, finally made easy — no command line, no scripts.
- 30 dockings/month free
- $0.10 per additional dock
- Visual pose & score analysis
DO Studio
A unified UI for medicinal chemists. Run docking alongside FEP, ADMET, and structure prep — without stitching together IT, compute, and licensed software.
- Browser-based, zero install
- Integrated data & compute
- Built for medicinal chemists
Deep Origin API
A Python client for full programmatic control. Pose prediction and virtual screening at billion-molecule scale, no DevOps required.
- Python client with 50+ tools
- Active learning for billion-scale screens
- Production docs & tutorials
Benchmarks where we top the field
Three benchmarks the community treats as the bar — pose accuracy, virtual screen enrichment, and generalization to novel targets.
PDBbind core set
Percentage of the 285 protein-ligand complexes with a predicted pose within 2 Å RMSD of the experimental pose.
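For readers who want to see how a pose-accuracy number like this is typically computed, here is a minimal sketch. It assumes matched atom ordering, no symmetry correction, no superposition, and an inclusive 2 Å cutoff; the function names and toy coordinates are illustrative and are not taken from the Deep Origin pipeline.

```python
import math

def rmsd(pred, ref):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    sq = sum((p - r) ** 2 for a, b in zip(pred, ref) for p, r in zip(a, b))
    return math.sqrt(sq / len(pred))

def success_rate(pairs, cutoff=2.0):
    """Percentage of (predicted, reference) pose pairs with RMSD <= cutoff (in Å)."""
    hits = sum(1 for pred, ref in pairs if rmsd(pred, ref) <= cutoff)
    return 100.0 * hits / len(pairs)

# Toy data (hypothetical): two-atom "ligands" compared against one reference pose.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
good_pose = [(0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]   # every atom shifted 0.5 Å along x
bad_pose = [(3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]    # every atom shifted 3.0 Å along x

print(success_rate([(good_pose, reference), (bad_pose, reference)]))
```

In this toy case one of the two poses clears the cutoff, so the success rate is 50%. A production evaluation would usually restrict the calculation to heavy atoms and account for symmetric ligands.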
DEKOIS 2.0 — top 1% enrichment
Average % of true binders captured in the top 1% of ranked molecules across 81 targets.
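The metric above can be sketched in a few lines. This is an illustrative version only: it assumes each target comes as a list of true/false binder labels already sorted best score first, and it rounds the top 1% up to a whole number of molecules. The helper names and toy data are hypothetical.

```python
import math

def recall_at_top_fraction(ranked_labels, fraction=0.01):
    """Percent of all true binders found in the top `fraction` of a ranked list.

    ranked_labels: booleans ordered best score first (True = known binder).
    """
    total_binders = sum(ranked_labels)
    if total_binders == 0:
        raise ValueError("target has no known binders")
    n_top = max(1, math.ceil(len(ranked_labels) * fraction))
    return 100.0 * sum(ranked_labels[:n_top]) / total_binders

def mean_recall(targets, fraction=0.01):
    """Average the per-target recall over all targets."""
    return sum(recall_at_top_fraction(t, fraction) for t in targets) / len(targets)

# Toy data (hypothetical): 200 ranked molecules per target, so the top 1% is 2 molecules.
target_a = [True, True] + [False] * 196 + [True, True]   # 2 of 4 binders in the top 2
target_b = [True, False] + [False] * 197 + [True]        # 1 of 2 binders in the top 2

print(mean_recall([target_a, target_b]))
```

Averaging per target, rather than pooling all molecules, keeps a single easy target from dominating the headline number.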
Runs N' Poses — novel targets (0–20% Tanimoto)
The space where novel programs live. Performance on systems with low chemical similarity to the training set — where most ML docking models collapse.
All benchmarks use train/test splits filtered at 30% protein sequence identity and 0.5 Tanimoto similarity (2048-bit RDKit fingerprints). Without these filters, published ML docking numbers tend to be dramatically inflated: DiffDock's 38%, for example, drops to 15% once the filters are applied.
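As a rough illustration of the ligand-similarity half of that filter, the sketch below drops any test ligand whose nearest training ligand reaches the 0.5 Tanimoto cutoff. It is a simplification under stated assumptions: fingerprints are represented as plain sets of on-bit indices standing in for the 2048-bit RDKit fingerprints, the cutoff is treated as exclusive, and the protein sequence-identity filter is not shown. All names and data are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 0.0
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)

def filter_test_set(train_fps, test_fps, max_similarity=0.5):
    """Keep test ligands whose highest similarity to any training ligand is below the cutoff."""
    kept = {}
    for name, fp in test_fps.items():
        nearest = max((tanimoto(fp, t) for t in train_fps), default=0.0)
        if nearest < max_similarity:
            kept[name] = nearest
    return kept

# Toy fingerprints (hypothetical): each set lists the indices of the bits that are on.
train = [{1, 2, 3, 4}, {10, 11, 12}]
test = {
    "near_duplicate": {1, 2, 3, 5},   # 3 shared bits / 5 in the union = 0.6, removed
    "novel": {1, 20, 21, 22},         # 1 shared bit / 7 in the union, kept
}

print(filter_test_set(train, test))
```

Only the "novel" ligand survives here, which is the point of the filter: test performance is then measured on chemistry the model has not effectively seen.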
Benchmarks are easy. Real targets are the test.
Where docking actually has to work — hard targets, prior screens that failed, and binders that other tools missed.
CD73
Immuno-oncology target where prior screens had failed
~80B compounds screened. Of 160 compounds tested, 48 were active below 80 µM and 9 below 10 µM, with broad chemotype diversity.
JAK2 Pseudokinase
Hard target where Glide SP/XP and GOLD ChemScore struggled
Recovered filgotinib (an approved drug known to bind the JAK2 PK domain) and other true binders missed by other tools.
DPP4
2,830 experimentally validated binders hidden in 100,000 molecules
Every molecule in the top 170 was a true binder. Zero false positives. 363-fold enrichment.
KRAS G12D
Notoriously undruggable target
13 of 16 known binders surfaced in the top 20, with Mirati's MRTX1133 ranked at the top.
What makes the docking work
Physics-informed ML, not a pure black box
Our docking blends physics-based scoring with ML where ML genuinely helps. The result is a model that holds up on novel chemistry — not just the chemistry it was trained on.
Rigorously held-out benchmarks
We split train and test sets at 30% protein sequence identity and 0.5 Tanimoto similarity. Many published numbers inflate when these guardrails aren't applied; DiffDock's 38%, for example, drops to 15% under the same filter.
Built for billion-molecule libraries
Active learning lets us screen 50 billion unenumerated Enamine REAL Space molecules in days. Trillion-molecule libraries are next.
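The general idea behind active-learning screens can be shown with a toy loop: dock a small sample, fit a cheap surrogate on the results, and spend the next round of docking only on the molecules the surrogate ranks highest. The sketch below is a generic illustration, not Deep Origin's implementation. The `dock` function, the one-number "descriptor" per molecule, and the 1-nearest-neighbor surrogate are all stand-ins chosen to keep the example short.

```python
def dock(x):
    """Stand-in for an expensive docking call: higher is better, peak at x = 637."""
    return -abs(x - 637)

def predict(x, labeled):
    """1-nearest-neighbor surrogate: reuse the score of the closest docked molecule."""
    nearest = min(labeled, key=lambda seen: abs(seen - x))
    return labeled[nearest]

def active_learning_screen(library, seeds, batch_size, rounds):
    """Dock the seeds, then repeatedly dock the batch the surrogate ranks highest."""
    labeled = {x: dock(x) for x in seeds}          # initial docked sample
    for _ in range(rounds):
        pool = [x for x in library if x not in labeled]
        if not pool:
            break
        # Rank undocked molecules by the surrogate and dock only the most promising.
        pool.sort(key=lambda x: predict(x, labeled), reverse=True)
        for x in pool[:batch_size]:
            labeled[x] = dock(x)
    return labeled

library = list(range(1000))                        # toy descriptors, one per molecule
docked = active_learning_screen(library, seeds=range(0, 1000, 100),
                                batch_size=20, rounds=4)
best = max(docked, key=docked.get)
print(len(docked), best, docked[best])
```

In this toy run the loop docks 90 of the 1,000 molecules and still reaches the best-scoring one. Real screens replace the surrogate with a trained model and usually mix in some exploration so the search does not lock onto a single region too early.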
End-to-end with FEP
Docking poses feed directly into our ABFE pipeline for binding affinity refinement — no manual handoff, no pose curation, no separate tools to license.
Ready to put it to work?
The fastest way in is Balto — load a protein, dock a ligand, see the score in a single conversation. For programmatic or large-scale work, talk to us about API access or DO Studio.