I Just Deployed AutoDock Vina to Google Cloud Run — From a Single Desktop
- Mansour Ansari

- Jan 6
Happy New Year 2026

After six weeks of focused work, I successfully deployed a production-grade molecular docking pipeline (AutoDock Vina + RDKit/OpenMM stack) to Google Cloud Run, backed by Google Cloud Storage and exposed through a clean API. This matters for anyone who needs to dock compounds at scale without GPUs, enterprise hardware, Kubernetes, or a DevOps team.
Yes, this was built without:
A DevOps team
Kubernetes
Persistent VMs
GPUs
Enterprise hardware
The entire system was developed on a Windows 10 Pro desktop running Ubuntu under WSL2: a modest quad-core Dell with 16 GB RAM, an old PC in my back office.
Here's why this matters to anyone reading this:
Why Molecular Docking Workloads Are a Poor Fit for Traditional Infrastructure
1. Bursty — not continuous
Docking workloads do not behave like web servers or databases.
You don’t dock molecules 24/7 at a steady rate.
You submit large batches, wait for results, then go idle.
Activity comes in waves:
Prepare proteins
Submit thousands of ligands
Wait
Analyze results
Pause or change parameters
Repeat
On a traditional VM:
You pay even when nothing is running
Idle CPUs and RAM burn money
Scaling up or down is manual or slow
Cloud Run excels here because:
It spins up containers only when a job arrives
Scales to many instances during a burst
Scales down to zero cost when idle
➡️ Docking is episodic by nature; Cloud Run matches that rhythm.
2. Compute-heavy — but not GPU-dependent
Docking is computationally intense, but in a very specific way:
Heavy floating-point math
Large conformational search spaces
Repeated scoring and energy evaluation
CPU-bound, not GPU-bound (for most Vina workflows)
This means:
Each job may consume significant CPU for minutes
But jobs are independent
Perfectly parallelizable
On local machines:
You quickly saturate all cores
Everything else slows down
Overnight jobs become multi-day jobs
On VMs:
You must provision for peak demand
Most of the time that capacity sits unused
Cloud Run allows:
Multiple isolated CPU-heavy workers
Automatic parallelism
Short-lived compute bursts without long-term allocation
➡️ You pay for actual math performed, not idle silicon.
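From the client side, this fan-out is just many independent HTTP requests issued in parallel. Below is a minimal sketch that submits a batch to the service's /dock-smiles endpoint; the JSON field names (smiles, receptor) and the service URL are my assumptions, not the deployed API's actual schema, so check your own OpenAPI docs before using it.

```python
from concurrent.futures import ThreadPoolExecutor

import requests

BASE_URL = "https://YOUR-SERVICE-URL.run.app"  # replace with your Cloud Run URL

def build_job(smiles: str, receptor: str) -> dict:
    """Build one docking request payload (field names are assumed)."""
    return {"smiles": smiles, "receptor": receptor}

def submit(job: dict) -> dict:
    """POST a single job; each request may land on a separate container."""
    r = requests.post(f"{BASE_URL}/dock-smiles", json=job, timeout=600)
    r.raise_for_status()
    return r.json()

def submit_batch(smiles_list, receptor, workers=8):
    """Submit many independent jobs concurrently; Cloud Run scales out
    to absorb the burst, then back down when the batch drains."""
    jobs = [build_job(s, receptor) for s in smiles_list]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(submit, jobs))

if __name__ == "__main__":
    results = submit_batch(["CCO", "c1ccccc1"], receptor="my_target")
    print(results)
```

Because each job is independent, the only client-side tuning knob is the worker count; Cloud Run's max-instances setting caps the server side.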
3. Traditionally painful to scale
Historically, scaling docking meant:
Buying more hardware
Installing Linux
Compiling scientific libraries
Fighting dependency conflicts
Managing SSH access
Writing job schedulers
Debugging “works on my machine” issues
Even in the cloud:
Kubernetes adds operational overhead
VMs require patching, monitoring, security updates
Scaling requires DevOps expertise
With Cloud Run:
The entire environment is captured in a Dockerfile
Dependencies are frozen and reproducible
Scaling is automatic
Deployment is one command
Rollbacks are instant
➡️ Scaling docking stops being an infrastructure problem and becomes a science problem again.
4. Poorly suited to long-running VMs
Long-running VMs are optimized for:
Databases
APIs with constant traffic
Stateful services
Docking jobs are:
Stateless
Disposable
Restartable
Failure-tolerant
Input/output driven
VM drawbacks for docking:
You pay for uptime, not productivity
A crashed VM loses work
You must manage storage manually
Horizontal scaling is clumsy
Cloud Run containers:
Start fresh every time
Fail cleanly
Retry safely
Store inputs/outputs in GCS
Leave no residue
➡️ Docking wants ephemeral compute, not permanent servers.
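One way to make containers fail cleanly and retry safely is to derive each job's GCS output path deterministically from its inputs, so a retried job overwrites the same object instead of duplicating work. The bucket layout below is a sketch of the idea, not the pipeline's actual naming scheme:

```python
import hashlib
import json

def job_key(receptor: str, smiles: str, params: dict) -> str:
    """Hash the full job definition into a stable identifier."""
    payload = json.dumps(
        {"receptor": receptor, "smiles": smiles, "params": params},
        sort_keys=True,  # stable ordering -> stable hash
    )
    return hashlib.sha256(payload.encode()).hexdigest()[:16]

def output_path(receptor: str, smiles: str, params: dict) -> str:
    """GCS object key for a job's results; same inputs -> same path,
    so a crashed-and-retried container leaves no duplicates behind."""
    return f"results/{job_key(receptor, smiles, params)}/out.pdbqt"
```

With input-derived keys, "did this job already run?" becomes a single GCS existence check rather than a lookup in some stateful job database.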
Why This Architecture Changes the Game
By combining:
Cloud Run (compute)
Docker (reproducibility)
GCS (storage)
Browser-based orchestration
You get:
A lab that scales on demand
No server babysitting
No idle costs
No hardware lock-in
No geographic constraints
This is why the effort, intense as it was, was worth it.
Available for Private Deployment & Integration
I’m available to:
Privately deploy this pipeline in your Google Cloud project
Adapt it to your ligand/protein prep workflow
Integrate with browser-based tools, notebooks, or AI pipelines
Help teams move from local docking to cloud-scale docking
Engagements are by negotiation, depending on scope and integration depth.
If you’re building in:
Drug discovery
Computational chemistry
AI-driven molecular screening
Research automation
…and want infrastructure that actually works, feel free to reach out.
Minimal Python Sanity Check (Browser-Friendly)
Below is a very small Python script that verifies the Cloud Run docking service is alive and responding. It can be run from:
A local machine
Jupyter Notebook
Google Colab
Any browser-backed Python environment
import requests

# Replace with your deployed Cloud Run URL, for example:
BASE_URL = "https://vina-worker-palpmpusfa-uc.b.run.app"

def check_health():
    url = f"{BASE_URL}/health"
    r = requests.get(url, timeout=10)
    r.raise_for_status()
    return r.json()

def check_openapi():
    url = f"{BASE_URL}/openapi.json"
    r = requests.get(url, timeout=10)
    r.raise_for_status()
    return r.json()

if __name__ == "__main__":
    print("Checking service health...")
    print(check_health())
    print("\nChecking OpenAPI schema...")
    schema = check_openapi()
    print("Service title:", schema.get("info", {}).get("title"))
    print("Available endpoints:", list(schema.get("paths", {}).keys()))
Expected output:
Health check returns: {"status": "ok"}
OpenAPI schema lists endpoints such as:
/health
/dock
/dock-smiles
/docs
Once this passes, the system is ready for real docking jobs.
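One caveat: a service that has scaled to zero takes a few seconds to cold-start, so the very first request after an idle period can time out even when everything is healthy. A small retry-with-backoff wrapper (my addition, not part of the service; the URL is a placeholder) makes the sanity check robust to that:

```python
import time

import requests

def backoff_delays(attempts: int, base: float = 1.0) -> list:
    """Exponential backoff schedule: 1s, 2s, 4s, ..."""
    return [base * (2 ** i) for i in range(attempts)]

def wait_until_healthy(base_url: str, attempts: int = 5) -> dict:
    """Poll /health, sleeping between tries, until the container is warm."""
    last_err = None
    for delay in backoff_delays(attempts):
        try:
            r = requests.get(f"{base_url}/health", timeout=10)
            r.raise_for_status()
            return r.json()
        except requests.RequestException as err:
            last_err = err
            time.sleep(delay)
    raise RuntimeError(f"service never became healthy: {last_err}")

if __name__ == "__main__":
    print(wait_until_healthy("https://YOUR-SERVICE-URL.run.app"))
```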
Final Thought
This was labor-intensive, occasionally frustrating, and absolutely worth it. It turns decades-old, trusted docking software into a modern, scalable, cloud-native service, without compromising scientific integrity.
Old science. New leverage. Real results. If you want help deploying something like this correctly, I’m happy to talk.
— Mansour, January 2026