You handle hypotheses, judgments, and decisions. Your agents handle literature recon, simulations, code, documentation, and figures.
AI agents can produce a paper a day. Most of those papers won't reproduce. ASRP encodes the scientific method into the workflow itself.
of researchers have failed to reproduce another scientist's experiment. (Nature, 2016)
say there is a significant reproducibility crisis in their field — and AI is making it worse.
errors caught in one day in the iDEA case study below — before submission, not months after.
Every research project starts with the same 7-phase workflow. Bootstrap phases run on AI minutes — you steer immediately. Phase 7 then switches to "1 human day = 1 AI hour" for deep work.
Theorist Q&A in Discord
Web · arXiv · self-search
Synthesise + critique
User chooses path
Engineer feasibility
Time + budget lock-in
Execute · review · iterate
Each role has its own SOUL prompt, model class, skill manifest, and access permissions. The agent that proposes a hypothesis is never the one that signs off on it.
Lead scientist and the only user-facing voice. Owns hypotheses, literature reconnaissance, opportunity synthesis, plan construction, and the active loop.
Implements code, runs experiments, processes data — and independently recomputes every numerical result Theorist produces, using a different path.
Dispatcher, daily standup author, and red-team critic. Posts phase kickoffs that @mention Theorist and Engineer. Never mentions itself — bots don't receive self-mention events.
No new chat app to learn. ASRP spins up an embedded OpenClaw gateway and connects each agent as a real Discord bot — you talk to them in #general just like a teammate.
Every research project gets its own channel. Phase kickoffs are posted automatically. Daily standups summarize what shipped, what broke, and what's next.
These aren't best practices you have to remember. They're encoded in the workflow itself.
Register hypotheses with falsification criteria before running anything. The Researches Registry is your immutable baseline — no post-hoc storytelling.
Engineer must reproduce every numerical result via a different code path. Divergence triggers a Reviewer red-team pass before anything goes into the paper.
Every decision, every data point, every error correction logged forever. Full reproducibility from hypothesis to conclusion — exportable as CSV.
Every SOUL prompt instructs the agent to question assumptions, verify definitions, and trust data over authority. Self red-team pass before any deliverable ships.
Right model for the right task. Opus for reasoning, Sonnet for code, Haiku for dispatch. Daily budget caps with live cost tracking on the dashboard.
The proposer is never the validator. Reviewer red-teams every deliverable. SRW-v3 enforces sender ≠ mention target as a runtime invariant.
Auto-detect installed dependencies, batch-install what's missing, and keep each agent's capability profile in sync — pip, brew/apt/winget, cargo, clawhub, or guided manual install.
mpmath · NumPy · SciPy · SymPy · pandas · Numba · scikit-learn
Lean 4
Tectonic · nano-pdf
arxiv API · opendataloader-pdf
github · summarize · himalaya (email)
Git · tqdm · JupyterLab · jq · python3-venv
5 install types · pip · brew / apt / winget · cargo · clawhub · manual — all auto-detected
Every paper project lives as a directory tree with authors, sections, references, and stage gates. Three sample paper projects ship out of the box.
Outline + scaffold
Citations & anchors
Reproducible recipe
Tables, figures, data
Reviewer red-team
Export & archive
A built-in question bank across four disciplines. Hand them to your ASRP team and measure how a real research workflow stacks up against single-shot prompting.
Each discipline ships with 99 graded problems — easy warm-ups through grad-level open questions. Search, filter by difficulty, and route any problem straight into a new research project.
Race-guarded loading, debounced search, accessible keyboard nav, and a Local toggle so you can run the whole bank against your on-prem Ollama stack.
ASRP works fully offline with Ollama. Hardware detection, streaming model pulls, GPU/VRAM aware. Or deploy headless on a VPS — same desktop, no display required.
Detects RAM, GPU, VRAM and recommends models you can actually run.
One-click ollama pull with live progress and resume.
API keys live in the OS keychain. Local SQLite + JWT auth.
Auto-detects missing $DISPLAY and switches to headless mode.
Bring your own keys for Anthropic, Google, or OpenRouter. Or run everything locally with Ollama. Mix and match per agent — Theorist on Opus, Engineer on a local Qwen, Reviewer on Haiku.
Self-test suite (25 checks) verifies the install end-to-end. Auto-updater handles new releases.
A real ASRP run: a one-day DFT investigation that self-corrected twice and retracted itself before submission.
No CLI needed. Install the desktop app, complete the 5-step Setup Wizard, and start your first research project.
Get the installer for macOS, Windows, or Linux. Auto-updater takes care of the rest.
5 steps: profile, API keys, agent config, Discord bots, launch. ~10 minutes.
One click on the dashboard spins up the embedded OpenClaw gateway and your 3 bots.
@mention Theorist with your research question. The team takes it from there.
Inspect every line. Self-host on your terms. Shape the future of AI-augmented research.
Every line of code is auditable on GitHub. See exactly how agents make decisions and where your data flows.
Run ASRP on your laptop, your lab workstation, or a headless VPS. Your research data never leaves your network.
Bring your own LLM. Anthropic, Google, OpenRouter, or local Ollama — swap freely per agent.
Contribute SOUL templates, skill manifests, benchmarks, and domain workflows that benefit every researcher.