A genomics engine in Rust
Memory is a contract.
Most variant callers use memory that grows with your data, so “will this finish on my machine?” is something you find out the hard way. With Rosalind, you declare the RAM you have; it proves the job fits before it runs, never lets the job be silently OOM-killed, and emits a receipt anyone can reproduce byte-for-byte.
cargo install rosalind-bio
# the crate is rosalind-bio; the installed binary is `rosalind`
# or grab a prebuilt binary — no toolchain (macOS · Linux):
curl -fsSL https://raw.githubusercontent.com/logannye/rosalind/main/install.sh | sh
The contract, in four verbs
- Declare. State the RAM you have.
- Predict. plan reads the index header (no run) and tells you if it fits.
- Honor. --enforce refuses up front or fails loud. Never a silent OOM.
- Verify. A BLAKE3 receipt re-checks the run, offline, months later.
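The first three verbs can be pictured as a small type-level contract. The sketch below is illustrative only: `Budget`, `Plan`, `Refusal`, `plan`, and `enforce` are hypothetical names, and the cost model (a fixed overhead plus a per-read cost times the deepest local pileup) is an assumption, not Rosalind's actual API or formula.

```rust
// Hypothetical sketch: none of these names are Rosalind's real API.

/// Declare: the RAM the user states up front, in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Budget(pub u64);

/// Predict: a peak-memory estimate derived from metadata alone, before any run.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Plan {
    pub predicted_peak: u64,
}

/// Honor: the loud, up-front refusal returned instead of a silent OOM.
#[derive(Debug, PartialEq, Eq)]
pub enum Refusal {
    OverBudget { predicted_peak: u64, budget: u64 },
}

/// Assumed cost model: fixed overhead + (deepest local pileup * bytes per read).
/// Saturating arithmetic keeps an absurd input from wrapping into a small number.
pub fn plan(max_local_depth: u64, bytes_per_read: u64, fixed_overhead: u64) -> Plan {
    let working_set = max_local_depth.saturating_mul(bytes_per_read);
    Plan {
        predicted_peak: fixed_overhead.saturating_add(working_set),
    }
}

/// Accept the plan only if the predicted peak fits inside the declared budget.
pub fn enforce(plan: Plan, budget: Budget) -> Result<Plan, Refusal> {
    if plan.predicted_peak <= budget.0 {
        Ok(plan)
    } else {
        Err(Refusal::OverBudget {
            predicted_peak: plan.predicted_peak,
            budget: budget.0,
        })
    }
}
```

The design point is that the refusal is an ordinary value produced before any work starts, so a caller has to handle it explicitly rather than discover it when the kernel kills the process.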
Why it's different
- Bounded, predictable memory. Peak tracks local read depth, not file size — and the realized peak is recorded, so the bound is verifiable, not just claimed.
- Byte-for-byte reproducible. Identical inputs → an identical VCF + a content-addressed receipt. rosalind reproduce re-derives a result offline, something a non-deterministic caller can't do.
- Honest about uncertainty. Where the evidence is too thin, it abstains instead of guessing, and every claim it makes is a re-runnable check.
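To make "peak tracks local read depth, not file size" concrete, here is a minimal sketch of a coordinate-sorted sweep that keeps only the reads overlapping the current position. It is a generic illustration of the idea, not Rosalind's implementation; `Read` and `peak_active_reads` are hypothetical names, and reads are simplified to half-open intervals on a single contig.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// A read alignment simplified to a half-open interval [start, end) on one contig.
#[derive(Debug, Clone, Copy)]
pub struct Read {
    pub start: u64,
    pub end: u64,
}

/// Sweep reads sorted by `start` and return the largest number held at once.
/// The working set is the set of reads overlapping the current position, so
/// its peak equals the maximum local depth, however long the input is.
pub fn peak_active_reads(sorted_reads: &[Read]) -> usize {
    // Min-heap of end coordinates for the reads currently "in memory".
    let mut active_ends: BinaryHeap<Reverse<u64>> = BinaryHeap::new();
    let mut peak: usize = 0;
    for read in sorted_reads {
        // Evict every read that ended at or before this read's start.
        while let Some(&Reverse(end)) = active_ends.peek() {
            if end <= read.start {
                active_ends.pop();
            } else {
                break;
            }
        }
        active_ends.push(Reverse(read.end));
        peak = peak.max(active_ends.len());
    }
    peak
}
```

Under these assumptions, a thousand end-to-end reads never hold more than one read at a time, while three overlapping reads hold three: the bound follows coverage, not input length.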
See it for yourself
- Drag in a receipt → watch tampering get caught (in-browser)
- Reproduce a result, offline
- Re-run the claims harness
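The tamper check comes down to comparing stored digests against digests recomputed from the files in hand. The sketch below shows that shape only. To stay dependency-free it swaps in 64-bit FNV-1a, which is not cryptographic and is not BLAKE3; the `Receipt` layout and the `issue` and `verify` names are likewise assumptions, not Rosalind's receipt format.

```rust
/// Stand-in digest: 64-bit FNV-1a. NOT cryptographic and NOT BLAKE3;
/// it is used here only so the sketch needs no external crate.
pub fn digest(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3); // FNV 64-bit prime
    }
    h
}

/// Hypothetical receipt: one digest for the inputs, one for the output.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Receipt {
    pub input_digest: u64,
    pub output_digest: u64,
}

/// Record the digests at the end of a run.
pub fn issue(input: &[u8], output: &[u8]) -> Receipt {
    Receipt {
        input_digest: digest(input),
        output_digest: digest(output),
    }
}

/// Re-check offline: recompute both digests and compare with the receipt.
pub fn verify(receipt: &Receipt, input: &[u8], output: &[u8]) -> bool {
    receipt.input_digest == digest(input) && receipt.output_digest == digest(output)
}
```

Changing a single byte of the output, for example one alternate allele in a VCF line, makes `verify` return `false`. A real receipt would use a collision-resistant hash such as BLAKE3 so that a matching forgery cannot be constructed on purpose.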
Honest scope. Today: bounded whole-genome germline SNV calling over a sorted BAM + a portable index,
a reproducible feature substrate for ML, and the verifiable memory contract — single-threaded, with accuracy
measured on simulated diploid truth so far (a real GIAB benchmark is next). Building toward: extending the
same contract to index construction along a sublinear-space (~√t) curve, so “declare your RAM and it's honored”
holds end to end. It's a Rust library + CLI — not a black-box pipeline.