Deep learning with memory safety
RUMUS is a native-Rust deep learning framework that satisfies the borrow checker while delivering PyTorch-like ergonomics. Zero-cost abstractions, GPU acceleration, and compile-time safety guarantees.
cargo add rumus
Why RUMUS?
Most deep learning frameworks achieve safety through runtime reference counting. RUMUS enforces memory safety as a first-class language constraint — checked at compile time, not runtime.
Memory Safe by Design
Rust's borrow checker enforces memory safety at compile time. No runtime reference counting, no data races, no use-after-free — guaranteed by the type system.
Zero-Cost Abstractions
View operations like reshape and transpose are metadata-only — zero memory allocation. Inference mode completely bypasses the autograd tape with no overhead.
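The "metadata-only" claim can be made concrete with a small sketch: a 2-D transpose that only swaps shape/stride metadata and never touches the underlying buffer. The `View` type here is illustrative, not RUMUS's internal layout representation.

```rust
// A strided view over a flat buffer: transpose swaps metadata, copies nothing.
struct View {
    shape: [usize; 2],
    strides: [usize; 2], // elements to skip per step along each axis
}

impl View {
    fn row_major(rows: usize, cols: usize) -> Self {
        View { shape: [rows, cols], strides: [cols, 1] }
    }
    // O(1): no allocation, no copy — just swapped metadata.
    fn transpose(&self) -> Self {
        View {
            shape: [self.shape[1], self.shape[0]],
            strides: [self.strides[1], self.strides[0]],
        }
    }
    fn offset(&self, i: usize, j: usize) -> usize {
        i * self.strides[0] + j * self.strides[1]
    }
}

fn main() {
    let data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]; // 2x3, row-major
    let v = View::row_major(2, 3);
    let t = v.transpose(); // a 3x2 view over the same buffer
    assert_eq!(data[t.offset(2, 1)], data[v.offset(1, 2)]); // both read 6.0
    println!("t[2][1] = {}", data[t.offset(2, 1)]);
}
```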
GPU Acceleration
WGPU compute backend with 40+ shader entry points. Per-resource fences eliminate global pipeline stalls. Buffer pooling with power-of-2 bucketing recycles GPU memory.
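The bucketing scheme named above can be sketched in a few lines: a freed buffer is filed under the next power of two at or above its size, so any later request that rounds to the same bucket can reuse it. This is a hypothetical pool shape, not the actual RUMUS `BufferPool` API.

```rust
use std::collections::HashMap;

// Power-of-2 bucketing: recycled buffers are keyed by rounded-up capacity.
struct Pool {
    free: HashMap<u64, Vec<u64>>, // bucket capacity -> ids of recycled buffers
}

impl Pool {
    fn bucket(size: u64) -> u64 {
        size.max(1).next_power_of_two()
    }
    fn release(&mut self, id: u64, size: u64) {
        self.free.entry(Self::bucket(size)).or_default().push(id);
    }
    fn acquire(&mut self, size: u64) -> Option<u64> {
        self.free.get_mut(&Self::bucket(size)).and_then(|v| v.pop())
    }
}

fn main() {
    let mut pool = Pool { free: HashMap::new() };
    pool.release(42, 1000); // a 1000-byte buffer lands in the 1024 bucket
    assert_eq!(Pool::bucket(1000), 1024);
    assert_eq!(pool.acquire(900), Some(42)); // 900 bytes hits the same bucket
    assert_eq!(pool.acquire(900), None);     // bucket is now empty
    println!("recycled buffer 42");
}
```

Rounding to powers of two trades some internal fragmentation for far fewer distinct bucket sizes, which keeps reuse hit rates high.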
PyTorch-Like Ergonomics
Familiar eager execution model. Define-by-run autograd. Module system with #[derive(Module)] proc macro. If you know PyTorch, you already know RUMUS.
Concrete Autograd
BackwardOp is a concrete enum with 16 variants — not opaque closures. Every backward operation is inspectable, Send + Sync safe, and deterministic.
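A minimal sketch shows why a concrete enum beats opaque closures: each variant is plain data carrying exactly what its backward rule needs, so every op can be matched, printed, and sent across threads. The three variants below are illustrative, not RUMUS's actual sixteen.

```rust
// Backward ops as plain data — inspectable, deterministic, Send + Sync.
#[derive(Debug, Clone, Copy, PartialEq)]
enum BackwardOp {
    Add,                        // dL/da = dL/db = upstream grad
    Mul { lhs: f64, rhs: f64 }, // saved forward operands
    Relu { input_positive: bool },
}

// Given the upstream gradient, produce gradients for each input.
fn backward(op: BackwardOp, grad: f64) -> (f64, f64) {
    match op {
        BackwardOp::Add => (grad, grad),
        BackwardOp::Mul { lhs, rhs } => (grad * rhs, grad * lhs),
        BackwardOp::Relu { input_positive } => {
            (if input_positive { grad } else { 0.0 }, 0.0)
        }
    }
}

fn main() {
    let op = BackwardOp::Mul { lhs: 3.0, rhs: 4.0 };
    println!("{op:?}"); // the op is ordinary data, so it debug-prints
    assert_eq!(backward(op, 1.0), (4.0, 3.0));
    // Trivially Send + Sync — no captured closures involved.
    fn assert_send_sync<T: Send + Sync>() {}
    assert_send_sync::<BackwardOp>();
}
```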
Safe Serialization
Safetensors format with zero unsafe code via bytemuck. Save and load model state dicts with dot-path naming, just like PyTorch's state_dict().
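The byte-level idea behind safe serialization can be shown without any unsafe code: an f32 tensor becomes a flat little-endian byte buffer and round-trips back losslessly. A real safetensors file additionally prefixes a JSON header mapping tensor names to dtypes, shapes, and byte offsets; this sketch covers only the payload.

```rust
// Lossless f32 <-> little-endian byte round-trip with zero unsafe code.
fn to_bytes(data: &[f32]) -> Vec<u8> {
    data.iter().flat_map(|x| x.to_le_bytes()).collect()
}

fn from_bytes(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

fn main() {
    let weights = vec![0.5_f32, -1.25, 3.0];
    let bytes = to_bytes(&weights);
    assert_eq!(bytes.len(), 12); // 4 bytes per f32
    assert_eq!(from_bytes(&bytes), weights);
    println!("round-trip ok");
}
```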
Three orthogonal layers
The Tensor is not a junk drawer. RUMUS strictly partitions internal state into three independent layers, each with a single responsibility. This makes the framework auditable, predictable, and easy to extend.
Storage Layer
Raw memory management with CPU/GPU unified addressing, version tracking, and per-resource fences
Layout Layer
Shape, strides, and view semantics — reshape and transpose are zero-allocation metadata operations
Autograd Layer
Gradient tracking with append-only Wengert tape, Kahn's algorithm backward pass, and concrete BackwardOp enum
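The role of Kahn's algorithm in the autograd layer can be sketched on a toy tape: a node is visited only after every node that consumes it, which is exactly the order a backward pass requires. The node/edge layout here is illustrative, not RUMUS's tape format.

```rust
use std::collections::VecDeque;

/// edges[i] lists the inputs of node i (i consumes them).
/// Returns a Kahn's-algorithm order: every node after all its consumers.
fn backward_order(n: usize, edges: &[Vec<usize>]) -> Vec<usize> {
    // pending[j] = consumers of j that haven't run yet
    let mut pending = vec![0usize; n];
    for inputs in edges {
        for &j in inputs {
            pending[j] += 1;
        }
    }
    // Start from nodes nothing consumes (the loss).
    let mut queue: VecDeque<usize> = (0..n).filter(|&i| pending[i] == 0).collect();
    let mut order = Vec::with_capacity(n);
    while let Some(i) = queue.pop_front() {
        order.push(i);
        for &j in &edges[i] {
            pending[j] -= 1;
            if pending[j] == 0 {
                queue.push_back(j);
            }
        }
    }
    order
}

fn main() {
    // Toy tape — 0: x, 1: y = f(x), 2: loss = g(y, x)
    let edges = vec![vec![], vec![0], vec![1, 0]];
    let order = backward_order(3, &edges);
    assert_eq!(order, vec![2, 1, 0]); // loss first, leaf last
    println!("{order:?}");
}
```

Because the tape is append-only, node indices already respect forward order, and this pass also yields deterministic gradient accumulation.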
Familiar and expressive
If you know PyTorch, you already know RUMUS. Define models with structs, derive the Module trait, and train with eager execution.
use rumus::nn::{self, Linear, Module};
use rumus::optim::Adam;
use rumus::autograd;
use rumus::Tensor;

#[derive(Module)]
struct Net {
    fc1: Linear,
    fc2: Linear,
}

impl Net {
    fn new() -> Self {
        Self {
            fc1: Linear::new(784, 128),
            fc2: Linear::new(128, 10),
        }
    }

    fn forward(&self, x: &Tensor) -> Tensor {
        let h = nn::relu(&self.fc1.forward(x));
        self.fc2.forward(&h)
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let model = Net::new();
    let mut opt = Adam::new(model.parameters(), 0.001);
    // `inputs` and `targets` are assumed to be pre-loaded training Tensors.
    for epoch in 0..100 {
        let pred = model.forward(&inputs);
        let loss = nn::cross_entropy_loss(&pred, &targets);
        let mut grads = autograd::backward(&loss)?;
        opt.step(&mut grads)?;
        println!("Epoch {epoch}: loss = {:.4}", loss.item());
    }
    nn::save_safetensors(&model.state_dict(""), "model.safetensors")?;
    Ok(())
}
Complete training ecosystem
Everything you need to define, train, and deploy deep learning models — from autograd to GPU-fused optimizers.
Layers & Modules
Linear, Conv2d, MaxPool2d, Flatten, Dropout — with automatic parameter collection via #[derive(Module)].
Optimizers
SGD with momentum, Adam, and AdamW with decoupled weight decay. All GPU-fused for zero host-device round-trips during training.
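What "decoupled weight decay" means can be shown with one AdamW step on a single scalar parameter: the decay term is applied directly to the weight, outside the adaptive gradient update. Constants are the usual Adam defaults; this is a CPU sketch, not RUMUS's fused GPU kernel.

```rust
// One AdamW step on a scalar parameter with decoupled weight decay.
#[allow(clippy::too_many_arguments)]
fn adamw_step(
    w: &mut f64, grad: f64, m: &mut f64, v: &mut f64, t: i32,
    lr: f64, beta1: f64, beta2: f64, eps: f64, weight_decay: f64,
) {
    *m = beta1 * *m + (1.0 - beta1) * grad;        // first-moment estimate
    *v = beta2 * *v + (1.0 - beta2) * grad * grad; // second-moment estimate
    let m_hat = *m / (1.0 - beta1.powi(t));        // bias correction
    let v_hat = *v / (1.0 - beta2.powi(t));
    // Adaptive update from the gradient...
    *w -= lr * m_hat / (v_hat.sqrt() + eps);
    // ...then decoupled decay: shrink the weight itself, independent of grad.
    *w -= lr * weight_decay * *w;
}

fn main() {
    let (mut w, mut m, mut v) = (1.0, 0.0, 0.0);
    adamw_step(&mut w, 0.5, &mut m, &mut v, 1, 0.001, 0.9, 0.999, 1e-8, 0.01);
    assert!(w < 1.0); // moved against the gradient, then decayed
    println!("w = {w}");
}
```

Fusing this whole update into one GPU kernel is what avoids the host-device round-trips an eager per-op implementation would incur.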
Loss Functions
MSE and Cross-Entropy with Log-Sum-Exp numerical stability. Gradients are pre-computed in the forward pass for efficiency.
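The Log-Sum-Exp trick mentioned above is worth seeing once: subtracting the max logit before exponentiating keeps `exp()` from overflowing while leaving the log-softmax value mathematically unchanged. This is an illustrative scalar version, not RUMUS's kernel.

```rust
// Numerically stable cross-entropy: -log softmax(logits)[target],
// computed via the Log-Sum-Exp trick.
fn cross_entropy(logits: &[f64], target: usize) -> f64 {
    let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    // log(sum(exp(z))) = max + log(sum(exp(z - max))) — no overflow possible.
    let lse = max + logits.iter().map(|&z| (z - max).exp()).sum::<f64>().ln();
    lse - logits[target]
}

fn main() {
    // Logits this large overflow a naive softmax: e^1000 is infinity in f64.
    let logits = [1000.0, 998.0, 995.0];
    let loss = cross_entropy(&logits, 0);
    assert!(loss.is_finite());
    // Shifting every logit by a constant must not change the loss.
    let shifted: Vec<f64> = logits.iter().map(|z| z - 500.0).collect();
    assert!((loss - cross_entropy(&shifted, 0)).abs() < 1e-12);
    println!("loss = {loss:.4}");
}
```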
Training Loop
Trainer struct with closure-based train_step(), automatic epoch loss tracking, and BufferPool memory recycling on Drop.
Ready to get started?
Add RUMUS to your Rust project and start building memory-safe deep learning models today.