Architecture Overview
HORUS is built on four foundational concepts that work together to create a high-performance robotics runtime: the node model, communication, the scheduler, and the memory system.
The Node Model
Everything in HORUS is a Node. A node is an independent unit of computation with a well-defined lifecycle.
Why Nodes?
Robotics systems are inherently modular. A robot has sensors, actuators, planners, and controllers - each with different timing requirements and failure modes. By making each component a node, HORUS provides:
- Isolation - A failing camera driver doesn't crash your motion controller
- Composability - Mix and match nodes to build different robots
- Testability - Test each node independently before integration
- Reusability - Share nodes across projects via the package registry
Node Lifecycle
Every node follows the same lifecycle, ensuring predictable behavior:
| State | What Happens |
|---|---|
| Uninitialized | Node exists but hasn't started |
| Initializing | Setting up resources, connecting to hardware |
| Running | Actively processing - tick() called each cycle |
| Paused | Temporarily suspended, can resume instantly |
| Stopping | Cleaning up, releasing resources |
| Stopped | Fully shut down |
| Error | A failure occurred, but the node can recover |
| Crashed | Unrecoverable failure |
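The lifecycle table can be read as a small state machine. The sketch below is illustrative only: the `NodeState` enum mirrors the table, but `can_transition` and its specific rules are assumptions made for this example, not the actual HORUS API or its real transition set.

```rust
/// Lifecycle states, mirroring the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeState {
    Uninitialized,
    Initializing,
    Running,
    Paused,
    Stopping,
    Stopped,
    Error,
    Crashed,
}

/// Hypothetical transition rules; the real runtime may allow a different set.
pub fn can_transition(from: NodeState, to: NodeState) -> bool {
    use NodeState::*;
    match (from, to) {
        // Terminal states: nothing leaves Crashed or Stopped.
        (Crashed, _) | (Stopped, _) => false,
        // Normal startup path.
        (Uninitialized, Initializing) => true,
        (Initializing, Running) => true,
        // Pause and resume.
        (Running, Paused) | (Paused, Running) => true,
        // Orderly shutdown.
        (Running, Stopping) | (Paused, Stopping) => true,
        (Stopping, Stopped) => true,
        // Error is recoverable: the node may resume or shut down cleanly.
        (Initializing, Error) | (Running, Error) => true,
        (Error, Running) | (Error, Stopping) => true,
        // Any remaining (non-terminal) state may crash.
        (_, Crashed) => true,
        _ => false,
    }
}
```

Encoding the rules in one function keeps illegal jumps, such as going from Uninitialized straight to Running, easy to reject in one place.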
The Tick Model
Nodes don't run continuously - they tick. Each tick is a discrete unit of work:
fn tick(&mut self) {
    // Read inputs
    if let Some(sensor_data) = self.sensor_topic.recv() {
        // Process
        let command = self.compute_response(sensor_data);
        // Write outputs
        self.command_topic.send(command);
    }
}
This model enables:
- Deterministic timing - Know exactly when each node runs
- Profiling - Measure how long each tick takes
- Scheduling intelligence - The scheduler can optimize execution order
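To make the benefits above concrete, here is a minimal sketch of a loop that drives a node one tick at a time and records how long each tick takes. The `Tick` trait, `Counter` node, and `run_cycles` function are hypothetical stand-ins written with the standard library only; they are not the HORUS scheduler.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a node: one discrete unit of work per call.
pub trait Tick {
    fn tick(&mut self);
}

/// Example node that counts how many times it has been ticked.
pub struct Counter {
    pub ticks: u32,
}

impl Tick for Counter {
    fn tick(&mut self) {
        self.ticks += 1;
    }
}

/// Calls `tick` once per cycle and returns the measured duration of each call.
pub fn run_cycles(node: &mut dyn Tick, cycles: usize) -> Vec<Duration> {
    let mut durations = Vec::with_capacity(cycles);
    for _ in 0..cycles {
        let start = Instant::now();
        node.tick();
        durations.push(start.elapsed());
    }
    durations
}
```

Because the caller decides when `tick` runs, it can time every call and choose the order in which nodes execute, which is what makes profiling and scheduling possible.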
Communication
Nodes need to exchange data. HORUS provides a single Topic API that automatically selects the optimal backend based on how many nodes are communicating and whether they're in the same process.
Topic: One API, Automatic Optimization
You always use the same Topic::new() call. HORUS automatically detects the topology and selects the fastest communication path — from ~3ns (same-thread) to ~167ns (cross-process, many-to-many):
// Same API for all communication patterns
let topic: Topic<Image> = Topic::new("camera.image")?;
topic.send(&frame);
// Another node subscribes with the same API
let topic: Topic<Image> = Topic::new("camera.image")?;
if let Some(frame) = topic.recv() {
    // Process frame
}
No configuration needed — the backend is selected and upgraded transparently as participants join or leave.
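One way to picture the automatic selection is as a function of where the participants live. The following is a rough sketch under stated assumptions: the `Backend` and `Participant` types and the `select_backend` rule are invented for illustration and only model the three latency tiers mentioned above, not the real HORUS backends.

```rust
/// Hypothetical backend categories matching the latency tiers quoted above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    SameThread,
    SameProcess,
    CrossProcess,
}

/// Where a participant (publisher or subscriber) lives.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Participant {
    pub process_id: u32,
    pub thread_id: u64,
}

/// Picks the narrowest backend that still reaches every participant.
pub fn select_backend(participants: &[Participant]) -> Backend {
    let first = match participants.first() {
        Some(p) => p,
        None => return Backend::SameThread, // nothing to connect yet
    };
    if participants.iter().any(|p| p.process_id != first.process_id) {
        Backend::CrossProcess
    } else if participants.iter().any(|p| p.thread_id != first.thread_id) {
        Backend::SameProcess
    } else {
        Backend::SameThread
    }
}
```

Re-running the selection whenever a participant joins or leaves models the transparent upgrade: adding a subscriber in another process moves the result from `SameProcess` to `CrossProcess`.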
Cross-Process Communication
Topics work transparently across process boundaries. Data travels through shared memory with sub-microsecond latency, and simple fixed-size types automatically take an even faster zero-copy path.
The Scheduler
The scheduler is the brain of HORUS. It decides when and how nodes execute.
Why a Scheduler?
Without coordination, nodes would:
- Fight for CPU resources
- Miss real-time deadlines
- Waste cycles waiting for data that hasn't arrived
The HORUS scheduler solves these problems with intelligent orchestration.
Execution Modes
Different applications need different scheduling strategies:
| Mode | Best For | Tick Overhead |
|---|---|---|
| Sequential | Safety-critical, debugging | Minimal |
| Parallel | CPU-heavy workloads | Varies by core count |
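The difference between the two modes can be sketched with the standard library alone. In this example the `Tick` trait, the `Counter` node, and both functions are hypothetical; the real scheduler may dispatch work quite differently (for instance with a thread pool rather than one thread per node).

```rust
use std::thread;

/// Hypothetical node: any `Send` type with a `tick` method.
pub trait Tick: Send {
    fn tick(&mut self);
}

pub struct Counter {
    pub ticks: u32,
}

impl Tick for Counter {
    fn tick(&mut self) {
        self.ticks += 1;
    }
}

/// Sequential mode: nodes tick one after another in a fixed order.
pub fn tick_sequential<N: Tick>(nodes: &mut [N]) {
    for node in nodes.iter_mut() {
        node.tick();
    }
}

/// Parallel mode: each node ticks on its own scoped thread; the cycle ends
/// once every thread has joined.
pub fn tick_parallel<N: Tick>(nodes: &mut [N]) {
    thread::scope(|scope| {
        for node in nodes.iter_mut() {
            scope.spawn(move || node.tick());
        }
    });
}
```

Sequential execution gives a fixed, reproducible order, which suits debugging and safety-critical work. The parallel version pays thread coordination costs in exchange for using several cores.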
Profiling
The scheduler tracks node execution statistics for diagnostics and optimization:
- Runtime Profiler - Tracks how long each node takes (mean, stddev, min/max)
- Node Tiers - Annotate nodes with execution characteristics (UltraFast, Fast, Normal, etc.)
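A profiler that reports mean, standard deviation, and min/max does not need to store every sample. The sketch below uses Welford's online algorithm, a standard technique chosen for this example; the source does not say how the HORUS profiler computes its statistics, and the `TickStats` type is hypothetical.

```rust
/// Running statistics over tick durations, in nanoseconds.
#[derive(Debug, Clone)]
pub struct TickStats {
    count: u64,
    mean: f64,
    m2: f64, // sum of squared deviations from the running mean
    min_ns: f64,
    max_ns: f64,
}

impl TickStats {
    pub fn new() -> Self {
        TickStats {
            count: 0,
            mean: 0.0,
            m2: 0.0,
            min_ns: f64::INFINITY,
            max_ns: f64::NEG_INFINITY,
        }
    }

    /// Welford's online update: constant time and memory per sample.
    pub fn record(&mut self, nanos: f64) {
        self.count += 1;
        let delta = nanos - self.mean;
        self.mean += delta / self.count as f64;
        self.m2 += delta * (nanos - self.mean);
        self.min_ns = self.min_ns.min(nanos);
        self.max_ns = self.max_ns.max(nanos);
    }

    pub fn mean(&self) -> f64 {
        self.mean
    }

    /// Population standard deviation; 0.0 before any sample is recorded.
    pub fn stddev(&self) -> f64 {
        if self.count == 0 {
            0.0
        } else {
            (self.m2 / self.count as f64).sqrt()
        }
    }

    pub fn min(&self) -> f64 {
        self.min_ns
    }

    pub fn max(&self) -> f64 {
        self.max_ns
    }
}
```

Calling `record` once per tick keeps the bookkeeping cost constant no matter how long the node has been running.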
Safety Systems
Real robots need safety guarantees. The scheduler provides:
| Feature | Purpose |
|---|---|
| WCET Monitoring | Detect nodes exceeding time budgets |
| Fault Tolerance | Isolate failing nodes automatically |
| Watchdog Timers | Detect hung nodes |
| Black Box | Flight recorder for post-mortem analysis |
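Two of these features reduce to simple time comparisons. The sketch below shows one possible shape for a budget check and a watchdog, using only `std::time`; `BudgetCheck`, `check_budget`, and `Watchdog` are hypothetical names for illustration, not HORUS types.

```rust
use std::time::{Duration, Instant};

/// Outcome of comparing one tick against its time budget.
#[derive(Debug, PartialEq, Eq)]
pub enum BudgetCheck {
    WithinBudget,
    Overrun { excess: Duration },
}

/// WCET-style check: flag a tick that ran longer than its budget.
pub fn check_budget(elapsed: Duration, budget: Duration) -> BudgetCheck {
    match elapsed.checked_sub(budget) {
        Some(excess) if !excess.is_zero() => BudgetCheck::Overrun { excess },
        _ => BudgetCheck::WithinBudget,
    }
}

/// Watchdog: a node counts as hung if it has not reported within `timeout`.
pub struct Watchdog {
    timeout: Duration,
    last_heartbeat: Instant,
}

impl Watchdog {
    pub fn new(timeout: Duration, now: Instant) -> Self {
        Watchdog { timeout, last_heartbeat: now }
    }

    /// Record that the node completed a tick at time `now`.
    pub fn heartbeat(&mut self, now: Instant) {
        self.last_heartbeat = now;
    }

    /// True when more than `timeout` has passed since the last heartbeat.
    pub fn is_hung(&self, now: Instant) -> bool {
        now.duration_since(self.last_heartbeat) > self.timeout
    }
}
```

Passing `now` in explicitly, rather than reading the clock inside the methods, keeps both checks deterministic and easy to test.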
Memory System
Large data (images, point clouds, ML tensors) needs special handling. A single 4K RGB frame at 8 bits per channel is about 25 MB (3840 × 2160 × 3 bytes), so copying it between nodes on every frame would destroy performance.
Zero-Copy Design
HORUS uses shared memory pools for large data. The image data is written once to shared memory, and each subscriber reads directly from the same memory location with no copying.
TensorPool
TensorPool manages shared memory allocation:
// Auto-managed pool via Topic<Tensor>
let topic: Topic<Tensor> = Topic::new("camera.rgb")?;
let handle = topic.alloc_tensor(&[1080, 1920, 3], TensorDtype::U8, Device::cpu())?;
// Write data (only done once)
let data = handle.data_slice_mut()?;
camera.capture_into(data);
// Send through Topic - only a lightweight descriptor is copied, not the image
topic.send_handle(&handle);
TensorPool characteristics:
- Fast allocation (~100ns)
- Automatic reference counting
- Works across processes
- Device-aware descriptors (CPU, future GPU support)
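The idea of sending a lightweight descriptor with automatic reference counting can be illustrated in-process with `std::sync::Arc`. This is only an analogy: the real TensorPool uses shared memory that works across processes, which `Arc` does not, and the `TensorHandle` type below is a hypothetical stand-in rather than the HORUS handle type.

```rust
use std::sync::Arc;

/// Hypothetical lightweight descriptor: shape metadata plus a shared
/// pointer to the underlying byte buffer.
#[derive(Clone)]
pub struct TensorHandle {
    pub shape: [usize; 3],
    data: Arc<[u8]>,
}

impl TensorHandle {
    /// Creates a zero-filled buffer; later clones share it instead of copying.
    pub fn new(shape: [usize; 3]) -> Self {
        let len: usize = shape.iter().product();
        TensorHandle { shape, data: Arc::from(vec![0u8; len]) }
    }

    pub fn data(&self) -> &[u8] {
        &self.data
    }

    /// Number of live handles that reference the buffer.
    pub fn ref_count(&self) -> usize {
        Arc::strong_count(&self.data)
    }

    /// True when both handles point at the same allocation.
    pub fn shares_buffer_with(&self, other: &TensorHandle) -> bool {
        Arc::ptr_eq(&self.data, &other.data)
    }
}
```

Cloning a handle copies the shape and bumps a counter, and the buffer is freed only when the last handle is dropped. That is the same ownership pattern the reference-counting bullet describes.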
Python Integration
Python nodes share the same memory pool:
import horus
import numpy as np
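# Receive tensor from Rust node (assumes `topic` was created earlier in this node)
tensor = topic.recv()
# Zero-copy numpy view - no data copied!
array = np.array(tensor, copy=False)
# Process with numpy/PyTorch (`model` stands for any model you have already loaded)
result = model.predict(array)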
Data Flow Example
Here's how these concepts work together in a typical perception-to-action pipeline:
| Connection | Mechanism | Why |
|---|---|---|
| Camera → Detector | TensorPool | Large image, zero-copy |
| Detector → Planner | Topic | Multiple planners might subscribe |
| Planner → Controller | Topic | Monitoring tools can observe |
| Controller → Motors | Topic | Direct pipeline connection |
Total pipeline latency: Under 1 microsecond for message passing (same-process backends).
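The sub-microsecond figure can be sanity-checked with simple arithmetic. The sketch below assumes each of the four connections costs at most the upper same-process figure quoted in this overview (~36 ns), and that the TensorPool hop only pays for sending the descriptor. The constant and function names are illustrative.

```rust
/// Upper-bound same-process latency per hop, in nanoseconds
/// (taken from the ~18-36 ns range quoted in this overview).
pub const SAME_PROCESS_HOP_MAX_NS: u64 = 36;

/// Worst-case message-passing latency for a linear pipeline with `hops`
/// connections, under the assumption that every hop hits the upper bound.
pub fn pipeline_latency_max_ns(hops: u64) -> u64 {
    hops * SAME_PROCESS_HOP_MAX_NS
}
```

For the four connections in the table this gives 4 × 36 = 144 ns, well under 1,000 ns. The estimate covers message passing only, not the time each node spends computing.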
Performance Summary
| Metric | Value |
|---|---|
| Same-thread topic | ~3 ns |
| Same-process topic | ~18-36 ns |
| Cross-process topic | ~50-167 ns |
| Scheduler tick overhead | ~50-100 ns |
| TensorPool allocation | ~100 ns |
Design Philosophy
HORUS is built on these principles:
- Nodes are the unit of composition - Build robots by connecting nodes
- Communication is explicit - No hidden data flow, everything goes through Topic
- The scheduler is your friend - Let it optimize; don't fight it
- Zero-copy by default - Large data should never be copied unnecessarily
- Safety is not optional - Fault tolerance, watchdogs, and black boxes are built in
Next Steps
- Quick Start - Build your first HORUS application
- Core Concepts: Nodes - Deep dive into the node model
- Core Concepts: Topic - Advanced pub/sub patterns
- Scheduler Configuration - Tuning for real-time