# Architecture Overview
A warehouse robot picks orders from shelves. Its camera captures 30 frames per second, each frame 6 megabytes. A vision model detects items. A planner computes a path around obstacles. A motor controller executes the path at 1 kHz. A safety monitor watches for collisions.
These components have fundamentally different timing requirements — the safety monitor must never be late, the vision model needs GPU time, the planner needs CPU time, and the motor controller needs sub-millisecond predictability. They also share large data: the camera frame must reach the vision model without being copied, and the planner's velocity commands must reach the motor controller in nanoseconds, not milliseconds.
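A quick back-of-envelope with the scenario's own numbers shows why both constraints must be handled at once (illustrative arithmetic only):

```rust
// Illustrative arithmetic using the figures above.
fn main() {
    let camera_bytes_per_sec = 30u64 * 6_000_000; // 30 fps x 6 MB = 180 MB/s sustained
    let motor_tick_us = 1_000_000 / 1_000;        // 1 kHz control loop = 1000 us per tick
    println!("camera: {camera_bytes_per_sec} B/s; motor budget: {motor_tick_us} us/tick");
}
```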
HORUS solves this with four primitives: Nodes (isolated components), Topics (zero-copy communication), a Scheduler (priority-based orchestration), and a Memory System (shared memory pools for large data).
## How It Works
### Nodes
Everything in HORUS is a Node — an independent component with a well-defined lifecycle. Each node implements tick(), which the scheduler calls repeatedly:
```rust
// simplified
fn tick(&mut self) {
    // non-blocking receive: None means no new message arrived this tick
    if let Some(sensor_data) = self.sensor_topic.recv() {
        let command = self.compute_response(sensor_data);
        self.command_topic.send(command);
    }
}
```
The tick model enables deterministic timing (know exactly when each node runs), profiling (measure how long each tick takes), and scheduling intelligence (the scheduler can optimize execution order).
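To make that concrete, here is a minimal sketch of a tick loop from the scheduler's side. It is illustrative only: the Node trait below stands in for whatever abstraction HORUS actually uses, and the loop is not the real scheduler.

```rust
use std::time::Instant;

// Illustrative stand-in for the node abstraction; not the HORUS API.
trait Node {
    fn tick(&mut self);
}

struct Counter(u64);

impl Node for Counter {
    fn tick(&mut self) {
        self.0 += 1; // stand-in for real per-tick work
    }
}

fn main() {
    let mut nodes: Vec<Box<dyn Node>> = vec![Box::new(Counter(0))];
    // Because the scheduler owns every tick() call, it gets deterministic
    // ordering (iteration order = execution order) and per-tick timing.
    for node in nodes.iter_mut() {
        let start = Instant::now();
        node.tick();
        let _elapsed = start.elapsed(); // raw material for budgets and deadlines
    }
}
```

Everything the Scheduler section below describes (ordering, budgets, deadlines, the watchdog) hangs off this single call site.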
### Topics
Nodes communicate through Topics — named shared-memory channels. You always use the same Topic::new() call; HORUS automatically selects the fastest backend based on topology:
```rust
// simplified
let topic: Topic<Image> = Topic::new("camera.image")?;
topic.send(&frame);       // publisher
let frame = topic.recv(); // subscriber (another node)
```
| Backend | Latency | When selected |
|---|---|---|
| Same-thread | ~14 ns | Publisher and subscriber on same thread |
| Same-process | ~82–182 ns | Same process, different threads |
| Cross-process | ~162–165 ns | Different processes, shared memory |
No configuration needed — the backend upgrades transparently as participants join or leave.
### Scheduler
The scheduler orchestrates node execution with priority-based ordering, five execution classes, deadline monitoring, and graceful degradation:
| Feature | Purpose |
|---|---|
| Execution order | .order(n) — lower runs first each tick |
| 5 execution classes | BestEffort, Rt, Compute, Event, AsyncIo — auto-detected from node config |
| Deadline enforcement | .budget(), .deadline(), .on_miss() — graduated response to overruns |
| Watchdog | Detects frozen nodes with graduated degradation (warn → reduce rate → isolate → safe state) |
| BlackBox | Flight recorder for post-crash forensics |
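Put together, a node's scheduling policy might be configured like this. The hooks .order, .budget, .deadline, and .on_miss come from the table above; the builder shape, the OnMiss value, and the scheduler.add() registration call are assumptions for illustration, not confirmed API:

```rust
// sketch: builder chaining, OnMiss::Isolate, and scheduler.add() are assumed
use std::time::Duration;

scheduler.add(
    safety_monitor
        .order(0)                            // lower runs first each tick
        .budget(Duration::from_micros(200))  // expected per-tick cost
        .deadline(Duration::from_millis(1))  // hard per-tick bound
        .on_miss(OnMiss::Isolate),           // response if the deadline is missed
);
```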
### Memory System
Large data (images, point clouds, tensors) uses shared memory pools for zero-copy transfer:
```rust
// simplified
let mut img = Image::new(1920, 1080, ImageEncoding::Rgb8)?; // backed by a shared pool
camera.capture_into(img.data_mut());                        // capture straight into pool memory
let topic: Topic<Image> = Topic::new("camera.rgb")?;
topic.send(&img); // zero-copy — only a descriptor crosses the ring buffer
```
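On the receiving side, the subscriber reads the same pool memory. A sketch, assuming data() is the read-only counterpart of the data_mut() call above (that counterpart, and the detector, are illustrative):

```rust
// sketch: data() is assumed by analogy with data_mut() above
if let Some(img) = topic.recv() {
    let pixels: &[u8] = img.data(); // reads the publisher's pool buffer in place
    detector.process(pixels);       // no copy occurred anywhere on this path
}
```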
## Data Flow Example
A typical perception-to-action pipeline: camera → vision model → planner → motor controller, with each arrow crossing a topic.
Total message-passing latency: under 1 microsecond (same-process backends).
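As a sketch, wiring that pipeline might look like the following. The topic names, node types, and the Scheduler type with its add/run methods are illustrative assumptions; Topic::new(), tick(), and .order() are described in the sections above:

```rust
// sketch: node types and Scheduler::new/.add/.run are assumed names
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut sched = Scheduler::new();
    sched.add(SafetyMonitor::new()?.order(0));                  // must never be late
    sched.add(MotorController::new("cmd_vel")?.order(1));       // 1 kHz execution
    sched.add(Planner::new("detections", "cmd_vel")?.order(2));
    sched.add(VisionModel::new("camera.rgb", "detections")?.order(3));
    sched.add(CameraDriver::new("camera.rgb")?.order(4));
    sched.run() // ticks every node in order, indefinitely
}
```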
## Design Decisions
Why nodes instead of functions? Robotics systems need isolation. A crashing camera driver shouldn't bring down the motor controller. Nodes provide fault boundaries — the scheduler can isolate a failing node while the rest of the system continues. Functions in a monolith share a call stack and a single point of failure.
Why a single Topic API instead of separate in-process and cross-process APIs? During development, you run everything in one process. In production, you split across processes for isolation. If the communication API changed between these modes, you'd have to rewrite code for deployment. The single Topic::new() API means the same code works in both modes — HORUS selects the optimal backend automatically.
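For instance, a vision node's body stays identical whether it shares a process with the camera (development) or runs alone (production); only the launch layout changes. A sketch, with detect() and the Detections type as placeholder names:

```rust
// sketch: detect() and Detections are placeholders; nothing in this
// function knows or cares which backend the topics are using
fn vision_tick(input: &Topic<Image>, output: &Topic<Detections>) {
    if let Some(frame) = input.recv() {
        output.send(&detect(&frame)); // backend chosen by topology at runtime
    }
}
```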
Why a tick model instead of threads? Threads are hard to reason about: priority inversion, lock contention, non-deterministic scheduling. The tick model gives the scheduler full control over execution order. For nodes that genuinely need their own thread (RT, Compute, AsyncIo), the scheduler creates one — but it manages the lifecycle, not the application.
Why shared memory instead of message passing? A 4K RGB image is 24 MB. Serializing, copying, and deserializing it through a socket costs milliseconds. Shared memory costs nanoseconds — the subscriber reads directly from the publisher's memory. For small messages (CmdVel, Imu), the difference is less dramatic but still roughly 300× faster than DDS.
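The arithmetic is worth spelling out. The 24 MB frame is from the paragraph above; the ~10 GB/s effective copy bandwidth is an assumed round number for illustration:

```rust
// Back-of-envelope: one copy of a 4K RGB frame at an assumed ~10 GB/s.
// A serialize/socket/deserialize path pays this cost several times over;
// a shared-memory descriptor is a fixed hop of a few hundred nanoseconds.
fn main() {
    let frame_bytes = 3840u64 * 2160 * 3;          // 24,883,200 B ≈ 24 MB
    let copy_ms = frame_bytes as f64 / 10e9 * 1e3; // ≈ 2.5 ms per copy
    println!("{frame_bytes} bytes, ~{copy_ms:.1} ms per copy");
}
```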
## Trade-offs
| Gain | Cost |
|---|---|
| Sub-microsecond IPC via shared memory | Single-machine only — no built-in cross-network transport |
| Deterministic execution order via scheduler | All interacting nodes must be in the same scheduler (or use cross-process topics) |
| Zero-copy large data (images, point clouds) | Fixed-size ring buffers — must choose capacity at topic creation |
| Automatic backend selection (in-process vs SHM) | Can't force a specific backend — the system chooses |
| Five execution classes for different workload types | More complex scheduler internals |
| Node isolation with fault boundaries | Node crashes still lose that node's state — no automatic restart by default |
## Performance Summary
| Metric | Value |
|---|---|
| Same-thread topic | ~14 ns |
| Same-process topic | ~82–182 ns |
| Cross-process topic | ~162–165 ns |
| Scheduler tick overhead | ~50–100 ns |
| Shared memory allocation | ~100 ns |
| Framework memory overhead | ~2 MB |
See Benchmarks for exact measured numbers.
## See Also
- What is HORUS? — Overview and positioning
- Nodes (Concept) — Deep dive into the node model
- Topic (Concept) — Communication architecture
- Scheduler (Concept) — Tick loop and execution classes
- Execution Classes — The 5 classes and when to use each