Multi-Process Architecture
HORUS topics work transparently across process boundaries. Two nodes in separate processes communicate with exactly the same code as two nodes in the same process; across processes, the transport is shared memory. No broker, no serialization layer, no configuration.
To orchestrate multiple processes with session discovery, control routing, and e-stop propagation, see Launch System.
// simplified
// Process 1: sensor.rs
let topic: Topic<Imu> = Topic::new("imu")?;
topic.send(imu_reading);
// Process 2: controller.rs
let topic: Topic<Imu> = Topic::new("imu")?; // same name = same topic
if let Some(reading) = topic.recv() {
// Got it — zero-config, sub-microsecond
}
How It Works
When you call Topic::new("imu"), HORUS creates (or opens) a shared memory region. Any process on the same machine that calls Topic::new("imu") with the same type connects to the same underlying ring buffer. The shared memory backend is managed by horus_sys — you never configure paths manually.
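The exact mapping is internal to horus_sys, but the idea can be illustrated with a toy example. The naming scheme below is a made-up assumption, not the real layout; the point is only that the mapping is deterministic, so independently started processes converge on the same region.

```rust
// Illustration only: a hypothetical name-to-path scheme, not horus_sys internals.
fn shm_path_for(topic: &str) -> String {
    // The same topic name always yields the same path, so any process that
    // opens "imu" lands on the same shared memory region.
    format!("/dev/shm/horus_{}", topic.replace('.', "_"))
}

fn main() {
    assert_eq!(shm_path_for("imu"), "/dev/shm/horus_imu");
    assert_eq!(shm_path_for("wheel.encoders"), "/dev/shm/horus_wheel_encoders");
}
```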
HORUS auto-detects whether a topic is same-process or cross-process and picks the fastest path:
| Scenario | Latency | How It Works |
|---|---|---|
| Same thread | ~3ns | Direct pointer handoff |
| Same process, 1:1 | ~18ns | Lock-free single-producer/single-consumer ring buffer |
| Same process, 1:N | ~24ns | Broadcast to multiple in-process subscribers |
| Same process, N:1 | ~26ns | Multiple in-process publishers, one subscriber |
| Same process, N:N | ~36ns | Full many-to-many in-process |
| Cross-process, POD type | ~50ns | Zero-copy shared memory (no serialization) |
| Cross-process, N:1 | ~65ns | Shared memory, multiple publishers |
| Cross-process, 1:N | ~70ns | Shared memory, multiple subscribers |
| Cross-process, 1:1 | ~85ns | Shared memory, serialized type |
| Cross-process, N:N | ~91ns | Shared memory, contention-free fan-out |
Cross-process adds ~30-130ns vs in-process — still sub-microsecond. You don't configure any of this. The backend is selected automatically based on topology and upgrades transparently as participants join or leave.
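The selection rule can be pictured as a function of topology. The sketch below is conceptual, with assumed backend names and decision logic; it is not the actual HORUS implementation.

```rust
// Conceptual sketch of topology-based backend selection. Backend names and
// the decision logic here are illustrative assumptions, not HORUS internals.
#[derive(Debug, PartialEq)]
enum Backend {
    SpscRing,     // same process, one publisher, one subscriber
    InProcFanout, // same process, any other topology
    SharedMemory, // publishers and subscribers in different processes
}

fn select_backend(same_process: bool, publishers: usize, subscribers: usize) -> Backend {
    match (same_process, publishers, subscribers) {
        (true, 1, 1) => Backend::SpscRing,
        (true, _, _) => Backend::InProcFanout,
        (false, _, _) => Backend::SharedMemory,
    }
}

fn main() {
    // Re-evaluated as participants join or leave, so the backend upgrades in place.
    assert_eq!(select_backend(true, 1, 1), Backend::SpscRing);
    assert_eq!(select_backend(true, 1, 3), Backend::InProcFanout);
    assert_eq!(select_backend(false, 2, 2), Backend::SharedMemory);
}
```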
Running Multiple Processes
Option 1: horus run with Multiple Files
# Builds and runs both files as separate processes
horus run sensor.rs controller.rs
# Mixed languages work too
horus run sensor.py controller.rs
# With release optimizations
horus run -r sensor.rs controller.rs
horus run compiles each file, then launches all processes and manages their lifecycle (SIGTERM on Ctrl+C, etc.).
Option 2: Separate Terminals
Run each node in its own terminal:
# Terminal 1
horus run sensor.rs
# Terminal 2
horus run controller.rs
Topics auto-discover via shared memory. No coordination needed.
Option 3: horus launch (YAML)
For production, declare your multi-process layout in a launch file:
# launch.yaml
nodes:
  - name: sensor
    cmd: horus run sensor.rs
  - name: controller
    cmd: horus run controller.rs
  - name: monitor
    cmd: horus run monitor.py
horus launch launch.yaml
Example: Two-Process Sensor Pipeline
Process 1 — sensor.rs:
// simplified
use horus::prelude::*;
message! {
WheelEncoder {
left_ticks: i64,
right_ticks: i64,
timestamp_ns: u64,
}
}
struct EncoderNode {
publisher: Topic<WheelEncoder>,
ticks: i64,
}
impl EncoderNode {
fn new() -> Result<Self> {
Ok(Self {
publisher: Topic::new("wheel.encoders")?,
ticks: 0,
})
}
}
impl Node for EncoderNode {
fn name(&self) -> &str { "Encoder" }
fn tick(&mut self) {
self.ticks += 10;
self.publisher.send(WheelEncoder {
left_ticks: self.ticks,
right_ticks: self.ticks + 2,
timestamp_ns: horus::now_ns(),
});
}
}
fn main() -> Result<()> {
let mut sched = Scheduler::new().tick_rate(100_u64.hz());
sched.add(EncoderNode::new()?).order(0).build()?;
sched.run()?;
Ok(())
}
Process 2 — odometry.rs:
// simplified
use horus::prelude::*;
message! {
WheelEncoder {
left_ticks: i64,
right_ticks: i64,
timestamp_ns: u64,
}
}
struct OdometryNode {
encoder_sub: Topic<WheelEncoder>,
odom_pub: Topic<Odometry>,
last_left: i64,
last_right: i64,
}
impl OdometryNode {
fn new() -> Result<Self> {
Ok(Self {
encoder_sub: Topic::new("wheel.encoders")?,
odom_pub: Topic::new("odom")?,
last_left: 0,
last_right: 0,
})
}
}
impl Node for OdometryNode {
fn name(&self) -> &str { "Odometry" }
fn tick(&mut self) {
if let Some(enc) = self.encoder_sub.recv() {
let dl = enc.left_ticks - self.last_left;
let dr = enc.right_ticks - self.last_right;
self.last_left = enc.left_ticks;
self.last_right = enc.right_ticks;
println!("[Odom] delta L={} R={}", dl, dr);
}
}
}
fn main() -> Result<()> {
let mut sched = Scheduler::new().tick_rate(100_u64.hz());
sched.add(OdometryNode::new()?).order(0).build()?;
sched.run()?;
Ok(())
}
Run them:
# Terminal 1
horus run sensor.rs
# Terminal 2
horus run odometry.rs
The WheelEncoder messages flow through shared memory at ~50ns latency, with zero configuration.
When to Use Multi-Process
| Factor | Single Process | Multi-Process |
|---|---|---|
| Latency | ~3-36ns (intra-process) | ~50-171ns (cross-process) |
| Determinism | Full control via scheduler ordering | Each process has its own scheduler |
| Isolation | A crash takes down everything | A crash is contained to one process |
| Languages | Single language per binary | Mix Rust + Python freely |
| Restart | Must restart everything | Restart one process independently |
| Debugging | Single debugger session | Attach debugger to one process |
| Deployment | One binary to deploy | Multiple binaries |
| Complexity | Simpler | More moving parts |
Use single-process when:
- All nodes are the same language
- You need deterministic ordering between nodes (e.g., sensor → controller → actuator; see the sketch after this list)
- Latency matters at the nanosecond level
- Simpler deployment is preferred
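For comparison, here is a minimal single-process sketch of that ordering guarantee. It reuses the Scheduler and Node API from the examples above; the three node types are placeholders.

```rust
// Minimal single-process sketch. SensorNode/ControllerNode/ActuatorNode are
// placeholder nodes; the Scheduler/Node API follows the examples above.
use horus::prelude::*;

struct SensorNode;
struct ControllerNode;
struct ActuatorNode;

impl Node for SensorNode {
    fn name(&self) -> &str { "Sensor" }
    fn tick(&mut self) { /* read hardware, publish */ }
}
impl Node for ControllerNode {
    fn name(&self) -> &str { "Controller" }
    fn tick(&mut self) { /* consume sensor data, publish commands */ }
}
impl Node for ActuatorNode {
    fn name(&self) -> &str { "Actuator" }
    fn tick(&mut self) { /* apply commands */ }
}

fn main() -> Result<()> {
    let mut sched = Scheduler::new().tick_rate(100_u64.hz());
    // order() pins per-tick execution: sensor -> controller -> actuator.
    sched.add(SensorNode).order(0).build()?;
    sched.add(ControllerNode).order(1).build()?;
    sched.add(ActuatorNode).order(2).build()?;
    sched.run()?;
    Ok(())
}
```

Each tick runs sensor, then controller, then actuator, in that order; splitting these into separate processes trades that guarantee for isolation.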
Use multi-process when:
- Mixing Rust and Python (e.g., Rust motor control + Python ML inference)
- Process isolation is needed (safety-critical separation)
- Independent restart required (update one node without stopping others)
- Different update rates or lifecycle requirements
Introspection
HORUS CLI tools work across processes automatically:
# See all topics (from any process)
horus topic list
# Monitor a topic published by another process
horus topic echo wheel.encoders
# See all running nodes across processes
horus node list
# Check bandwidth across processes
horus topic bw wheel.encoders
Cleaning Up
Shared memory files persist after processes exit. Clean them with:
horus clean --shm # Remove stale shared memory regions
In practice, you rarely need this — HORUS automatically cleans stale SHM on every horus CLI command and every Scheduler::new() call. The manual command is an escape hatch for debugging.
What Happens When a Process Crashes
When a process dies (even via SIGKILL or power loss):
- SHM files persist — the kernel closes the file descriptor and releases flock locks, but the mmap'd file stays on disk
- Other processes continue — subscribers see dropped_count() increase if the publisher was mid-write, but they don't crash
- Backend auto-migrates — when the crashed process restarts and reconnects, the topic detects the new participant and migrates the backend (e.g., from 1:1 to 1:N) within ~10μs
- Automatic cleanup — the next horus CLI command or Scheduler::new() call auto-cleans stale namespaces (<1ms). No manual intervention needed.
# Process 1 crashes
# Process 2 keeps running, reading stale data from the ring buffer
# Process 1 restarts
# Process 2 sees fresh data again — no reconfiguration needed
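A minimal sketch of how a subscriber can ride this out, assuming dropped_count() is a method on the subscriber's Topic handle (as described above) and reusing the WheelEncoder message from the pipeline example:

```rust
// Sketch only; assumes Topic::dropped_count() returns a monotonically
// increasing counter, as the behavior described above implies.
fn poll_encoder(sub: &mut Topic<WheelEncoder>, last_dropped: &mut u64) {
    if let Some(enc) = sub.recv() {
        // Fresh data: the publisher is alive, or has restarted and reconnected.
        println!("ticks L={} R={}", enc.left_ticks, enc.right_ticks);
    }
    let dropped = sub.dropped_count();
    if dropped > *last_dropped {
        // The publisher died mid-write or we fell behind; log it and keep running.
        eprintln!("gap detected: {} messages dropped", dropped - *last_dropped);
        *last_dropped = dropped;
    }
}
```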
Type mismatches: If a restarted process changes its message type (e.g., from CmdVel to Twist), the join fails with an error. Both processes must use the same message type for the same topic name.
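For the same reason, it is worth handling the Result from Topic::new instead of unwrapping it when multiple processes share topics. A minimal sketch (the topic name is illustrative; CmdVel is the message type referenced above):

```rust
// Sketch only: surface a join failure (such as a type mismatch) instead of panicking.
fn open_cmd_topic() -> Option<Topic<CmdVel>> {
    match Topic::new("cmd_vel") {
        Ok(topic) => Some(topic),
        Err(e) => {
            eprintln!("could not join topic 'cmd_vel' (type mismatch?): {e}");
            None
        }
    }
}
```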
Mixed-Language Multi-Process
The most common multi-process pattern: Rust for control loops, Python for ML inference.
Rust sensor node (sensor.rs):
// simplified
use horus::prelude::*;
struct CameraNode {
pub_img: Topic<Image>,
}
impl CameraNode {
fn new() -> Result<Self> {
Ok(Self { pub_img: Topic::new("camera.rgb")? })
}
}
impl Node for CameraNode {
fn name(&self) -> &str { "camera" }
fn tick(&mut self) {
let mut img = Image::new(640, 480, "rgb8");
// ... capture from hardware ...
self.pub_img.send(img);
}
}
fn main() -> Result<()> {
let mut sched = Scheduler::new().tick_rate(30_u64.hz());
sched.add(CameraNode::new()?).order(0).build()?;
sched.run()
}
Run both:
# Terminal 1 (Rust, 30 FPS camera)
horus run sensor.rs
# Terminal 2 (Python, ML inference)
horus run detector.py
The Image flows through a pool-backed shared-memory transport — the Python node gets a zero-copy view of the pixels the Rust node wrote. No serialization, no copying.
Debugging Multi-Process Systems
Identify which process owns what
horus topic list --verbose
# Shows publisher/subscriber PIDs per topic
horus node list
# Shows all running nodes across all processes with PID, rate, CPU, memory
Watch cross-process data flow
# Monitor messages from the Rust sensor in the Python process's terminal
horus topic echo camera.rgb
# Measure the actual publishing rate
horus topic hz camera.rgb
# Measure bandwidth
horus topic bw camera.rgb
Debug one process at a time
# Start the sensor normally
horus run sensor.rs
# Start the detector with verbose logging
RUST_LOG=debug horus run detector.py
Use the monitor for a system-wide view
horus monitor
# Web UI at http://localhost:3000 shows ALL nodes from ALL processes
# Topic graph view shows cross-process message flow
Common debugging workflow
1. horus topic list — verify both processes see the same topics
2. horus topic hz <topic> — verify the publisher is sending at the expected rate
3. horus topic echo <topic> — verify message content is correct
4. horus node list — verify both nodes are Running (not Error or Crashed)
5. horus bb --anomalies — check for deadline misses or errors
Common Errors
| Error | Cause | Fix |
|---|---|---|
| Topics not visible across processes | Different SHM namespaces | Set HORUS_NAMESPACE=shared in both terminals, or use horus launch |
| Type mismatch on topic join | Process A uses CmdVel, Process B uses a different type for the same name | Ensure both processes use the exact same message type |
| Stale data after crash | SHM files persist after process death | Usually auto-cleaned on next horus run. Manual: horus clean --shm |
| Topic not found in CLI | CLI uses a different namespace than the running app | Run CLI in same terminal or set matching HORUS_NAMESPACE |
| High dropped_count | Subscriber process is slower than publisher | Increase subscriber rate, reduce publisher rate, or increase topic capacity |
| Permission denied on SHM | Different users running processes | Run both as the same user, or check /dev/shm permissions |
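For the namespace rows above, a quick sketch of setting a matching namespace in both terminals (assuming HORUS_NAMESPACE is read from the environment, as the table implies):

```
# Terminal 1
HORUS_NAMESPACE=shared horus run sensor.rs
# Terminal 2
HORUS_NAMESPACE=shared horus run controller.rs
```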
Design Decisions
Why Auto-Discovery via Shared Memory Names
When you call Topic::new("imu") in two separate processes, both connect to the same shared memory region because the topic name deterministically maps to a shared memory path (managed by horus_sys). There is no registration step, no discovery protocol, and no configuration file listing topic endpoints. This works because shared memory is a kernel-level namespace — any process on the same machine that opens the same named region gets the same memory. Auto-discovery eliminates an entire class of misconfiguration bugs ("I forgot to register my topic") and means processes can start and stop in any order.
Why No Broker Process
Message brokers (like DDS in ROS2 or MQTT) add a routing hop between every publisher and subscriber. Even with optimizations, this hop adds latency and creates a single point of failure — if the broker crashes, all communication stops. HORUS uses direct shared memory: publishers write to a ring buffer, subscribers read from it, and no intermediary process routes messages. This gives sub-microsecond latency and means there is no central process that can fail. The cost is that HORUS topics only work on a single machine (cross-machine communication requires an explicit bridge).
Why Transparent Same-Process vs Cross-Process Selection
HORUS automatically detects whether a publisher and subscriber are in the same process or different processes and selects the fastest transport: direct pointer handoff (~3ns) for same-thread, lock-free ring buffer (~18ns) for same-process, or shared memory (~50ns) for cross-process. Users write the same Topic::new("name") call regardless. This means code that works in a single-process prototype deploys to a multi-process production system with zero changes. The transport upgrades and downgrades transparently as participants join and leave — splitting a monolith into separate processes does not require code changes.
Trade-offs
| Area | Benefit | Cost |
|---|---|---|
| Auto-discovery | Zero configuration; processes connect by topic name alone; start/stop in any order | No explicit topology — harder to audit which processes are connected without horus topic list |
| No broker | Sub-microsecond latency; no single point of failure; no extra process to deploy | Single-machine only — cross-machine communication requires an explicit network bridge |
| Transparent transport | Same code works in single-process and multi-process; zero migration cost | Users cannot force a specific transport backend; automatic selection may surprise during debugging |
| Process isolation | One crash does not take down the system; independent restart and upgrade | Higher baseline latency (~50ns cross-process vs ~3ns same-thread); shared memory files persist after exit and need cleanup |
| Shared memory persistence | Fast reconnection — no handshake needed when a process restarts | Stale files from crashes are auto-cleaned on next startup; horus clean --shm for manual override |
| Independent schedulers | Each process can run at its own tick rate with its own ordering | No cross-process deterministic ordering — sensor-to-actuator chains across processes depend on timing, not scheduler order |
See Also
- Shared Memory — SHM architecture, ring buffers, platform differences
- Topics (Full Reference) — topic API and backend details
- Message Performance — POD types and zero-copy transport
- Multi-Language — Rust + Python interop patterns
- CLI Reference — horus launch, horus topic list, horus clean