# Multi-Process Architecture (Python)
HORUS topics work transparently across process boundaries. Two Python processes that use the same topic name connect to the same shared memory region automatically. No broker, no configuration file, no registration step.
```python
# process_a.py — publisher
import horus

def publish_tick(node):
    node.send("sensor.temp", {"celsius": 22.5, "location": "motor_1"})

node = horus.Node("temp_pub", pubs=["sensor.temp"], tick=publish_tick, rate=10)
horus.run(node)
```

```python
# process_b.py — subscriber (separate terminal)
import horus

def monitor_tick(node):
    if node.has_msg("sensor.temp"):
        data = node.recv("sensor.temp")
        print(f"Temperature: {data['celsius']}C at {data['location']}")

node = horus.Node("temp_mon", subs=["sensor.temp"], tick=monitor_tick, rate=10)
horus.run(node)
```
Run each in its own terminal. They discover each other through shared memory -- no coordination needed.
## How Auto-Discovery Works
When you create a topic (via node.send(), node.recv(), or horus.Topic()), HORUS creates or opens a shared memory region keyed by the topic name. Any process on the same machine that uses the same topic name connects to the same underlying ring buffer.
```
   Process A                      Process B
┌──────────────┐              ┌──────────────┐
│ Node("pub")  │              │ Node("sub")  │
│              │              │              │
│ send("imu")──┼──┐        ┌──┼──recv("imu") │
└──────────────┘  │        │  └──────────────┘
                  ▼        ▲
            ┌─────────────────────┐
            │   Shared Memory     │
            │ Ring Buffer: "imu"  │
            │  (kernel-managed)   │
            └─────────────────────┘
```
There is no discovery protocol, no handshake, and no central broker. The shared memory namespace is the discovery mechanism. Processes can start in any order -- a subscriber that starts before its publisher simply sees no messages until the publisher connects.
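The name-is-the-discovery-mechanism idea can be mimicked with Python's stdlib `multiprocessing.shared_memory` — an analogy only, not HORUS internals: two handles opened with the same name see the same bytes, with no handshake in between.

```python
from multiprocessing import shared_memory

# "Publisher": create a region keyed by name — the name is the rendezvous point
pub = shared_memory.SharedMemory(name="demo_imu", create=True, size=64)
pub.buf[:5] = b"hello"

# "Subscriber": open the same name — no handshake, no broker, no registration
sub = shared_memory.SharedMemory(name="demo_imu")
received = bytes(sub.buf[:5])
print(received)  # b'hello'

sub.close()
pub.close()
pub.unlink()  # remove the region once everyone is done
```

In a real two-process setup the create/open calls happen in different programs; the kernel namespace does the matching.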
## Running Multiple Python Processes

### Separate Terminals

The simplest approach. Run each node file in its own terminal:

```shell
# Terminal 1
horus run sensor.py

# Terminal 2
horus run controller.py

# Terminal 3
horus run logger.py
```

### Using `horus run` with Multiple Files

```shell
# Launches both as separate processes, manages their lifecycle
horus run sensor.py controller.py

# Ctrl+C sends SIGTERM to all processes
```
### Using `horus launch` (Production)

Declare your multi-process layout in a YAML launch file:

```yaml
# launch.yaml
nodes:
  - name: sensor
    cmd: horus run sensor.py
  - name: controller
    cmd: horus run controller.py
  - name: logger
    cmd: horus run logger.py
```

```shell
horus launch launch.yaml
```
### Using subprocess (Programmatic)

```python
import subprocess

import horus

# Start a companion process from within Python
proc = subprocess.Popen(["horus", "run", "sensor.py"])

# Your main process continues as a normal HORUS node
def controller_tick(node):
    if node.has_msg("sensor.data"):
        data = node.recv("sensor.data")
        node.send("cmd_vel", horus.CmdVel(linear=data["speed"], angular=0.0))

node = horus.Node("controller", subs=["sensor.data"], pubs=[horus.CmdVel], tick=controller_tick, rate=50)
horus.run(node)
```
### Using systemd (Deployment)

For production deployments, run each process as a systemd service:

```ini
# /etc/systemd/system/horus-sensor.service
[Unit]
Description=HORUS Sensor Node
After=network.target

[Service]
ExecStart=/usr/local/bin/horus run /opt/robot/sensor.py
Restart=always
RestartSec=1
Environment=HORUS_NAMESPACE=production

[Install]
WantedBy=multi-user.target
```

```shell
sudo systemctl start horus-sensor
sudo systemctl start horus-controller
```

Set the same `HORUS_NAMESPACE` across all services so they share the same SHM namespace.
## Topic Sharing Across Process Boundaries
Topics in HORUS are identified by name. Any process that uses the same topic name and the same SHM namespace connects to the same ring buffer. This applies to both dict topics and typed topics.
### Dict Topics (GenericMessage)

```python
# Process A
node.send("status", {"battery": 85.0, "mode": "autonomous"})
```

```python
# Process B (separate process, same topic name)
if node.has_msg("status"):
    data = node.recv("status")  # {"battery": 85.0, "mode": "autonomous"}
```

Dict topics serialize via MessagePack. Both processes see the same data with no configuration.
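The round trip can be seen with the third-party `msgpack` package. HORUS's internal framing may differ, but the data model is the same: a dict goes in, identical values come out on the other side, with no schema.

```python
import msgpack  # third-party: pip install msgpack

status = {"battery": 85.0, "mode": "autonomous"}

packed = msgpack.packb(status)      # compact binary payload
restored = msgpack.unpackb(packed)  # identical dict on the receiving side

assert restored == status
```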
### Typed Topics

```python
# Process A
node.send("cmd_vel", horus.CmdVel(linear=1.0, angular=0.5))
```

```python
# Process B
if node.has_msg("cmd_vel"):
    cmd = node.recv("cmd_vel")  # CmdVel object, zero-copy
    print(cmd.linear)           # 1.0
```

Typed topics use zero-copy shared memory. Both processes must use the same message type for the same topic name. A type mismatch raises an error at connection time.
### Pool-Backed Types (Image, PointCloud, Tensor)

Large data transfers use pool-backed shared memory. Only a small descriptor (64-168 bytes) travels through the ring buffer; the actual data stays in the shared memory pool.

```python
# Process A — camera capture
import horus

def camera_tick(node):
    pixels = capture_frame()              # returns a numpy array (your camera driver)
    img = horus.Image.from_numpy(pixels)  # copies once into the SHM pool
    node.send("camera.rgb", img)          # sends 64B descriptor, not 1.4MB
```

```python
# Process B — ML inference
def detect_tick(node):
    if node.has_msg("camera.rgb"):
        img = node.recv("camera.rgb")
        frame = img.to_numpy()  # zero-copy view into the SHM pool
        # Run inference on frame...
```

A 1080p RGB image (1920x1080x3 = ~6MB) transfers in microseconds because only the descriptor crosses the ring buffer.
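The savings are simple arithmetic, using the 64-byte descriptor size quoted above:

```python
# Payload vs. descriptor for one 1080p RGB frame
width, height, channels = 1920, 1080, 3
payload_bytes = width * height * channels  # raw pixel data, stays in the SHM pool
descriptor_bytes = 64                      # what actually crosses the ring buffer

print(payload_bytes)                       # 6220800 (~6 MB)
print(payload_bytes // descriptor_bytes)   # 97200x less data through the ring
```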
## Namespaces

By default, each terminal session gets its own SHM namespace (derived from session ID and user ID). This prevents accidental cross-talk between unrelated projects.

To share topics across separate terminals, set the same namespace:

```shell
# Terminal 1
HORUS_NAMESPACE=myrobot horus run sensor.py

# Terminal 2
HORUS_NAMESPACE=myrobot horus run controller.py
```

`horus launch` automatically sets a shared namespace for all processes in the launch file.

| Scenario | Namespace behavior |
|---|---|
| Same terminal | Auto-shared (same session ID) |
| `horus run a.py b.py` | Auto-shared (same invocation) |
| `horus launch` | Auto-shared (launch sets namespace) |
| Separate terminals | Separate by default; set `HORUS_NAMESPACE` to share |
| systemd services | Must set `HORUS_NAMESPACE` explicitly |
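The exact derivation of the default namespace is internal to HORUS. The sketch below is a hypothetical reconstruction — `default_namespace` is an invented name — showing the shape of the behavior described above: stable within a terminal session, distinct across sessions and users, overridable via `HORUS_NAMESPACE`.

```python
import os

def default_namespace():
    """Hypothetical sketch of per-session namespace derivation.

    HORUS's real derivation may differ; the point is that the default is
    stable within one terminal session and distinct across sessions/users.
    (os.getsid is Unix-only.)
    """
    override = os.environ.get("HORUS_NAMESPACE")
    if override:
        return override
    sid = os.getsid(0)  # session ID of this process
    uid = os.getuid()
    return f"horus_{uid}_{sid}"

print(default_namespace())
```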
## Mixed-Language Systems
The most common multi-process pattern pairs a Python process with processes written in other languages. The shared memory transport is language-agnostic -- any process that uses HORUS topics connects to the same ring buffers.
### Example: Motor Controller + ML Inference

A compiled motor controller process runs the real-time control loop at 1kHz. A Python process runs ML inference at 30Hz. Both communicate through shared memory.

```python
# ml_detector.py — Python ML inference process
import horus

def detect_tick(node):
    if node.has_msg("camera.rgb"):
        img = node.recv("camera.rgb")
        frame = img.to_numpy()  # zero-copy from SHM

        # Run your ML model
        detections = run_yolo(frame)

        # Publish results for the controller
        if detections:
            closest = min(detections, key=lambda d: d["distance"])
            node.send("obstacle", {
                "distance": closest["distance"],
                "angle": closest["angle"],
                "class": closest["label"],
            })

detector = horus.Node(
    name="yolo",
    subs=[horus.Image],
    pubs=["obstacle"],
    tick=detect_tick,
    rate=30,
)
horus.run(detector)
```

```shell
# Terminal 1 — compiled motor controller (real-time, 1kHz)
horus run motor_controller

# Terminal 2 — Python ML inference (best-effort, 30Hz)
horus run ml_detector.py
```
The `Image` flows from the compiled process through shared memory. The Python process gets a zero-copy view -- no serialization, no copying megabytes of pixel data.
### Binary Compatibility

Typed messages are binary-compatible across languages. A `CmdVel` published by a compiled process is received as a `horus.CmdVel` in Python with identical field values and memory layout. This works because all typed messages use the same POD (Plain Old Data) layout in shared memory.

```python
# Python receives typed messages from any language
cmd = node.recv("cmd_vel")   # horus.CmdVel — same binary layout
imu = node.recv("imu")       # horus.Imu — same binary layout
scan = node.recv("scan")     # horus.LaserScan — same binary layout
```

No protobuf, no JSON, no serialization step. The message bytes in shared memory are read directly.
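The stdlib `struct` module can illustrate why POD layout makes this work. The `<dd` layout below — two little-endian float64s for `linear`/`angular` — is an assumption for demonstration, not the documented `CmdVel` layout: any language that agrees on the byte layout reads the same values with no serialization step.

```python
import struct

# Assumed layout for illustration: CmdVel as two little-endian float64s.
# (The real HORUS layout may include headers or padding.)
CMDVEL = struct.Struct("<dd")

# A "compiled process" writes raw bytes into shared memory...
raw = CMDVEL.pack(1.0, 0.5)

# ...and Python reads the same 16 bytes back directly.
linear, angular = CMDVEL.unpack(raw)
print(linear, angular)  # 1.0 0.5
```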
## When to Use Multi-Process
| Factor | Single Process | Multi-Process |
|---|---|---|
| Latency | ~500ns (intra-process) | ~1-5us (cross-process) |
| GIL | All nodes share one GIL | Each process has its own GIL |
| Fault isolation | One crash takes down everything | A crash is contained to one process |
| Languages | Python only | Mix Python + compiled languages |
| Restart | Must restart everything | Restart one process independently |
| Debugging | Single debugger session | Attach debugger to one process |
| Deployment | One script to run | Multiple scripts/services |
| Memory | Shared address space | Separate address spaces |
### Use single-process when
- All nodes are Python
- You need deterministic ordering between nodes (sensor, then controller, then actuator -- in that order)
- Latency at the sub-microsecond level matters
- Simpler deployment is preferred
### Use multi-process when
- GIL bypass -- CPU-bound Python nodes (ML inference, image processing) block the GIL. Separate processes give each node its own GIL
- Fault isolation -- a segfault in one process (e.g., a buggy C extension) does not crash the rest
- Mixed languages -- pair Python ML with compiled real-time control
- Independent restart -- update one node without stopping others
- Independent scaling -- run the heavy ML inference process on a GPU machine, sensors on the robot
## The GIL Problem
Python's Global Interpreter Lock means only one thread executes Python bytecode at a time. In a single HORUS process, if one node does heavy computation (ML inference, image processing), it blocks all other nodes until it finishes.
Multi-process solves this completely. Each process has its own Python interpreter and its own GIL:
```python
# PROBLEM: single process — the GIL serializes everything.
# sensor_tick waits while inference_tick holds the GIL.

# SOLUTION: separate processes.
# sensor.py    — runs at 100Hz, unblocked
# inference.py — runs at 10Hz, heavy computation, own GIL
```
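The effect is easy to reproduce with the stdlib alone — this is a generic illustration, not HORUS code. The same CPU-bound function runs in a thread pool (workers take turns holding the one GIL) and then in a process pool (each worker has its own interpreter and GIL):

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def busy(n):
    """CPU-bound work: holds the GIL for the whole loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    n = 2_000_000
    for pool_cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        start = time.perf_counter()
        with pool_cls(max_workers=4) as pool:
            results = list(pool.map(busy, [n] * 4))
        print(pool_cls.__name__, f"{time.perf_counter() - start:.2f}s")
    # Threads: roughly serial (one GIL). Processes: parallel (one GIL each).
```

On a multi-core machine the process pool finishes the four jobs in roughly the time the thread pool needs for one or two; exact numbers depend on the host.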
## What Happens When a Process Crashes
When a Python process dies (exception, SIGKILL, OOM):
- SHM files persist -- the kernel closes file descriptors but the shared memory region stays
- Other processes continue -- subscribers see no new messages from the dead publisher, but they do not crash
- Automatic reconnection -- when the crashed process restarts and recreates its topics, other processes see fresh data again
- Automatic cleanup -- the next `horus` CLI command or `horus.run()` call auto-cleans stale namespaces

```python
# Process A crashes mid-publish
# Process B keeps running, ticking normally
# Process A restarts
# Process B sees fresh data from Process A — no reconfiguration
```
## Debugging Multi-Process Systems
### See all topics from all processes

```shell
horus topic list
```

This shows every topic in the current namespace, regardless of which process created it. Use `--verbose` to see publisher/subscriber PIDs.
### Monitor cross-process data flow

```shell
# Watch messages on a topic (from any process)
horus topic echo sensor.temp

# Check actual publishing rate
horus topic hz sensor.temp

# Measure bandwidth
horus topic bw camera.rgb
```
### See all running nodes

```shell
horus node list
# Shows all nodes across all processes with PID, rate, CPU usage
```
### Debug one process at a time

```shell
# Start sensor normally
horus run sensor.py

# Start controller with verbose logging
HORUS_LOG=debug horus run controller.py
```
### System-wide view

```shell
horus monitor
# Web dashboard at http://localhost:3000 showing all nodes from all processes
```
### Common debugging workflow

1. `horus topic list` -- verify all processes see the expected topics
2. `horus topic hz sensor.data` -- verify the publisher sends at the expected rate
3. `horus topic echo sensor.data` -- verify message content is correct
4. `horus node list` -- verify all nodes are `Running` (not `Error` or `Crashed`)
## Complete Example: Three-Process Pipeline
A camera process captures frames. An ML process detects objects. A controller process drives the robot.
`camera.py` -- captures at 30 FPS:

```python
import horus
import numpy as np

def camera_tick(node):
    # Simulate camera capture
    pixels = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
    img = horus.Image.from_numpy(pixels)
    node.send("camera.rgb", img)

cam = horus.Node("camera", pubs=[horus.Image], tick=camera_tick, rate=30)
horus.run(cam)
```
`detector.py` -- ML inference at 10 FPS:

```python
import horus

def detect_tick(node):
    if node.has_msg("camera.rgb"):
        img = node.recv("camera.rgb")
        frame = img.to_numpy()  # zero-copy
        # Run your detection model (my_model is a placeholder)
        detections = my_model.predict(frame)
        node.send("detections", {
            "count": len(detections),
            "closest_distance": min(d["dist"] for d in detections) if detections else 999.0,
        })

det = horus.Node("detector", subs=[horus.Image], pubs=["detections"], tick=detect_tick, rate=10)
horus.run(det)
```
`controller.py` -- drives motors at 50Hz:

```python
import horus

def control_tick(node):
    if node.has_msg("detections"):
        det = node.recv("detections")
        if det["closest_distance"] < 1.0:
            # Obstacle close — stop
            node.send("cmd_vel", horus.CmdVel(linear=0.0, angular=0.0))
        else:
            # Clear path — drive forward
            node.send("cmd_vel", horus.CmdVel(linear=0.5, angular=0.0))

ctrl = horus.Node("controller", subs=["detections"], pubs=[horus.CmdVel], tick=control_tick, rate=50)
horus.run(ctrl)
```
Run all three:

```shell
# Option A: separate terminals (set the same namespace)
HORUS_NAMESPACE=robot horus run camera.py
HORUS_NAMESPACE=robot horus run detector.py
HORUS_NAMESPACE=robot horus run controller.py

# Option B: launch file
horus launch robot_launch.yaml

# Option C: single command
horus run camera.py detector.py controller.py
```
## Cleaning Up

Shared memory files persist after processes exit. HORUS auto-cleans stale regions on every `horus` CLI command and every `horus.run()` call. For manual cleanup:

```shell
horus clean --shm
```
## Common Errors

| Error | Cause | Fix |
|---|---|---|
| Topics not visible across terminals | Different SHM namespaces | Set `HORUS_NAMESPACE=shared` in both terminals |
| Type mismatch on topic | Process A uses `CmdVel`, Process B uses a different type for the same name | Ensure both processes use the same message type for the same topic name |
| Stale data after crash | SHM files persist after process death | Auto-cleaned on next `horus run`; manual: `horus clean --shm` |
| High message drops | Subscriber is slower than publisher | Increase subscriber rate or decrease publisher rate |
| Permission denied on SHM | Different users running processes | Run both as the same user |
## Design Decisions
**Why auto-discovery via shared memory names instead of a configuration file?** When you use a topic name, HORUS maps that name deterministically to a shared memory region. Any process that uses the same name connects to the same region. There is no registration step, no discovery protocol, and no configuration listing endpoints. This eliminates an entire class of misconfiguration bugs ("I forgot to register my topic") and means processes can start and stop in any order. The cost is that topics only work on a single machine -- cross-machine communication requires an explicit network bridge.
**Why no message broker?** Brokers (like DDS in ROS2 or MQTT) add a routing hop between every publisher and subscriber. Even optimized brokers add latency and create a single point of failure. HORUS uses direct shared memory: publishers write to a ring buffer, subscribers read from it. This gives microsecond-level latency and means there is no central process that can crash and take down communication. The tradeoff is single-machine scope.
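To make "publishers write to a ring buffer, subscribers read from it" concrete, here is a minimal single-producer/single-consumer ring over a named shared memory region. It is a toy sketch — fixed-size slots, no atomics, no overwrite policy — not HORUS internals; the slot count, slot size, and region name are invented for the example.

```python
from multiprocessing import shared_memory

# Toy SPSC ring: head/tail counters live in the first 8 bytes of the region.
SLOTS, SLOT_SIZE, HDR = 8, 16, 8

shm = shared_memory.SharedMemory(name="ring_demo", create=True, size=HDR + SLOTS * SLOT_SIZE)
buf = shm.buf

def _get(i):
    return int.from_bytes(buf[i:i + 4], "little")

def _set(i, v):
    buf[i:i + 4] = v.to_bytes(4, "little")

def publish(msg):
    head = _get(0)
    slot = HDR + (head % SLOTS) * SLOT_SIZE
    buf[slot:slot + len(msg)] = msg
    _set(0, head + 1)  # advance head last, so a reader never sees a half-written slot

def consume():
    head, tail = _get(0), _get(4)
    if tail == head:
        return None  # nothing new — the subscriber just ticks again
    slot = HDR + (tail % SLOTS) * SLOT_SIZE
    msg = bytes(buf[slot:slot + SLOT_SIZE]).rstrip(b"\x00")
    _set(4, tail + 1)
    return msg

publish(b"tick-1")
first = consume()
print(first)  # b'tick-1'

shm.close()
shm.unlink()
```

In a real two-process setup the publisher and consumer run in different programs that open the same region by name; no broker ever touches the message.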
**Why separate processes instead of threads for GIL bypass?** Python threads share the GIL, so CPU-bound work in one thread blocks all others. `multiprocessing` uses fork/spawn, which requires pickling data across process boundaries. HORUS processes use shared memory topics -- the same API as single-process -- so splitting into multiple processes requires zero code changes. Each process gets its own GIL, its own memory space, and fault isolation.
**Why kernel-managed namespaces instead of a registry?** Shared memory is a kernel-level namespace. Any process on the same machine that opens the same named region gets the same memory. This is inherently race-free and requires no coordination daemon. A registry-based approach would need a long-running process to maintain state, which adds complexity and a failure point.
## Trade-offs
| Area | Benefit | Cost |
|---|---|---|
| Auto-discovery | Zero configuration; start/stop in any order | No explicit topology -- use horus topic list to audit connections |
| No broker | Microsecond latency; no single point of failure | Single-machine only; cross-machine needs a network bridge |
| Process isolation | One crash does not take down the system; independent restart | Higher latency (~1-5us cross-process vs ~500ns same-process) |
| Separate GILs | CPU-bound nodes do not block each other | Higher memory usage; one Python interpreter per process |
| Shared memory persistence | Fast reconnection; no handshake on restart | Stale files persist after crashes; auto-cleaned on next startup |
| No cross-process ordering | Each process runs at its own rate independently | Sensor-to-actuator chains across processes depend on timing, not scheduler order |
## See Also
- Python Bindings -- Core Python API reference
- Shared Memory -- SHM architecture and ring buffers
- Message Design (Python) -- Choosing between dict and typed topics
- Async Nodes -- Non-blocking I/O nodes