Scheduler Configuration
You need to configure how your robot's nodes execute: which ones get real-time threads, how to handle deadline misses, and what order nodes tick in. This guide covers the full scheduler and node builder API.
When To Use This
- You are moving beyond the defaults and need per-node timing, priority, or failure handling
- You need to assign execution classes (RT, Compute, Event, AsyncIo) to different workloads
- You are configuring a production system with watchdogs, blackbox, or RT requirements
Use Scheduler Concepts instead if you need to understand how the scheduler works before configuring it.
Prerequisites
- Familiarity with Nodes and Scheduler
- Understanding of Execution Classes
Creating a Scheduler
Every scheduler starts with Scheduler::new(). From there you can optionally set global parameters with builder methods before adding nodes:
// simplified
use horus::prelude::*;
fn main() -> Result<()> {
    let mut scheduler = Scheduler::new()
        .tick_rate(1000_u64.hz()); // Global tick rate (default: 100 Hz)
    // ... add nodes ...
    scheduler.run()?;
    Ok(())
}
Builder Methods
| Method | Description | Default |
|---|---|---|
| .tick_rate(freq) | Global scheduler tick rate | 100 Hz |
| .deterministic(bool) | Deterministic mode: SimClock, dependency ordering, seeded RNG. See Deterministic Mode | false |
| .watchdog(Duration) | Frozen node detection; auto-creates a safety monitor | disabled |
| .blackbox(size_mb) | BlackBox flight recorder (n MB ring buffer) | disabled |
| .max_deadline_misses(n) | Emergency stop after n deadline misses | 100 |
| .require_rt() | Hard real-time; panics without RT capabilities | n/a |
| .prefer_rt() | Request RT features (degrades gracefully) | n/a |
| .cores(&[usize]) | Pin scheduler threads to specific CPU cores | all cores |
| .verbose(bool) | Enable/disable non-emergency logging | true |
| .with_recording() | Enable record/replay | n/a |
| .telemetry(endpoint) | Export telemetry to UDP/file endpoint | disabled |
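To make the .max_deadline_misses(n) row concrete, here is a minimal standalone sketch of a miss counter. It is not HORUS code: MissCounter and record_miss are hypothetical names, and the sketch assumes misses are counted cumulatively rather than consecutively, which the table does not specify.

```rust
/// Hypothetical counter illustrating a `max_deadline_misses(n)` style limit.
/// Assumption: misses accumulate over the whole run and are never reset.
pub struct MissCounter {
    limit: u32,
    misses: u32,
}

impl MissCounter {
    pub fn new(limit: u32) -> Self {
        Self { limit, misses: 0 }
    }

    /// Records one deadline miss. Returns `true` once the limit is reached,
    /// which is the point where an emergency stop would be requested.
    pub fn record_miss(&mut self) -> bool {
        self.misses = self.misses.saturating_add(1);
        self.misses >= self.limit
    }

    /// Total misses recorded so far.
    pub fn misses(&self) -> u32 {
        self.misses
    }
}
```

With a limit of 3, the first two calls to record_miss return false and the third returns true.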
Adding Nodes
Add nodes with scheduler.add(n), then chain configuration calls, and finalize with .build()?:
// simplified
use horus::prelude::*;
fn main() -> Result<()> {
    let mut scheduler = Scheduler::new()
        .tick_rate(1000_u64.hz());
    // Real-time motor control: runs first every tick
    scheduler.add(MotorController::new("arm"))
        .order(0)
        .rate(1000_u64.hz())
        .on_miss(Miss::SafeMode)
        .build()?;
    // Sensor node: high priority, custom rate
    scheduler.add(LidarDriver::new("/dev/lidar0"))
        .order(10)
        .rate(500_u64.hz())
        .build()?;
    // Compute-heavy planning: runs on a worker thread
    scheduler.add(PathPlanner::new())
        .order(50)
        .compute()
        .build()?;
    // Event-driven node: wakes only when the topic has new data
    scheduler.add(CollisionChecker::new())
        .on("lidar.points")
        .build()?;
    // Async I/O: network or disk, never blocks the real-time loop
    scheduler.add(TelemetryUploader::new())
        .order(200)
        .async_io()
        .rate(10_u64.hz())
        .build()?;
    scheduler.run()?;
    Ok(())
}
Execution Classes
Every node belongs to exactly one execution class. Set it in the builder chain:
| Method | Class | Description |
|---|---|---|
| .compute() | Compute | Offloaded to a worker thread pool. Use for planning, SLAM, or ML inference. |
| .on(topic) | Event-Driven | Wakes only when the named topic receives new data. |
| .async_io() | Async I/O | Runs on an async executor. Use for network, disk, or cloud calls. |
If no execution class is specified, the node defaults to BestEffort. A node is automatically promoted to the RT class when you set .rate(Frequency) (which auto-derives budget at 80% and deadline at 95% of the period).
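The class selection rule in the paragraph above can be sketched as a small pure function. This is an illustration only: ExecClass and resolve_class are hypothetical names, and the rule that an explicitly chosen class takes precedence over rate-based promotion is an assumption inferred from this guide, not confirmed HORUS behavior.

```rust
/// The five execution classes named in this guide.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecClass {
    BestEffort,
    Rt,
    Compute,
    Event,
    AsyncIo,
}

/// Hypothetical resolution rule:
/// 1. an explicitly chosen class wins (assumption),
/// 2. otherwise a node with a rate is promoted to RT,
/// 3. otherwise the node stays BestEffort.
pub fn resolve_class(explicit: Option<ExecClass>, has_rate: bool) -> ExecClass {
    match (explicit, has_rate) {
        (Some(class), _) => class,
        (None, true) => ExecClass::Rt,
        (None, false) => ExecClass::BestEffort,
    }
}
```

Under this rule, a node configured with .async_io() and .rate() stays in the Async I/O class, and only a node with no explicit class is promoted by .rate().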
When to Use Each Class
- RT (auto-detected): Motor controllers, safety monitors, sensor fusion, anything that must run every tick with bounded latency. Triggered by .rate(Frequency) on a BestEffort node.
- .compute(): Path planning, point cloud processing, ML inference. These can take longer than a single tick without blocking RT nodes.
- .on(topic): Collision detection, event handlers, reactive behaviors. Only runs when there is new data, saving CPU when idle.
- .async_io(): Telemetry upload, log shipping, cloud API calls. Never blocks any real-time or compute work.
What each class means for your robot:
- RT — Your motor controller sends PWM commands every millisecond. Missing one cycle causes the motor to overshoot. This node needs a dedicated RT thread.
- Compute — Your SLAM algorithm takes 50ms to process a lidar scan. If it runs on the RT thread, the motor controller misses 50 deadlines. Compute nodes run on a separate thread pool.
- Event — Your collision detector only needs to run when new lidar data arrives, not every cycle. Event nodes sleep until their topic gets a message.
- AsyncIo — Your telemetry node uploads data to a cloud server. Network calls can take seconds. AsyncIo nodes run on a tokio thread pool so they never block anything.
- BestEffort — Your debug logger. Runs on the main thread when there's time, no timing guarantees.
Per-Node Configuration
Ordering and Timing
| Method | Description |
|---|---|
| .order(n) | Tiebreaker for independent nodes (lower = runs first). Optional when nodes have topic dependencies; the dependency graph handles ordering automatically |
| .rate(Frequency) | Node-specific tick rate; auto-derives budget (80%) and deadline (95%), auto-marks as RT |
| .budget(Duration) | Override auto-derived tick budget (max execution time) |
| .deadline(Duration) | Override auto-derived absolute deadline |
| .on_miss(Miss) | What to do on deadline miss (Miss::Warn, Miss::Skip, Miss::SafeMode, Miss::Stop) |
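As a rough illustration of how a budget and a deadline relate, the sketch below classifies one tick from its measured execution time. TickOutcome and classify_tick are hypothetical names. The sketch assumes both limits are measured from the start of the tick and that exactly hitting a limit still counts as on time; HORUS may define these boundaries differently.

```rust
use std::time::Duration;

/// Outcome of a single tick relative to its timing limits.
#[derive(Debug, PartialEq, Eq)]
pub enum TickOutcome {
    /// Finished within the budget.
    OnTime,
    /// Exceeded the budget but still met the deadline.
    OverBudget,
    /// Exceeded the deadline; this is where an on-miss policy would apply.
    DeadlineMiss,
}

/// Classifies a tick. Assumption: `elapsed`, `budget`, and `deadline` are all
/// measured from the moment the tick started.
pub fn classify_tick(elapsed: Duration, budget: Duration, deadline: Duration) -> TickOutcome {
    if elapsed > deadline {
        TickOutcome::DeadlineMiss
    } else if elapsed > budget {
        TickOutcome::OverBudget
    } else {
        TickOutcome::OnTime
    }
}
```

For a 1 kHz node with the auto-derived 800 us budget and 950 us deadline, a 900 us tick is over budget but not yet a deadline miss.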
RT Configuration
| Method | Description |
|---|---|
| .priority(i32) | OS thread priority (SCHED_FIFO 1-99) for this node's RT thread |
| .core(usize) | Pin this node's RT thread to a specific CPU core |
| .watchdog(Duration) | Per-node watchdog timeout (overrides scheduler global) |
These are only meaningful for RT nodes (nodes with .rate()). They require Linux with CAP_SYS_NICE and degrade gracefully when RT capabilities are unavailable.
// simplified
// Safety-critical node: highest priority, pinned to core 2, tight watchdog
scheduler.add(EmergencyStop::new())
    .order(0)
    .rate(1000_u64.hz())
    .priority(99)
    .core(2)
    .watchdog(2_u64.ms())
    .on_miss(Miss::Stop)
    .build()?;
// Logger: long watchdog, async I/O
scheduler.add(Logger::new())
    .order(200)
    .async_io()
    .watchdog(5_u64.secs())
    .build()?;
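Two rules from the RT configuration table are easy to express in plain Rust: the per-node watchdog overrides the scheduler-wide one, and SCHED_FIFO priorities fall in the range 1-99. The helper names below are hypothetical and the functions only model the rules; they do not call into HORUS or the operating system.

```rust
use std::time::Duration;

/// Picks the watchdog timeout that applies to a node: the per-node value if
/// one is set, otherwise the scheduler-wide value, otherwise none.
pub fn effective_watchdog(
    global: Option<Duration>,
    per_node: Option<Duration>,
) -> Option<Duration> {
    per_node.or(global)
}

/// Checks that a priority lies in the SCHED_FIFO range 1-99 quoted in the table.
pub fn is_valid_fifo_priority(priority: i32) -> bool {
    (1..=99).contains(&priority)
}
```

In the example above, EmergencyStop would end up with a 2 ms watchdog regardless of any scheduler-wide setting, and a priority of 99 passes the range check while 0 or 100 would not.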
Failure Policy
| Method | Description |
|---|---|
| .failure_policy(policy) | Per-node failure handling (see Fault Tolerance) |
| .build() | Finalize and register the node (returns Result) |
Order Guidelines
- 0-9: Critical real-time (motor control, safety)
- 10-49: High priority (sensors, fast control loops)
- 50-99: Normal priority (processing, planning)
- 100-199: Low priority (logging, diagnostics)
- 200+: Background (telemetry, non-essential)
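If you want to lint or document the order values used across a project, the guideline ranges above map directly onto a match expression. order_tier is a hypothetical helper, and treating the order value as a u32 is an assumption; the guide does not state the parameter type.

```rust
/// Maps an order value to the guideline tier listed above.
/// Assumption: order values are non-negative integers.
pub fn order_tier(order: u32) -> &'static str {
    match order {
        0..=9 => "critical real-time",
        10..=49 => "high priority",
        50..=99 => "normal priority",
        100..=199 => "low priority",
        _ => "background",
    }
}
```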
Global Configuration with Composable Builders
Compose the builder methods you need for each deployment stage:
// simplified
use horus::prelude::*;
// Development: lightweight, profiling is always on
let mut scheduler = Scheduler::new()
    .tick_rate(1000_u64.hz());
// Production: watchdog + blackbox
let mut scheduler = Scheduler::new()
    .watchdog(500_u64.ms())
    .blackbox(64)
    .tick_rate(1000_u64.hz());
// Hard real-time: panics without RT capabilities
let mut scheduler = Scheduler::new()
    .require_rt()
    .tick_rate(1000_u64.hz());
// Safety-critical: require_rt + blackbox + strict deadline misses
let mut scheduler = Scheduler::new()
    .require_rt()
    .watchdog(500_u64.ms())
    .blackbox(64)
    .tick_rate(1000_u64.hz())
    .max_deadline_misses(3);
Execution Modes
HORUS automatically parallelizes independent nodes while maintaining causal ordering for dependent nodes. No manual configuration needed — the scheduler builds a dependency graph from topic send()/recv() metadata.
You do not need to choose an execution mode to get started. Scheduler::new() gives you automatic parallelism with dependency-based ordering: independent nodes run on multiple cores, and dependent nodes execute in the correct causal order. Just add your nodes and call run().
| Your Situation | Recommended Setup |
|---|---|
| Learning HORUS | Scheduler::new() — parallelism is automatic |
| Prototyping | Scheduler::new() |
| Need reproducible runs | Scheduler::new().deterministic(true) — same graph, sequential execution |
| Safety-critical (medical, aerospace) | Scheduler::new().require_rt().tick_rate(1000_u64.hz()), with .rate() set on each timing-critical node |
Default Mode (Auto-Parallel)
The scheduler builds a dependency graph from topic metadata and dispatches independent nodes to a thread pool via the ready-dispatch executor. Each node starts the instant its last dependency finishes — no barriers, no wasted time.
| Metric | Value |
|---|---|
| Independent nodes | Parallel (multi-core) |
| Dependent nodes | Causal order (publisher before subscriber) |
| Latency | Optimal — critical path only |
| .order() needed | No; optional tiebreaker |
use horus::prelude::*;
let mut scheduler = Scheduler::new();
// Independent sensors — run in parallel automatically
scheduler.add(lidar_node).build()?; // publishes "scan"
scheduler.add(camera_node).build()?; // publishes "image"
scheduler.add(imu_node).build()?; // publishes "imu"
// Fusion depends on all three — waits for all to complete
scheduler.add(fusion_node).build()?; // subscribes "scan", "image", "imu"
scheduler.run()?;
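The causal ordering described in this section (publishers before subscribers) can be sketched with Kahn's topological sort over topic names. This is a standalone illustration, not the HORUS executor: NodeSpec and causal_order are hypothetical, and the sketch produces one sequential order instead of dispatching ready nodes to a thread pool.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical node description: a name plus its published and subscribed topics.
pub struct NodeSpec {
    pub name: &'static str,
    pub publishes: Vec<&'static str>,
    pub subscribes: Vec<&'static str>,
}

/// Returns node names so that every publisher precedes its subscribers
/// (Kahn's algorithm). Returns `None` if the topic graph contains a cycle.
pub fn causal_order(nodes: &[NodeSpec]) -> Option<Vec<&'static str>> {
    // Map each topic to the indices of the nodes that publish it.
    let mut publishers: BTreeMap<&str, Vec<usize>> = BTreeMap::new();
    for (i, n) in nodes.iter().enumerate() {
        for t in &n.publishes {
            publishers.entry(*t).or_default().push(i);
        }
    }
    // Build deduplicated publisher -> subscriber edges.
    let mut edges: BTreeSet<(usize, usize)> = BTreeSet::new();
    for (sub, n) in nodes.iter().enumerate() {
        for t in &n.subscribes {
            if let Some(pubs) = publishers.get(t) {
                for &p in pubs {
                    if p != sub {
                        edges.insert((p, sub));
                    }
                }
            }
        }
    }
    // Count unmet dependencies per node.
    let mut indegree = vec![0usize; nodes.len()];
    for &(_, to) in &edges {
        indegree[to] += 1;
    }
    // Nodes with no dependencies are ready immediately.
    let mut ready: VecDeque<usize> =
        (0..nodes.len()).filter(|&i| indegree[i] == 0).collect();
    let mut out = Vec::with_capacity(nodes.len());
    while let Some(i) = ready.pop_front() {
        out.push(nodes[i].name);
        // Finishing node `i` may unblock its subscribers.
        for &(from, to) in &edges {
            if from == i {
                indegree[to] -= 1;
                if indegree[to] == 0 {
                    ready.push_back(to);
                }
            }
        }
    }
    if out.len() == nodes.len() { Some(out) } else { None }
}
```

For the sensor and fusion nodes above, the three sensors have no dependencies and come out first, and the fusion node follows once all three of its inputs are satisfied, even if it was added first.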
Deterministic Mode
Deterministic mode uses the same dependency graph but executes nodes sequentially within each step. SimClock advances between steps, so runs produce bit-identical results.
| Metric | Value |
|---|---|
| Independent nodes | Sequential (reproducible order) |
| Dependent nodes | Causal order (same as default) |
| Clock | Virtual SimClock |
| Best For | Simulation, testing, replay, CI |
use horus::prelude::*;
let mut scheduler = Scheduler::new()
    .deterministic(true)
    .tick_rate(100_u64.hz());
scheduler.add(sensor).build()?;
scheduler.add(controller).build()?;
scheduler.run()?;
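To show why a virtual clock makes runs reproducible, here is a minimal sketch of a clock that only moves when the scheduler steps it. VirtualClock is a hypothetical type written for this illustration; it is not the HORUS SimClock and makes no claim about its API.

```rust
use std::time::Duration;

/// Hypothetical virtual clock: time advances only when `advance` is called,
/// so two runs with the same number of steps observe identical timestamps.
pub struct VirtualClock {
    now: Duration,
    step: Duration,
}

impl VirtualClock {
    /// Creates a clock whose step is one tick period. `tick_hz` must be non-zero.
    pub fn new(tick_hz: u32) -> Self {
        Self {
            now: Duration::ZERO,
            step: Duration::from_secs(1) / tick_hz,
        }
    }

    /// Current virtual time since the start of the run.
    pub fn now(&self) -> Duration {
        self.now
    }

    /// Moves virtual time forward by exactly one tick period.
    pub fn advance(&mut self) {
        self.now += self.step;
    }
}
```

Because the clock ignores wall time, a slow CI machine and a fast workstation both see 30 ms of virtual time after three steps at 100 Hz.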
Mode Comparison
| Feature | Default (Auto-Parallel) | Deterministic |
|---|---|---|
| Independent nodes | Parallel (multi-core) | Sequential (reproducible) |
| Dependent nodes | Causal order | Causal order |
| Clock | Wall clock | SimClock |
| Certification ready | Yes (causal ordering guaranteed) | Yes (fully reproducible) |
DurationExt and Frequency
HORUS provides ergonomic extension methods for creating Duration and Frequency values, replacing verbose Duration::from_micros(200) calls:
Duration Helpers
// simplified
use horus::prelude::*;
// Microseconds
let budget = 200_u64.us(); // Duration::from_micros(200)
// Milliseconds
let deadline = 1_u64.ms(); // Duration::from_millis(1)
// Seconds
let timeout = 5_u64.secs(); // Duration::from_secs(5)
These helpers work on u64 literals via the DurationExt trait.
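For readers unfamiliar with extension traits, the pattern behind helpers like these can be sketched in a few lines of standard Rust. DurationSketchExt is a hypothetical stand-in written for this example; the real DurationExt trait in HORUS may cover more types or methods.

```rust
use std::time::Duration;

/// Hypothetical re-creation of a `DurationExt`-style extension trait.
pub trait DurationSketchExt {
    fn us(self) -> Duration;
    fn ms(self) -> Duration;
    fn secs(self) -> Duration;
}

/// Implementing the trait for `u64` is what makes `200_u64.us()` compile.
impl DurationSketchExt for u64 {
    fn us(self) -> Duration {
        Duration::from_micros(self)
    }
    fn ms(self) -> Duration {
        Duration::from_millis(self)
    }
    fn secs(self) -> Duration {
        Duration::from_secs(self)
    }
}
```

The trait must be in scope at the call site, which is why the examples in this guide import the prelude.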
Frequency Type
The .hz() method creates a Frequency that auto-derives timing parameters:
// simplified
use horus::prelude::*;
let freq = 100_u64.hz();
let hz = freq.value();                  // 100.0 Hz
let period = freq.period();             // 10ms (1/frequency)
let budget = freq.budget_default();     // 8ms (80% of period)
let deadline = freq.deadline_default(); // 9.5ms (95% of period)
Use Frequency with the node builder's .rate() method to auto-configure RT timing:
// simplified
// Auto-derives budget (80% period) and deadline (95% period)
// Also auto-marks the node as RT
scheduler.add(motor_ctrl)
    .order(0)
    .rate(500_u64.hz()) // period=2ms, budget=1.6ms, deadline=1.9ms
    .on_miss(Miss::Skip)
    .build()?;
| Method | Returns | Description |
|---|---|---|
| .us() | Duration | Microseconds |
| .ms() | Duration | Milliseconds |
| .secs() | Duration | Seconds |
| .hz() | Frequency | Frequency in Hz |
| freq.value() | f64 | Frequency in Hz |
| freq.period() | Duration | 1/frequency |
| freq.budget_default() | Duration | 80% of period |
| freq.deadline_default() | Duration | 95% of period |
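The 80% and 95% derivations in the table can be checked with a small standalone model. FrequencySketch is a hypothetical type that mirrors the method names above; it stores whole Hz and uses integer nanosecond arithmetic to keep results exact, which is a simplification and not necessarily how the HORUS Frequency type is implemented.

```rust
use std::time::Duration;

/// Hypothetical model of a frequency with auto-derived timing parameters.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FrequencySketch {
    hz: u64,
}

impl FrequencySketch {
    /// `hz` must be non-zero.
    pub fn new(hz: u64) -> Self {
        assert!(hz > 0, "frequency must be non-zero");
        Self { hz }
    }

    /// Frequency in Hz.
    pub fn value(&self) -> f64 {
        self.hz as f64
    }

    /// One period, computed as 1 second divided by the frequency.
    pub fn period(&self) -> Duration {
        Duration::from_nanos(1_000_000_000 / self.hz)
    }

    /// Default budget: 80% of the period.
    pub fn budget_default(&self) -> Duration {
        self.period() * 80 / 100
    }

    /// Default deadline: 95% of the period.
    pub fn deadline_default(&self) -> Duration {
        self.period() * 95 / 100
    }
}
```

At 500 Hz this yields a 2 ms period, a 1.6 ms budget, and a 1.9 ms deadline, matching the comment in the motor controller example.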
Design Decisions
Why auto-derive budget and deadline from .rate()?
Most developers think in terms of "this node runs at 1kHz" rather than "this node has an 800us budget and 950us deadline." Auto-derivation (budget = 80% period, deadline = 95% period) provides safe defaults without requiring timing expertise. Override with explicit .budget() and .deadline() when profiling shows different requirements.
Why composable builders instead of presets?
Early versions of HORUS had presets like deploy() and hard_rt(). These were removed because real systems need specific combinations of features. Composable builders let you pick exactly what you need: .watchdog(500_u64.ms()).blackbox(64) is clearer than a preset that might enable features you do not want.
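Why keep .order() when dependent nodes are ordered automatically?
Explicit ordering is predictable and debuggable. Automatic dependency ordering relies on topic metadata (publishers() and subscribers()) being available for every node involved, so it cannot see relationships that do not go through topics. .order() acts as a tiebreaker for independent nodes and as a fallback for those non-topic dependencies, which keeps you in control where the graph cannot infer an order.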
Trade-offs
| Gain | Cost |
|---|---|
| Per-node execution classes match workload to executor | More configuration decisions when adding nodes |
| Auto-derived timing from .rate() reduces configuration | 80%/95% defaults may not match your workload profile |
| Composable builders allow precise feature selection | No single-line "production mode" shortcut |
| Explicit .order() is predictable | Must be maintained manually as nodes are added |
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| Node runs as BestEffort when you expected RT | .rate() not set, or .compute() overrides it | Set .rate(freq) and do not combine with .compute() |
| "Cannot set SCHED_FIFO" at startup | Missing RT permissions | See RT Setup for limits.conf and setcap |
| Deadline misses on every tick | Budget too tight for actual computation time | Profile with horus monitor, then increase .budget() or lower .rate() |
| Node never ticks | .on(topic) set but no publisher on that topic | Verify another node publishes to the same topic name |
| .build() returns error | Conflicting configuration (e.g., .on() with .budget()) | Event nodes cannot have budgets. Remove timing constraints from event nodes. |
| Nodes execute in wrong order | Topic dependencies not detected (no send()/recv() calls) | Ensure nodes call send()/recv() during init() or first tick. Use .order() as fallback for non-topic dependencies |
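The conflicting-configuration row above describes a check that runs when a node is finalized. A minimal sketch of such a check follows. NodeConfigSketch, ConfigError, and validate are hypothetical names, and rejecting a deadline on an event node (in addition to a budget) is an assumption based on the "remove timing constraints" advice.

```rust
use std::time::Duration;

/// Hypothetical error type for a rejected node configuration.
#[derive(Debug, PartialEq, Eq)]
pub enum ConfigError {
    /// An event-driven node was given a budget or deadline.
    EventWithTiming,
}

/// Hypothetical subset of the settings collected by a node builder.
#[derive(Default)]
pub struct NodeConfigSketch {
    pub event_topic: Option<&'static str>,
    pub budget: Option<Duration>,
    pub deadline: Option<Duration>,
}

/// Rejects event-driven nodes that also carry timing constraints.
/// Assumption: both budget and deadline count as timing constraints.
pub fn validate(cfg: &NodeConfigSketch) -> Result<(), ConfigError> {
    let has_timing = cfg.budget.is_some() || cfg.deadline.is_some();
    if cfg.event_topic.is_some() && has_timing {
        return Err(ConfigError::EventWithTiming);
    }
    Ok(())
}
```

Returning a typed error instead of panicking matches the guide's use of .build()? so that a bad configuration surfaces at startup through the normal Result path.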
See Also
- Scheduler Concepts — How the scheduler works
- Execution Classes — The 5 execution classes and when to use each
- Safety Monitor — Watchdog and deadline enforcement
- Fault Tolerance — Failure policies and recovery
- RT Setup — Linux real-time kernel configuration