Scheduler Configuration

You need to configure how your robot's nodes execute: which ones get real-time threads, how to handle deadline misses, and what order nodes tick in. This guide covers the full scheduler and node builder API.

When To Use This

  • You are moving beyond the defaults and need per-node timing, priority, or failure handling
  • You need to assign execution classes (RT, Compute, Event, AsyncIo) to different workloads
  • You are configuring a production system with watchdogs, blackbox, or RT requirements

Use Scheduler Concepts instead if you need to understand how the scheduler works before configuring it.

Creating a Scheduler

Every scheduler starts with Scheduler::new(). From there you can optionally set global parameters with builder methods before adding nodes:

// simplified
use horus::prelude::*;

fn main() -> Result<()> {
    let mut scheduler = Scheduler::new()
        .tick_rate(1000_u64.hz());     // Global tick rate (default: 100 Hz)

    // ... add nodes ...

    scheduler.run()?;
    Ok(())
}

Builder Methods

| Method | Description | Default |
| --- | --- | --- |
| .tick_rate(freq) | Global scheduler tick rate | 100 Hz |
| .deterministic(bool) | Deterministic mode — SimClock, dependency ordering, seeded RNG. See Deterministic Mode | false |
| .watchdog(Duration) | Frozen node detection — auto-creates safety monitor | disabled |
| .blackbox(size_mb) | BlackBox flight recorder (n MB ring buffer) | disabled |
| .max_deadline_misses(n) | Emergency stop after n deadline misses | 100 |
| .require_rt() | Hard real-time — panics without RT capabilities | |
| .prefer_rt() | Request RT features (degrades gracefully) | |
| .cores(&[usize]) | Pin scheduler threads to specific CPU cores | all cores |
| .verbose(bool) | Enable/disable non-emergency logging | true |
| .with_recording() | Enable record/replay | |
| .telemetry(endpoint) | Export telemetry to UDP/file endpoint | disabled |

Adding Nodes

Add nodes with scheduler.add(node), chain configuration calls on the returned builder, and finalize with .build()?:

// simplified
use horus::prelude::*;

fn main() -> Result<()> {
    let mut scheduler = Scheduler::new()
        .tick_rate(1000_u64.hz());

    // Real-time motor control — runs first every tick
    scheduler.add(MotorController::new("arm"))
        .order(0)
        .rate(1000_u64.hz())
        .on_miss(Miss::SafeMode)
        .build()?;

    // Sensor node — high priority, custom rate
    scheduler.add(LidarDriver::new("/dev/lidar0"))
        .order(10)
        .rate(500_u64.hz())
        .build()?;

    // Compute-heavy planning — runs on a worker thread
    scheduler.add(PathPlanner::new())
        .order(50)
        .compute()
        .build()?;

    // Event-driven node — wakes only when the topic has new data
    scheduler.add(CollisionChecker::new())
        .on("lidar.points")
        .build()?;

    // Async I/O — network or disk, never blocks the real-time loop
    scheduler.add(TelemetryUploader::new())
        .order(200)
        .async_io()
        .rate(10_u64.hz())
        .build()?;

    scheduler.run()?;
    Ok(())
}

Execution Classes

Every node belongs to exactly one execution class. Set it in the builder chain:

| Method | Class | Description |
| --- | --- | --- |
| .compute() | Compute | Offloaded to a worker thread pool. Use for planning, SLAM, or ML inference. |
| .on(topic) | Event-Driven | Wakes only when the named topic receives new data. |
| .async_io() | Async I/O | Runs on an async executor. Use for network, disk, or cloud calls. |

If no execution class is specified, the node defaults to BestEffort. A node is automatically promoted to the RT class when you set .rate(Frequency) (which auto-derives budget at 80% and deadline at 95% of the period).
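The selection rule can be sketched as a small decision function. This is a simplified model for illustration only, not the actual HORUS internals; the enum and struct names here are assumptions:

```rust
// Simplified model of execution-class selection — illustrative names,
// not the real HORUS scheduler internals.
#[derive(Debug, PartialEq)]
enum ExecClass {
    Rt,
    Compute,
    Event,
    AsyncIo,
    BestEffort,
}

struct NodeConfig {
    rate_hz: Option<u64>,        // set by .rate(...)
    compute: bool,               // set by .compute()
    event_topic: Option<String>, // set by .on(topic)
    async_io: bool,              // set by .async_io()
}

fn exec_class(cfg: &NodeConfig) -> ExecClass {
    // Explicit class methods win; .compute() overrides .rate()
    if cfg.compute {
        ExecClass::Compute
    } else if cfg.event_topic.is_some() {
        ExecClass::Event
    } else if cfg.async_io {
        ExecClass::AsyncIo
    } else if cfg.rate_hz.is_some() {
        // .rate() on an otherwise-unclassified node promotes it to RT
        ExecClass::Rt
    } else {
        ExecClass::BestEffort
    }
}

fn main() {
    let rt = NodeConfig { rate_hz: Some(1000), compute: false, event_topic: None, async_io: false };
    let be = NodeConfig { rate_hz: None, compute: false, event_topic: None, async_io: false };
    println!("{:?}", exec_class(&rt)); // Rt
    println!("{:?}", exec_class(&be)); // BestEffort
}
```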

When to Use Each Class

  • RT (auto-detected) — Motor controllers, safety monitors, sensor fusion, anything that must run every tick with bounded latency. Triggered by .rate(Frequency) on a BestEffort node.
  • .compute() — Path planning, point cloud processing, ML inference. These can take longer than a single tick without blocking RT nodes.
  • .on(topic) — Collision detection, event handlers, reactive behaviors. Only runs when there is new data, saving CPU when idle.
  • .async_io() — Telemetry upload, log shipping, cloud API calls. Never blocks any real-time or compute work.

What each class means for your robot:

  • RT — Your motor controller sends PWM commands every millisecond. Missing one cycle causes the motor to overshoot. This node needs a dedicated RT thread.
  • Compute — Your SLAM algorithm takes 50ms to process a lidar scan. If it runs on the RT thread, the motor controller misses 50 deadlines. Compute nodes run on a separate thread pool.
  • Event — Your collision detector only needs to run when new lidar data arrives, not every cycle. Event nodes sleep until their topic gets a message.
  • AsyncIo — Your telemetry node uploads data to a cloud server. Network calls can take seconds. AsyncIo nodes run on a tokio thread pool so they never block anything.
  • BestEffort — Your debug logger. Runs on the main thread when there's time, no timing guarantees.

Per-Node Configuration

Ordering and Timing

| Method | Description |
| --- | --- |
| .order(n) | Tiebreaker for independent nodes (lower = runs first). Optional when nodes have topic dependencies — the dependency graph handles ordering automatically |
| .rate(Frequency) | Node-specific tick rate — auto-derives budget (80%) and deadline (95%), auto-marks as RT |
| .budget(Duration) | Override auto-derived tick budget (max execution time) |
| .deadline(Duration) | Override auto-derived absolute deadline |
| .on_miss(Miss) | What to do on deadline miss (Miss::Warn, Miss::Skip, Miss::SafeMode, Miss::Stop) |
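The auto-derivation behind .rate() can be reproduced with plain std::time::Duration arithmetic. The following is a sketch of the documented 80%/95% rule, not the library's own implementation:

```rust
use std::time::Duration;

/// Derive period, budget (80% of period), and deadline (95% of period)
/// from a rate in Hz, mirroring the documented defaults for .rate(Frequency).
fn derive_timing(rate_hz: u64) -> (Duration, Duration, Duration) {
    let period = Duration::from_nanos(1_000_000_000 / rate_hz);
    let budget = period * 4 / 5;    // 80%, exact integer arithmetic
    let deadline = period * 19 / 20; // 95%
    (period, budget, deadline)
}

fn main() {
    // 500 Hz → period 2ms, budget 1.6ms, deadline 1.9ms
    let (p, b, d) = derive_timing(500);
    println!("period={:?} budget={:?} deadline={:?}", p, b, d);
}
```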

RT Configuration

| Method | Description |
| --- | --- |
| .priority(i32) | OS thread priority (SCHED_FIFO 1-99) for this node's RT thread |
| .core(usize) | Pin this node's RT thread to a specific CPU core |
| .watchdog(Duration) | Per-node watchdog timeout (overrides scheduler global) |

These are only meaningful for RT nodes (nodes with .rate()). They require Linux with CAP_SYS_NICE and degrade gracefully when RT capabilities are unavailable.

// simplified
// Safety-critical node: highest priority, pinned to core 2, tight watchdog
scheduler.add(EmergencyStop::new())
    .order(0)
    .rate(1000_u64.hz())
    .priority(99)
    .core(2)
    .watchdog(2_u64.ms())
    .on_miss(Miss::Stop)
    .build()?;

// Logger: long watchdog, async I/O
scheduler.add(Logger::new())
    .order(200)
    .async_io()
    .watchdog(5_u64.secs())
    .build()?;

Failure Policy

| Method | Description |
| --- | --- |
| .failure_policy(policy) | Per-node failure handling (see Fault Tolerance) |
| .build() | Finalize and register the node (returns Result) |

Order Guidelines

  • 0-9: Critical real-time (motor control, safety)
  • 10-49: High priority (sensors, fast control loops)
  • 50-99: Normal priority (processing, planning)
  • 100-199: Low priority (logging, diagnostics)
  • 200+: Background (telemetry, non-essential)
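To make the tiebreaker concrete, independent nodes can be modeled as being sorted by their order value. This is a simplified mental model; real dispatch also consults the dependency graph:

```rust
/// Sort independent nodes by their .order() value — lower runs first.
/// Simplified model of the tiebreaker, not the real dispatch logic.
fn run_order(mut nodes: Vec<(&'static str, u32)>) -> Vec<&'static str> {
    nodes.sort_by_key(|&(_, order)| order);
    nodes.into_iter().map(|(name, _)| name).collect()
}

fn main() {
    // Orders chosen following the guideline bands above
    let order = run_order(vec![
        ("telemetry", 200),   // background
        ("path_planner", 50), // normal priority
        ("motor_ctrl", 0),    // critical real-time
        ("lidar", 10),        // high priority
        ("logger", 120),      // low priority
    ]);
    println!("{:?}", order); // ["motor_ctrl", "lidar", "path_planner", "logger", "telemetry"]
}
```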

Global Configuration with Composable Builders

Compose the builder methods you need for each deployment stage:

// simplified
use horus::prelude::*;

// Development — lightweight, profiling is always-on
let mut scheduler = Scheduler::new()
    .tick_rate(1000_u64.hz());

// Production — watchdog + blackbox
let mut scheduler = Scheduler::new()
    .watchdog(500_u64.ms())
    .blackbox(64)
    .tick_rate(1000_u64.hz());

// Hard real-time — panics without RT capabilities
let mut scheduler = Scheduler::new()
    .require_rt()
    .tick_rate(1000_u64.hz());

// Safety-critical — require_rt + blackbox + strict deadline misses
let mut scheduler = Scheduler::new()
    .require_rt()
    .watchdog(500_u64.ms())
    .blackbox(64)
    .tick_rate(1000_u64.hz())
    .max_deadline_misses(3);
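These stages compose because each builder method consumes and returns the configuration value. Below is a minimal standalone sketch of the pattern; the field names are assumptions for illustration, not the actual HORUS struct:

```rust
// Sketch of a composable, consuming builder — field names are
// illustrative assumptions, not the HORUS Scheduler internals.
#[derive(Debug, Default)]
struct SchedulerConfig {
    tick_rate_hz: u64,
    watchdog_ms: Option<u64>,
    blackbox_mb: Option<u64>,
    require_rt: bool,
    max_deadline_misses: u32,
}

impl SchedulerConfig {
    fn new() -> Self {
        // Defaults from the builder-method table: 100 Hz, 100 misses
        Self { tick_rate_hz: 100, max_deadline_misses: 100, ..Default::default() }
    }
    // Each method takes and returns Self, so any subset chains cleanly
    fn tick_rate(mut self, hz: u64) -> Self { self.tick_rate_hz = hz; self }
    fn watchdog(mut self, ms: u64) -> Self { self.watchdog_ms = Some(ms); self }
    fn blackbox(mut self, mb: u64) -> Self { self.blackbox_mb = Some(mb); self }
    fn require_rt(mut self) -> Self { self.require_rt = true; self }
    fn max_deadline_misses(mut self, n: u32) -> Self { self.max_deadline_misses = n; self }
}

fn main() {
    // The safety-critical stage, built by composing exactly the
    // features it needs
    let cfg = SchedulerConfig::new()
        .require_rt()
        .watchdog(500)
        .blackbox(64)
        .tick_rate(1000)
        .max_deadline_misses(3);
    println!("{:?}", cfg);
}
```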

Execution Modes

HORUS automatically parallelizes independent nodes while maintaining causal ordering for dependent nodes. No manual configuration needed — the scheduler builds a dependency graph from topic send()/recv() metadata.

ℹ️ Quick Answer: Do I Need to Configure Anything?

No. Scheduler::new() gives you automatic parallelism with dependency-based ordering. Independent nodes run on multiple cores. Dependent nodes execute in the correct causal order. Just add your nodes and call run().

| Your Situation | Recommended Setup |
| --- | --- |
| Learning HORUS | Scheduler::new() — parallelism is automatic |
| Prototyping | Scheduler::new() |
| Need reproducible runs | Scheduler::new().deterministic(true) — same graph, sequential execution |
| Safety-critical (medical, aerospace) | Scheduler::new().require_rt().tick_rate(1000_u64.hz()) with .rate() |

Default Mode (Auto-Parallel)

The scheduler builds a dependency graph from topic metadata and dispatches independent nodes to a thread pool via the ready-dispatch executor. Each node starts the instant its last dependency finishes — no barriers, no wasted time.

| Metric | Value |
| --- | --- |
| Independent nodes | Parallel (multi-core) |
| Dependent nodes | Causal order (publisher before subscriber) |
| Latency | Optimal — critical path only |
| .order() needed | No — optional tiebreaker |

// simplified
use horus::prelude::*;

let mut scheduler = Scheduler::new();

// Independent sensors — run in parallel automatically
scheduler.add(lidar_node).build()?;    // publishes "scan"
scheduler.add(camera_node).build()?;   // publishes "image"
scheduler.add(imu_node).build()?;      // publishes "imu"

// Fusion depends on all three — waits for all to complete
scheduler.add(fusion_node).build()?;   // subscribes "scan", "image", "imu"
scheduler.run()?;
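A rough sketch of how a topic-derived dependency graph produces this behavior: nodes are grouped into layers, and a node becomes ready once every topic it subscribes to has a publisher in an earlier layer. This is plain topological layering for illustration, not the actual ready-dispatch executor:

```rust
use std::collections::HashSet;

// Each node declares the topics it publishes and subscribes to —
// a stand-in for the send()/recv() metadata HORUS collects.
struct Node {
    name: &'static str,
    publishes: Vec<&'static str>,
    subscribes: Vec<&'static str>,
}

/// Group nodes into dispatch layers: everything in one layer is
/// independent and could run in parallel; later layers wait for
/// their inputs.
fn dispatch_layers(nodes: &[Node]) -> Vec<Vec<&'static str>> {
    let mut published: HashSet<&str> = HashSet::new();
    let mut remaining: Vec<&Node> = nodes.iter().collect();
    let mut layers = Vec::new();
    while !remaining.is_empty() {
        // Ready = all subscribed topics already published
        let (ready, blocked): (Vec<&Node>, Vec<&Node>) = remaining
            .into_iter()
            .partition(|n| n.subscribes.iter().all(|t| published.contains(t)));
        assert!(!ready.is_empty(), "cycle in topic graph");
        for n in &ready {
            published.extend(n.publishes.iter().copied());
        }
        layers.push(ready.iter().map(|n| n.name).collect());
        remaining = blocked;
    }
    layers
}

fn main() {
    let nodes = vec![
        Node { name: "lidar", publishes: vec!["scan"], subscribes: vec![] },
        Node { name: "camera", publishes: vec!["image"], subscribes: vec![] },
        Node { name: "imu", publishes: vec!["imu"], subscribes: vec![] },
        Node { name: "fusion", publishes: vec![], subscribes: vec!["scan", "image", "imu"] },
    ];
    // Layer 0: the three independent sensors; layer 1: fusion,
    // once all of its inputs exist.
    println!("{:?}", dispatch_layers(&nodes));
}
```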

Deterministic Mode

Uses the same dependency graph but executes nodes sequentially within each step. SimClock advances between steps. Produces bit-identical results across runs.

| Metric | Value |
| --- | --- |
| Independent nodes | Sequential (reproducible order) |
| Dependent nodes | Causal order (same as default) |
| Clock | Virtual SimClock |
| Best For | Simulation, testing, replay, CI |

// simplified
use horus::prelude::*;

let mut scheduler = Scheduler::new()
    .deterministic(true)
    .tick_rate(100_u64.hz());
scheduler.add(sensor).build()?;
scheduler.add(controller).build()?;
scheduler.run()?;

Mode Comparison

| Feature | Default (Auto-Parallel) | Deterministic |
| --- | --- | --- |
| Independent nodes | Parallel (multi-core) | Sequential (reproducible) |
| Dependent nodes | Causal order | Causal order |
| Clock | Wall clock | SimClock |
| Certification ready | Yes (causal ordering guaranteed) | Yes (fully reproducible) |

DurationExt and Frequency

HORUS provides ergonomic extension methods for creating Duration and Frequency values, replacing verbose Duration::from_micros(200) calls:

Duration Helpers

// simplified
use horus::prelude::*;

// Microseconds
let budget = 200_u64.us();     // Duration::from_micros(200)

// Milliseconds
let deadline = 1_u64.ms();     // Duration::from_millis(1)

// Seconds
let timeout = 5_u64.secs();    // Duration::from_secs(5)

Works on u64 literals via the DurationExt trait.

Frequency Type

The .hz() method creates a Frequency that auto-derives timing parameters:

// simplified
use horus::prelude::*;

let freq = 100_u64.hz();

freq.value()            // 100.0 Hz
freq.period()           // 10ms (1/frequency)
freq.budget_default()   // 8ms  (80% of period)
freq.deadline_default() // 9.5ms (95% of period)

Use Frequency with the node builder's .rate() method to auto-configure RT timing:

// simplified
// Auto-derives budget (80% period) and deadline (95% period)
// Also auto-marks the node as RT
scheduler.add(motor_ctrl)
    .order(0)
    .rate(500_u64.hz())   // period=2ms, budget=1.6ms, deadline=1.9ms
    .on_miss(Miss::Skip)
    .build()?;

| Method | Returns | Description |
| --- | --- | --- |
| .us() | Duration | Microseconds |
| .ms() | Duration | Milliseconds |
| .secs() | Duration | Seconds |
| .hz() | Frequency | Frequency in Hz |
| freq.value() | f64 | Frequency in Hz |
| freq.period() | Duration | 1/frequency |
| freq.budget_default() | Duration | 80% of period |
| freq.deadline_default() | Duration | 95% of period |
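For readers curious how such an API can be built, here is a minimal self-contained sketch of both traits in plain Rust. It is illustrative only; the real HORUS types may differ:

```rust
use std::time::Duration;

// Sketch of an extension trait adding duration/frequency helpers on u64
trait DurationExt {
    fn us(self) -> Duration;
    fn ms(self) -> Duration;
    fn secs(self) -> Duration;
    fn hz(self) -> Frequency;
}

#[derive(Debug, Clone, Copy)]
struct Frequency(f64);

impl Frequency {
    fn value(self) -> f64 { self.0 }
    fn period(self) -> Duration { Duration::from_secs_f64(1.0 / self.0) }
    fn budget_default(self) -> Duration { self.period() * 4 / 5 }     // 80%
    fn deadline_default(self) -> Duration { self.period() * 19 / 20 } // 95%
}

impl DurationExt for u64 {
    fn us(self) -> Duration { Duration::from_micros(self) }
    fn ms(self) -> Duration { Duration::from_millis(self) }
    fn secs(self) -> Duration { Duration::from_secs(self) }
    fn hz(self) -> Frequency { Frequency(self as f64) }
}

fn main() {
    println!("{:?}", 200_u64.us());
    println!("{:?}", 5_u64.secs());

    let freq = 100_u64.hz();
    println!("{}", freq.value());              // 100
    println!("{:?}", freq.period());           // 10ms
    println!("{:?}", freq.budget_default());   // 8ms
    println!("{:?}", freq.deadline_default()); // 9.5ms
}
```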

Design Decisions

Why auto-derive budget and deadline from .rate()?

Most developers think in terms of "this node runs at 1kHz" rather than "this node has an 800us budget and 950us deadline." Auto-derivation (budget = 80% period, deadline = 95% period) provides safe defaults without requiring timing expertise. Override with explicit .budget() and .deadline() when profiling shows different requirements.

Why composable builders instead of presets?

Early versions of HORUS had presets like deploy() and hard_rt(). These were removed because real systems need specific combinations of features. Composable builders let you pick exactly what you need: .watchdog(500_u64.ms()).blackbox(64) is clearer than a preset that might enable features you do not want.

Why keep .order() alongside automatic dependency ordering?

The dependency graph only orders nodes that communicate over topics. Independent nodes have no inherent order, so .order() provides an explicit, debuggable tiebreaker. It is also the fallback when topic dependencies cannot be detected, for example when nodes interact outside the topic system.

Trade-offs

| Gain | Cost |
| --- | --- |
| Per-node execution classes match workload to executor | More configuration decisions when adding nodes |
| Auto-derived timing from .rate() reduces configuration | 80%/95% defaults may not match your workload profile |
| Composable builders allow precise feature selection | No single-line "production mode" shortcut |
| Explicit .order() is predictable | Must be maintained manually as nodes are added |

Common Errors

| Symptom | Cause | Fix |
| --- | --- | --- |
| Node runs as BestEffort when you expected RT | .rate() not set, or .compute() overrides it | Set .rate(freq) and do not combine with .compute() |
| "Cannot set SCHED_FIFO" at startup | Missing RT permissions | See RT Setup for limits.conf and setcap |
| Deadline misses on every tick | Budget too tight for actual computation time | Profile with horus monitor, then increase .budget() or lower .rate() |
| Node never ticks | .on(topic) set but no publisher on that topic | Verify another node publishes to the same topic name |
| .build() returns error | Conflicting configuration (e.g., .on() with .budget()) | Event nodes cannot have budgets; remove timing constraints from event nodes |
| Nodes execute in wrong order | Topic dependencies not detected (no send()/recv() calls) | Ensure nodes call send()/recv() during init() or first tick; use .order() as fallback for non-topic dependencies |

See Also