Scheduler
Key Takeaways
After reading this guide, you will understand:
- How the Scheduler orchestrates node execution through init(), tick(), and shutdown() phases
Scheduler::new()as the single entry point, with builder methods for global settings- Per-node execution classes:
.compute(),.on(topic),.async_io()— plus auto-RT from.budget()/.rate() - The fluent NodeBuilder API for adding nodes with
.add(node).order(0).build() - Per-node configuration:
.order(),.rate(),.budget(),.deadline(),.on_miss() - Composable scheduler builders:
.watchdog(Duration),.blackbox(n),.require_rt(),.max_deadline_misses(n) - Priority-based execution where lower numbers run first (0 = highest priority)
- Graceful shutdown via Ctrl+C signal handling
The Scheduler is the execution orchestrator in HORUS. It manages the node lifecycle, coordinates priority-based execution, and handles graceful shutdown.
What is the Scheduler?
The Scheduler is responsible for:
Node Registration: Adding nodes with the fluent builder API
Lifecycle Management: Calling init(), tick(), and shutdown() at the right times
Priority-Based Execution: Running nodes in priority order every tick
Signal Handling: Graceful shutdown on Ctrl+C
Performance Monitoring: Tracking execution metrics for all nodes
Failure Policies: Per-node failure handling (fatal, restart, skip, ignore)
Creating a Scheduler
Scheduler::new() — Lightweight, No Syscalls
new() detects runtime capabilities (~30-100us) but does not apply any OS-level features. Use builder methods to opt in to features.
use horus::prelude::*;
// Minimal — just capability detection, no syscalls
let mut scheduler = Scheduler::new();
scheduler.add(my_node).order(0).build()?;
scheduler.run()?;
Builder Methods — Global Settings
For production deployments, combine builder methods for the right settings:
use horus::prelude::*;
// Production robot — watchdog, blackbox, 1 kHz control loop
let mut scheduler = Scheduler::new()
.watchdog(500_u64.ms()) // frozen node detection
.blackbox(64) // 64 MB flight recorder
.tick_rate(1000_u64.hz()); // 1 kHz control loop
scheduler.add(motor_ctrl).order(0).rate(1000_u64.hz()).build()?;
scheduler.run()?;
All RT features that cannot be applied at runtime are recorded as degradations, not errors (see Real-Time Features below).
Composable Builders
Instead of presets, combine builder methods for your deployment needs:
| Builder Method | What it enables | When to use |
|---|---|---|
.watchdog(Duration) | Frozen node detection — auto-creates safety monitor | Production robots |
.blackbox(size_mb) | BlackBox flight recorder with n MB buffer | Post-mortem debugging |
.max_deadline_misses(n) | Emergency stop after n deadline misses (default: 100) | Safety-critical |
.require_rt() | All RT features + memory locking + RT scheduling class. Panics without RT capabilities | Hard real-time systems |
.prefer_rt() | Request RT features (degrades gracefully if unavailable) | Best-effort RT |
.verbose(bool) | Enable/disable non-emergency logging (default: true) | Quieter production logs |
.with_recording() | Enable record/replay | Session recording |
Adding Nodes
Use the fluent builder API to add nodes with configuration:
let mut scheduler = Scheduler::new();
// Basic: just set execution order
scheduler.add(sensor_node).order(0).build()?;
// With per-node tick rate — auto-derives budget (80%) and deadline (95%), auto-marks as RT
scheduler.add(fast_sensor).order(0).rate(1000_u64.hz()).build()?;
// Real-time node with explicit budget and deadline (auto-marks as RT)
scheduler.add(motor_ctrl)
.order(0)
.budget(500_u64.us()) // 500μs max execution time → auto RT
.deadline(1_u64.ms()) // 1ms deadline
.on_miss(Miss::Skip) // Skip tick on deadline miss
.build()?;
// Chain multiple nodes
scheduler.add(safety_node).order(0).build()?;
scheduler.add(controller).order(10).build()?;
scheduler.add(sensor).order(50).build()?;
scheduler.add(logger).order(200).build()?;
NodeBuilder Methods
Execution classes (mutually exclusive — pick one per node):
| Method | Description |
|---|---|
.compute() | Mark as CPU-heavy compute node (may be scheduled on worker threads) |
.on(topic) | Event-driven — node ticks only when the given topic receives a message |
.async_io() | Async I/O node (non-blocking, suitable for network / file operations) |
Note: RT is auto-detected when you set
.budget(),.deadline(), or.rate(Frequency). There is no.rt()method — just set a budget or rate and the node is automatically marked as real-time. The default execution class isBestEffort.
Per-node configuration:
| Method | Description |
|---|---|
.order(n) | Set execution order (lower = runs first) |
.rate(Frequency) | Set tick rate (e.g. 1000_u64.hz()) — auto-derives budget (80% period) and deadline (95% period), auto-marks RT |
.budget(Duration) | Set tick budget (e.g. 200_u64.us()) — auto-marks as RT |
.deadline(Duration) | Set deadline (e.g. 1_u64.ms()) — auto-marks as RT |
.on_miss(Miss) | Set deadline miss policy (Warn, Skip, SafeMode, Stop) |
.failure_policy(policy) | Override failure handling policy |
.build() / .build() | Finalize and register the node (returns HorusResult) |
Priority-Based Execution
Priorities are u32 values where lower numbers = higher priority:
| Range | Level | Use Case |
|---|---|---|
| 0-9 | Critical | Safety monitors, emergency stops, watchdogs |
| 10-49 | High | Control loops, actuators, time-sensitive operations |
| 50-99 | Normal | Sensors, filters, state estimation |
| 100-199 | Low | Logging, diagnostics, non-critical computation |
| 200+ | Background | Telemetry, data recording |
Nodes execute in priority order every tick:
let mut scheduler = Scheduler::new();
// Execution order: Safety -> Controller -> Sensor -> Logger
scheduler.add(safety_monitor).order(0).build()?; // Runs 1st
scheduler.add(controller).order(10).build()?; // Runs 2nd
scheduler.add(sensor).order(50).build()?; // Runs 3rd
scheduler.add(logger).order(200).build()?; // Runs 4th
Running the Scheduler
Continuous Mode
Run until Ctrl+C:
scheduler.run()?;
Duration-Limited
Run for a specific duration, then shutdown:
use std::time::Duration;
scheduler.run_for(Duration::from_secs(30))?;
Node-Specific Execution
Execute only specific nodes by name:
// Run only these nodes continuously
scheduler.tick(&["SensorNode", "MotorNode"])?;
// Run specific nodes for a duration
scheduler.tick_for(&["SensorNode"], Duration::from_secs(10))?;
Builder Methods
Tick Rate
// Set global tick rate (default: 100 Hz)
let scheduler = Scheduler::new()
.tick_rate(1000_u64.hz()); // 1kHz control loop
Real-Time Features
RT is configured through composable builder methods. Nodes are automatically marked as RT when you set .budget(), .deadline(), or .rate(Frequency):
// Hard real-time — panics without RT capabilities
let mut scheduler = Scheduler::new()
.require_rt()
.watchdog(500_u64.ms())
.tick_rate(1000_u64.hz());
// Per-node: rate auto-derives budget + deadline, auto-marks as RT
scheduler.add(motor_ctrl)
.order(0)
.rate(1000_u64.hz()) // budget=800us, deadline=950us → auto RT
.on_miss(Miss::SafeMode) // Enter safe mode on miss
.build()?;
scheduler.add(sensor)
.order(50)
.compute() // CPU-heavy, non-RT
.build()?;
When using .require_rt(), the scheduler applies (in order):
- RT priority (SCHED_FIFO) — if available
- Memory locking (mlockall) — if permitted
- CPU affinity — only to isolated CPUs if they exist
Features that fail are recorded as degradations, not errors. Use .prefer_rt() instead if you want graceful degradation.
BlackBox Flight Recorder
Enable BlackBox recording through the builder API:
let mut scheduler = Scheduler::new()
.blackbox(16); // 16 MB flight recorder buffer
// All node ticks are now recorded to the flight recorder
Safety Monitor
.watchdog(Duration) enables frozen node detection and the safety monitor. Budget enforcement is implicit when nodes have .rate() set:
let mut scheduler = Scheduler::new()
.watchdog(500_u64.ms()) // frozen node detection
.blackbox(64) // 64 MB flight recorder
.tick_rate(1000_u64.hz());
// RT node with auto-derived budget and deadline from rate
scheduler.add(motor_ctrl)
.order(0)
.rate(1000_u64.hz()) // budget=800us, deadline=950us
.on_miss(Miss::SafeMode) // Enter safe mode on miss
.build()?;
Per-Node Rate Control
Individual nodes can run at different frequencies:
let mut scheduler = Scheduler::new();
scheduler.add(fast_sensor).order(0).rate(1000_u64.hz()).build()?; // 1kHz
scheduler.add(slow_logger).order(200).rate(10_u64.hz()).build()?; // 10Hz
// Or set rates after adding
scheduler.set_node_rate("FastSensor", 100_u64.hz());
scheduler.set_node_rate("SlowLogger", 10_u64.hz());
Ergonomic Timing with DurationExt
HORUS provides extension methods on numeric types for creating Duration and Frequency values:
use horus::prelude::*;
// Duration helpers — works on u64
let budget = 200_u64.us(); // Duration::from_micros(200)
let deadline = 1_u64.ms(); // Duration::from_millis(1)
// Frequency — auto-derives budget (80% period) and deadline (95% period)
let freq = 500_u64.hz();
scheduler.add(motor_ctrl)
.order(0)
.rate(freq) // auto-marks as RT, sets budget + deadline
.on_miss(Miss::Skip)
.build()?;
See DurationExt and Frequency for the full API reference.
The scheduler automatically adjusts its internal tick period to be fast enough for the highest-frequency node.
Lifecycle Management
Initialization Phase
When you call run(), the scheduler initializes all nodes by calling init():
- All nodes initialize before the main loop starts
- If
init()fails, the node enters Error state and won't tick - Other nodes continue normally
Main Execution Loop
1. Initialize all nodes (call init())
2. Sort nodes by priority
3. Main loop:
a. For each node (in priority order):
- Check if node should tick (rate limiting)
- Start tick timing
- Call node.tick()
- Record metrics, check budget/deadlines
b. Sleep to maintain target tick rate
4. On shutdown signal:
a. Set running = false
b. Call shutdown() on all nodes
Graceful Shutdown
Ctrl+C is automatically caught:
^C
Ctrl+C received! Shutting down HORUS scheduler...
[Nodes shutting down gracefully...]
Scheduler shutdown complete
- Main loop exits
- Each node's
shutdown()is called - Errors during shutdown are logged but don't prevent other nodes from cleaning up
- Shared memory cleaned up
Recording and Replay
Recording Sessions
Enable recording through the builder API:
let mut scheduler = Scheduler::new()
.blackbox(16); // 16 MB flight recorder buffer
scheduler.add(my_node).order(0).build()?;
// Run normally — all node ticks are recorded
scheduler.run()?;
Replaying
let mut replay_scheduler = Scheduler::replay_from(
"~/.horus/recordings/my_session".into()
)?;
replay_scheduler.run()?;
Performance Monitoring
Node Metrics
let metrics = scheduler.metrics();
for m in &metrics {
println!("Node: {} (order: {})", m.name(), m.order());
println!(" Ticks: {} total, {} ok, {} failed",
m.total_ticks(), m.successful_ticks(), m.failed_ticks());
println!(" Duration: avg={:.2}ms, min={:.2}ms, max={:.2}ms",
m.avg_tick_duration_ms(), m.min_tick_duration_ms(), m.max_tick_duration_ms());
}
Other Monitoring Methods
| Method | Description |
|---|---|
metrics() | Get Vec<NodeMetrics> for all nodes |
node_list() | Get list of registered node names |
safety_stats() | Get budget overruns, deadline misses, watchdog expirations |
is_running() | Check if scheduler is running |
Error Handling
Node Initialization Failure
If init() fails, the node enters Error state and won't tick. Other nodes continue:
fn init(&mut self) -> Result<()> {
Err(Error::node("MySensor", "Sensor not connected")) // Node won't run, others unaffected
}
Runtime Errors
Handle errors gracefully in tick() — don't panic:
// GOOD: Handle errors
fn tick(&mut self) {
match self.operation() {
Ok(_) => {}
Err(e) => hlog!(error, "Error: {}", e),
}
}
// BAD: Panic crashes the scheduler
fn tick(&mut self) {
self.operation().unwrap(); // Will crash!
}
Common Patterns
Layered Architecture
// Layer 1: Safety (Critical)
scheduler.add(collision_detector).order(0).build()?;
scheduler.add(emergency_stop).order(0).build()?;
// Layer 2: Control (High)
scheduler.add(pid_controller).order(10).build()?;
scheduler.add(motor_driver).order(10).build()?;
// Layer 3: Sensing (Normal)
scheduler.add(lidar_node).order(50).build()?;
scheduler.add(camera_node).order(50).build()?;
// Layer 4: Processing (Low)
scheduler.add(path_planner).order(100).build()?;
// Layer 5: Monitoring (Background)
scheduler.add(logger).order(200).build()?;
scheduler.add(diagnostics).order(200).build()?;
Production Deployment
let mut scheduler = Scheduler::new()
.watchdog(500_u64.ms())
.blackbox(64)
.tick_rate(1000_u64.hz());
scheduler.add(safety_monitor).order(0).rate(1000_u64.hz()).on_miss(Miss::Stop).build()?;
scheduler.add(motor_ctrl).order(5).rate(1000_u64.hz()).on_miss(Miss::SafeMode).build()?;
scheduler.add(sensor).order(50).compute().build()?;
scheduler.add(logger).order(200).async_io().build()?;
scheduler.run()?;
Best Practices
Initialize heavy resources in init() — not in the constructor:
fn init(&mut self) -> Result<()> {
self.buffer = vec![0.0; 10000];
self.connection = connect_to_hardware()?;
Ok(())
}
Keep tick() fast — each tick should complete within the tick period:
fn tick(&mut self) {
let data = self.read_sensor();
self.pub_topic.send(data); // Fast!
}
Use appropriate priorities — don't make everything order 0:
scheduler.add(emergency_stop).order(0).build()?; // Critical
scheduler.add(controller).order(10).build()?; // High
scheduler.add(sensor).order(50).build()?; // Normal
scheduler.add(logger).order(200).build()?; // Background
Use composable builders for production — enable watchdog, blackbox, and safety monitoring:
// Use composable builders for production
let mut scheduler = Scheduler::new()
.watchdog(500_u64.ms())
.blackbox(16)
.tick_rate(500_u64.hz());
Miss — Deadline Miss Policy
When a real-time node exceeds its deadline, the Miss policy determines what happens:
use horus::prelude::*;
scheduler.add(motor_ctrl)
.order(0)
.budget(500.us())
.deadline(1.ms())
.on_miss(Miss::SafeMode) // Enter safe mode on miss
.build()?;
| Policy | Behavior | Use For |
|---|---|---|
Miss::Warn | Log a warning and continue normally | Soft real-time nodes (logging, UI) |
Miss::Skip | Skip the node for this tick | Firm real-time nodes (video encoding) |
Miss::SafeMode | Call enter_safe_state() on the node | Motor controllers, safety nodes |
Miss::Stop | Stop the entire scheduler | Hard real-time safety-critical nodes |
The default is Miss::Warn.
Method Reference
Constructor & Configuration
| Method | Returns | Description |
|---|---|---|
Scheduler::new() | Scheduler | Create scheduler with auto-detected capabilities |
.tick_rate(freq) | Self | Set global tick rate (default: 100 Hz) |
.watchdog(Duration) | Self | Frozen node detection — auto-creates safety monitor |
.blackbox(size_mb) | Self | BlackBox flight recorder with n MB buffer |
.max_deadline_misses(n) | Self | Max deadline misses before emergency stop (default: 100) |
.prefer_rt() | Self | Try RT features, degrade gracefully if unavailable |
.require_rt() | Self | Enable RT features, panic if unavailable |
.verbose(bool) | Self | Enable/disable non-emergency logging (default: true) |
.with_recording() | Self | Enable record/replay |
Node Registration
| Method | Returns | Description |
|---|---|---|
.add(node) | NodeBuilder | Add node via fluent builder (recommended) |
Execution Control
| Method | Returns | Description |
|---|---|---|
.run() | HorusResult<()> | Main loop — run until Ctrl+C |
.run_for(duration) | HorusResult<()> | Run for specific duration, then shutdown |
.tick_once() | HorusResult<()> | Execute exactly one tick cycle (no loop, no sleep) |
.tick_once_nodes(&[names]) | HorusResult<()> | One tick cycle for specific nodes only |
.stop() | () | Stop the scheduler |
.is_running() | bool | Check if scheduler is currently running |
Monitoring & Statistics
| Method | Returns | Description |
|---|---|---|
.metrics() | Vec<NodeMetrics> | Performance metrics for all nodes |
.safety_stats() | Option<SafetyStats> | Budget overruns, deadline misses, watchdog expirations |
.node_list() | Vec<String> | List of registered node names |
.status() | String | Formatted status report (platform, RT features, safety) |
Runtime Configuration
| Method | Returns | Description |
|---|---|---|
.set_node_rate(name, freq) | &mut Self | Change per-node rate at runtime |
Next Steps
- Learn about Message Types for communication
- Explore Examples for complete applications
- Read the API Reference for detailed documentation
- See Scheduler Configuration for advanced options