Record & Replay

You need to capture a robot's execution and replay it later for debugging, regression testing, or what-if analysis. HORUS record/replay captures each node's inputs, outputs, and timing, and replays them with tick-perfect determinism.

When To Use This

You need to reproduce a bug that only occurs in specific conditions (field test, customer site)
You want to regression-test a new planner/controller against recorded sensor data
You need to compare two algorithm versions on the same input data
You are debugging a crash and need to step through the event timeline

Use BlackBox instead if you only need lightweight event logging for crash forensics. Record/Replay captures every node's inputs and outputs and is storage-heavy. BlackBox captures only event metadata and is always on.

Prerequisites

Familiarity with Scheduler Configuration -- especially .with_recording() and .deterministic(true)
Understanding of Deterministic Mode, which replay relies on to produce identical results
Understanding of Topics for how recorded data is injected into shared memory

Overview

The record/replay system supports:

Full recording: Capture entire system execution
Tick-perfect replay: Reproduce exact behavior deterministically
Time travel: Jump to any recorded tick
Mixed replay: Combine recorded nodes with live execution
Playback control: Speed adjustment, tick ranges

Enabling Recording

Via Builder API

Enable recording through builder methods:

// simplified
use horus::prelude::*;

// Enable recording via builder API
let mut scheduler = Scheduler::new()
    .with_recording();

Via CLI

# Record during a run
horus run --record my_session my_project

When recording is enabled, the scheduler automatically captures each node's inputs, outputs, and timing.

Replaying Recordings

Full Replay

Replay an entire recorded session:

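use horus::prelude::*;
use std::path::PathBuf;

// PathBuf does not expand `~`, so build the path from $HOME explicitly.
let home = std::env::var("HOME").expect("HOME is not set");
let mut scheduler = Scheduler::replay_from(
    PathBuf::from(home).join(".local/share/horus/recordings/crash/scheduler@abc123.horus")
)?;
scheduler.run()?;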

Time Travel

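Restrict replay to a tick range or change the playback speed. Each snippet below is an independent alternative, and `path` stands for the path to the recording: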

// Start at a specific tick
let mut scheduler = Scheduler::replay_from(path)?
    .start_at_tick(1500);

// Stop at a specific tick
let mut scheduler = Scheduler::replay_from(path)?
    .stop_at_tick(2000);

// Adjust playback speed (0.01 to 100.0)
let mut scheduler = Scheduler::replay_from(path)?
    .with_replay_speed(0.5);  // Half speed
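
To get a feel for how a tick window and a speed factor combine, here is a rough back-of-the-envelope sketch. It is not part of the HORUS API: `replay_wall_time` is a hypothetical helper, it assumes a fixed tick rate, and it treats the window as half-open (the source does not say whether the stop tick is inclusive). The speed is clamped to the 0.01 to 100.0 range mentioned above.

```rust
use std::time::Duration;

/// Speed range quoted for replay (0.01 to 100.0).
const MIN_SPEED: f64 = 0.01;
const MAX_SPEED: f64 = 100.0;

/// Estimate the wall-clock time needed to replay ticks in
/// [start_tick, stop_tick) at a fixed tick rate and playback speed.
/// Hypothetical helper for illustration only.
fn replay_wall_time(start_tick: u64, stop_tick: u64, tick_rate_hz: f64, speed: f64) -> Duration {
    assert!(tick_rate_hz > 0.0, "tick rate must be positive");
    // An empty or inverted window replays nothing.
    let ticks = stop_tick.saturating_sub(start_tick) as f64;
    // Keep the speed inside the quoted range.
    let speed = speed.clamp(MIN_SPEED, MAX_SPEED);
    // Recorded duration divided by the speed factor.
    Duration::from_secs_f64(ticks / tick_rate_hz / speed)
}
```

For example, replaying ticks 1500 to 2000 of a 100 Hz recording covers 5 seconds of recorded time, so at half speed this estimate comes out at 10 seconds.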

Mixed Replay

Combine recorded nodes with live execution for what-if testing:

use horus::prelude::*;
use std::path::PathBuf;

let mut scheduler = Scheduler::new();

// Add replay nodes from recordings
scheduler.add_replay(
    PathBuf::from("recordings/Lidar@001.horus"),
    0,  // priority
)?;

// Add live nodes alongside (`live_controller` is your own node, constructed elsewhere)
scheduler.add(live_controller).order(1).build()?;

scheduler.run()?;

Output Overrides

Override specific outputs during replay:

let mut scheduler = Scheduler::replay_from(path)?
    .with_override("sensor_node", "temperature", 25.0f32.to_le_bytes().to_vec());
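
The override value in the snippet above is passed as raw bytes, so the payload has to match the byte layout the consumer expects. The following sketch shows the round trip for an `f32`, assuming (as the example implies, but the source does not state outright) that the value is stored as 4 little-endian bytes. The helper names are hypothetical and use only the standard library.

```rust
/// Encode an f32 the way the override example does: 4 little-endian bytes.
/// Hypothetical helper, not part of HORUS.
fn encode_f32_override(value: f32) -> Vec<u8> {
    value.to_le_bytes().to_vec()
}

/// Decode an override payload back into an f32.
/// Returns None when the payload is not exactly 4 bytes long.
fn decode_f32_override(payload: &[u8]) -> Option<f32> {
    if payload.len() != 4 {
        return None;
    }
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(payload);
    Some(f32::from_le_bytes(bytes))
}
```

Checking the payload length before decoding is a cheap way to catch a type mismatch, such as passing `f64` bytes to a topic that carries `f32` values.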

CLI Commands

Record and replay from the command line:

# Start recording during a run
horus run --record my_session my_project

# List recording sessions
horus record list
horus record list --long  # Show file sizes and tick counts

# Show details of a session
horus record info my_session

# Replay a recording
horus record replay my_session
horus record replay my_session --start-tick 1000 --stop-tick 2000
horus record replay my_session --speed 0.5

# Compare two recording sessions
horus record diff session1 session2
horus record diff session1 session2 --limit 50

# Export to JSON or CSV
horus record export my_session --output data.json --format json
horus record export my_session --output data.csv --format csv

# Inject recorded nodes into a new run
horus record inject my_session --nodes camera_node,lidar_node
horus record inject my_session --all --loop

# Delete a recording session
horus record delete my_session
horus record delete my_session --force
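
As a mental model for `horus record diff`, the sketch below compares two sessions frame by frame and reports the ticks where they disagree. This is a toy illustration, not the tool's implementation: the `Frame` type and `diff_ticks` function are hypothetical, and it assumes `--limit` caps the number of reported differences, which the source does not spell out.

```rust
/// One recorded output: the tick it was produced on and its raw payload.
/// Hypothetical type for illustration; not the `.horus` on-disk layout.
#[derive(Debug, Clone, PartialEq)]
struct Frame {
    tick: u64,
    payload: Vec<u8>,
}

/// Return the ticks at which two sessions disagree, reporting at most `limit`.
/// Frames are compared position by position; frames present in only one
/// session count as differences.
fn diff_ticks(a: &[Frame], b: &[Frame], limit: usize) -> Vec<u64> {
    let mut out = Vec::new();
    let longest = a.len().max(b.len());
    for i in 0..longest {
        if out.len() >= limit {
            break;
        }
        match (a.get(i), b.get(i)) {
            (Some(x), Some(y)) if x == y => {}
            (Some(x), _) => out.push(x.tick),
            (None, Some(y)) => out.push(y.tick),
            (None, None) => unreachable!(),
        }
    }
    out
}
```

An empty result corresponds to the "no differences" case: both runs produced the same output at every tick.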

Managing Recordings

// simplified
// List all recording sessions
let sessions = Scheduler::list_recordings()?;

// Delete a recording session
Scheduler::delete_recording("old_session")?;

`replay_from` vs `add_replay`

Method	Use Case	Clock
`Scheduler::replay_from(path)`	Full replay — all nodes from one recording	ReplayClock (recorded timestamps)
`scheduler.add_replay(path, priority)`	Mixed — replay some nodes, run others live	ReplayClock for replay nodes

When to use which:

Use replay_from() for post-mortem debugging — replay an entire session exactly as recorded
Use add_replay() for regression testing — replay recorded sensor data while running a new version of your planner/controller live

// Post-mortem: "what happened in production?"
let mut scheduler = Scheduler::replay_from("crash_session.horus".into())?;
scheduler.run()?;

// Regression test: "does the new planner work with the same sensor data?"
let mut scheduler = Scheduler::new();
scheduler.add_replay("sensor_data.horus".into(), 0)?;  // recorded LiDAR + IMU
scheduler.add(NewPlannerV2::new()).order(1).build()?;    // live planner under test
scheduler.run()?;

Complete Recording Workflow (Python)

import horus

def sensor_tick(node):
    node.send("imu", horus.Imu(accel_x=0.0, accel_y=0.0, accel_z=9.81))

sensor = horus.Node(name="imu", pubs=[horus.Imu], tick=sensor_tick, rate=100)

# Step 1: Record a session
sched = horus.Scheduler(tick_rate=100, recording=True)
sched.add(sensor)
sched.run(duration=5.0)

# Step 2: Get recording files
files = sched.stop_recording()
print(f"Recorded to: {files}")

# Step 3: List and manage
for rec in sched.list_recordings():
    print(f"  Available: {rec}")

# Step 4: Full replay
sched2 = horus.Scheduler.replay_from(files[0])
sched2.run()

# Step 5: Time travel replay
sched3 = horus.Scheduler.replay_from(files[0])
sched3.start_at_tick(100)
sched3.stop_at_tick(400)
sched3.set_replay_speed(0.5)
sched3.run()

# Step 6: Mixed replay (recorded sensor + new controller)
def new_controller(node):
    pass  # placeholder: the controller logic under test goes here

sched4 = horus.Scheduler(tick_rate=100)
sched4.add_replay("recordings/imu@001.horus", priority=0)
sched4.add(horus.Node(tick=new_controller, rate=100, order=1))
sched4.run()

Note: Python supports the full replay API: Scheduler.replay_from(), add_replay(), start_at_tick(), stop_at_tick(), set_replay_speed(), and set_replay_override(). The workflow above uses all of these except set_replay_override().

Design Decisions

Why record at the topic level instead of the node level?

Recording topic data (each node's inputs and outputs) rather than internal node state means recordings are portable across code versions. You can replay recorded sensor data against a new planner without recompiling the sensor driver. This is the same approach used by ROS 2's rosbag2.

Why mixed replay instead of full-system-only replay?

The most common debugging workflow is: "replay the recorded sensors, but run my new controller live." Mixed replay enables this without re-recording. You swap out the node under test while keeping all other data identical.
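
The idea can be sketched without any HORUS types: hold the recorded input fixed, run two controller versions over it, and look for the first tick where their outputs diverge. Everything here is hypothetical and for illustration only, including the `f32` sample type, the two toy controllers, and the helper names.

```rust
/// Feed every recorded sample (one per tick) to a controller running "live"
/// and collect its outputs. Stand-in for a replayed topic driving a live node.
fn run_against_recording<F: FnMut(f32) -> f32>(recorded: &[f32], mut controller: F) -> Vec<f32> {
    recorded.iter().map(|&sample| controller(sample)).collect()
}

/// Two hypothetical controller versions: same gain, different speed cap.
fn controller_v1(distance_m: f32) -> f32 {
    (distance_m * 0.5).min(1.0)
}

fn controller_v2(distance_m: f32) -> f32 {
    (distance_m * 0.5).min(0.8)
}

/// Index of the first tick where two output sequences differ, if any.
fn first_divergence(a: &[f32], b: &[f32]) -> Option<usize> {
    a.iter().zip(b.iter()).position(|(x, y)| x != y)
}
```

Because both versions see identical input, any divergence can be attributed to the code change rather than to sensor noise or timing.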

Why .horus format instead of standard formats?

The .horus recording format preserves tick-level timing, shared memory layout, and type metadata. Standard formats (CSV, JSON) lose timing precision and type safety. Export to JSON/CSV is available via horus record export for analysis tools.

Trade-offs

Gain	Cost
Tick-perfect deterministic replay	Recordings grow with session length (not bounded like BlackBox)
Mixed replay enables what-if testing	Replaying with different code may produce different results (expected)
Time travel to any tick	Random access requires indexing, which adds to recording size
CLI tools for comparison and export	Custom `.horus` format requires HORUS tools to read
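
The indexing trade-off in the table above can be illustrated with a toy log. Frames are stored back to back, and a separate list of byte offsets lets a reader jump to any tick without scanning from the start. This is a sketch under stated assumptions, not the `.horus` layout: `IndexedLog` and its methods are hypothetical.

```rust
/// Toy recording: frames stored back to back in one byte buffer, plus an
/// index of byte offsets so any tick can be located without scanning.
struct IndexedLog {
    data: Vec<u8>,
    offsets: Vec<usize>, // offsets[t] = start of tick t's frame in `data`
}

impl IndexedLog {
    fn new() -> Self {
        IndexedLog { data: Vec::new(), offsets: Vec::new() }
    }

    /// Append the frame for the next tick and remember where it starts.
    fn push(&mut self, frame: &[u8]) {
        self.offsets.push(self.data.len());
        self.data.extend_from_slice(frame);
    }

    /// Jump straight to a tick's frame using the index.
    fn frame_at(&self, tick: usize) -> Option<&[u8]> {
        let start = *self.offsets.get(tick)?;
        // The frame ends where the next one starts, or at the end of the data.
        let end = self.offsets.get(tick + 1).copied().unwrap_or(self.data.len());
        Some(&self.data[start..end])
    }

    /// Extra bytes spent on the index: the storage cost of random access.
    fn index_overhead_bytes(&self) -> usize {
        self.offsets.len() * std::mem::size_of::<usize>()
    }
}
```

In this model the index grows by one offset per tick, so the cost of random access scales with session length in the same way the recording itself does.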

Common Errors

Symptom	Cause	Fix
Recording is empty	`.with_recording()` not set on the scheduler	Add `.with_recording()` to the scheduler builder
Replay produces different results	Code changed between recording and replay	Use the same binary version, or use mixed replay for the changed node
`replay_from()` fails with file not found	Incorrect recording path	Use `horus record list` to find available recordings
Mixed replay node does not receive data	Topic names do not match between recorded and live nodes	Verify topic names are identical (case-sensitive)
Replay runs instantly (no pacing)	Replay uses virtual time by default	Use `.with_replay_speed(1.0)` for real-time pacing
Large recording files filling disk	Long sessions with many topics	Use `horus record delete` to clean up, or record only specific sessions
`horus record diff` shows no differences	Sessions are identical	This confirms both runs produced the same output