Monitor
You need to observe your running robot system in real-time: see which nodes are active, watch message flow between topics, inspect performance metrics, and tune parameters without restarting. Here is how to use the HORUS Monitor.
The HORUS Monitor is under active development. Core monitoring features work (nodes, topics, graph, parameters, packages). Some functionality (remote deployment, recordings browser) is still being finalized.
When To Use This
- Debugging message flow between nodes ("is my publisher actually sending data?")
- Monitoring node performance and tick rates during development
- Live tuning of PID gains, speed limits, and other runtime parameters
- Remote monitoring of headless robots over SSH (TUI mode)
- Verifying system health before field deployment
Use Telemetry Export instead if you need to send metrics to external dashboards like Grafana or Prometheus.
Use Debugging Workflows instead if you need to diagnose a specific problem like deadline misses or panics.
Prerequisites
- A running HORUS application (`horus run`)
- A second terminal for the monitor (or access via browser from another device)
Quick Start
# Start your HORUS application
horus run
# In another terminal, start the monitor
horus monitor
Browser opens automatically to http://localhost:3000. On first run, you'll be prompted to set a password (or press Enter to skip).
# Custom port
horus monitor 8080
# Terminal UI mode (no browser needed)
horus monitor --tui
# Reset password
horus monitor --reset-password
How It Works
The monitor is a read-only observer that attaches to a running HORUS application without modifying its behavior. It reads data that the scheduler already writes as part of normal operation.
┌─────────────────────────────────────────────────────────────────────┐
│ HORUS Application │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Node A │ │ Node B │ │ Node C │ Scheduler writes │
│ │ tick() │ │ tick() │ │ tick() │ NodeMetrics + topic │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ metadata to SHM │
│ │ │ │ after each tick cycle │
│ ▼ ▼ ▼ │
│ ┌──────────────────────────────────────────┐ │
│ │ Shared Memory (SHM) │ │
│ │ ┌────────────┐ ┌───────────────────┐ │ │
│ │ │ NodeMetrics │ │ Topic Ring Buffers│ │ │
│ │ │ per node │ │ headers + data │ │ │
│ │ └────────────┘ └───────────────────┘ │ │
│ └──────────────────────┬───────────────────┘ │
└─────────────────────────┼───────────────────────────────────────────┘
│
mmap read-only│ (no copies, no syscalls)
│
┌─────────────────────────┼───────────────────────────────────────────┐
│ horus monitor │ │
│ ▼ │
│ ┌──────────────────────────────────┐ │
│ │ SHM Reader (mmap) │ Reads NodeMetrics, topic │
│ │ /proc scan for node discovery │ headers, log buffer │
│ └──────┬───────────────┬───────────┘ │
│ │ │ │
│ ┌─────▼─────┐ ┌────▼──────────────────────┐ │
│ │ TUI Mode │ │ Web Mode │ │
│ │ ratatui │ │ ┌──────────┐ ┌──────────┐ │ │
│ │ redraws │ │ │ Axum │ │WebSocket │ │ │
│ │ at 4 Hz │ │ │ HTTP │ │ push │ │ │
│ │ │ │ │ server │ │ at 4 Hz │ │ │
│ └───────────┘ │ └──────────┘ └──────────┘ │ │
│                │  └──────────┘ └──────────┘ │                      │
│                └─────────────────────────────┘                     │
└─────────────────────────────────────────────────────────────────────┘
Data flow step by step:
1. **Scheduler writes metrics** -- After each tick cycle, the scheduler updates `NodeMetrics` (tick count, duration, deadline misses) in the node's SHM region. Topic ring buffer headers already contain pub/sub counts, pending messages, and drop counts as part of normal IPC operation.
2. **Monitor reads via mmap** -- The monitor process opens the same SHM files read-only via `mmap`. This is a pointer dereference, not a copy -- reads hit L1 cache when the data is recent. The monitor scans `/proc` to discover running node processes and reads the SHM topics directory to enumerate active topics.
3. **Web UI via Axum** -- In web mode, an Axum HTTP server runs on a separate thread. REST endpoints (`/api/nodes`, `/api/topics`, `/api/graph`) return JSON snapshots. A WebSocket at `/api/ws` pushes live updates to connected browsers at 4 Hz (every 250 ms).
4. **TUI via ratatui** -- In TUI mode, a crossterm/ratatui terminal UI polls for keyboard input at 100 ms intervals and refreshes the display at 250 ms intervals (4 Hz). No HTTP server is started.
The monitor never writes to SHM regions used by the application. The only write it performs is setting a verbose flag on a topic's SHM header when you enable topic debug logging from the TUI -- this is a single byte that the topic checks on each send()/recv().
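The read-only guarantee is enforced by the operating system, not by convention. A minimal Python sketch of the mechanism (illustrative only: it maps a temp file, not HORUS's actual SHM layout):

```python
import mmap
import os
import struct
import tempfile

# Simulate the application side: a file holding one 8-byte tick counter.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(struct.pack("<Q", 142800))  # tick count written by the "scheduler"

# Simulate the monitor side: map the same file read-only.
with open(path, "rb") as f:
    view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

(ticks,) = struct.unpack_from("<Q", view, 0)
print(ticks)  # -> 142800

# Any attempt to write through a read-only mapping fails at the OS level.
try:
    view[0] = 0
except TypeError:
    print("read-only mapping: write rejected")

view.close()
os.remove(path)
```

Because the mapping is created with `ACCESS_READ`, even a buggy monitor cannot corrupt the state it observes.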
Web Interface
The web monitor has 3 main tabs:
Monitor Tab
The main monitoring view with two sub-views:
List View — Shows nodes and topics in a grid layout:
- Nodes card: All running nodes with their status
- Topics card: Active message channels with sizes
Graph View — Interactive canvas showing:
- Nodes as circles connected to their topics
- Visual representation of the pub/sub network
- Helps answer "which nodes are talking to which topics?"
A status bar at the top always shows:
- Active Nodes count (hover for node list)
- Active Topics count (hover for topic list)
- Monitor port
What the dashboard looks like:
┌─────────────────────────────────────────────────────────────────┐
│ [Monitor] [Parameters] [Packages] ● 5 Nodes ● 8 Topics │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─ Nodes ──────────────────┐ ┌─ Topics ──────────────────┐ │
│ │ ● imu_driver [RT] │ │ sensors.imu 4 msgs │ │
│ │ ● camera_node [Comp] │ │ sensors.lidar 12 msgs │ │
│ │ ● slam_engine [RT] │ │ cmd_vel 1 msg │ │
│ │ ● planner [Comp] │ │ map.grid 0 msgs │ │
│ │ ● motor_driver [RT] │ │ odom 2 msgs │ │
│ └──────────────────────────┘ └────────────────────────────┘ │
│ │
│ [List View] [Graph View] │
│ │
└─────────────────────────────────────────────────────────────────┘
In Graph View, nodes appear as circles and topics as labeled connection points. Arrows show publish direction. Hovering a node highlights all of its connected topics.
Parameters Tab
Live runtime parameter editor:
- Search parameters by name
- Add new parameters at runtime
- Edit existing values (changes apply immediately)
- Delete parameters
- Export all parameters to file
- Import parameters from file
Useful for tuning PID gains, speed limits, sensor thresholds without restarting.
Packages Tab
Browse and manage HORUS packages:
- Search the registry
- Install packages
- Manage environments
Terminal UI Mode
For SSH sessions and headless servers:
horus monitor --tui
The TUI provides 8 tabs navigated with arrow keys:
| Tab | Description |
|---|---|
| Overview | System health summary with log panel |
| Nodes | Running nodes with detailed metrics |
| Topics | Active topics and message flow |
| Network | Network connections and transport status |
| TransformFrame | TransformFrame protocol inspection |
| Packages | Package management |
| Params | Runtime parameter editor |
| Recordings | Session recordings browser |
What the TUI looks like:
┌─ HORUS Monitor ──────────────────────────────────────────────┐
│ [Overview] [Nodes] [Topics] [Network] [TF] [Pkg] [Par] [Rec]│
├──────────────────────────────────────────────────────────────┤
│ │
│ System Health: OK Uptime: 00:14:32 │
│ Active Nodes: 5/5 Tick Rate: 100 Hz │
│ │
│ ┌─ Node Status ────────────────────────────────────────┐ │
│ │ imu_driver ████████████████████░░ 92% budget │ │
│ │ camera_node ██████████░░░░░░░░░░░░ 45% budget │ │
│ │ slam_engine ██████████████████░░░░ 78% budget │ │
│ │ planner ████████░░░░░░░░░░░░░░ 35% budget │ │
│ │ motor_driver ██████████████░░░░░░░░ 62% budget │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ ┌─ Log ─────────────────────────────────────────────── ┐ │
│ │ 14:32:01 [INFO] imu_driver: tick 142800 (0.4ms) │ │
│ │ 14:32:01 [INFO] slam_engine: map updated (2.1ms) │ │
│ │ 14:32:01 [WARN] camera_node: frame dropped │ │
│ └──────────────────────────────────────────────────────┘ │
│ │
│ ← → Navigate tabs ↑ ↓ Select Enter: Details q: Quit │
└──────────────────────────────────────────────────────────────┘
Navigate between tabs with left/right arrow keys. Within each tab, use up/down arrows to select items, Enter to open detail panels, and Esc to close them. Press q to quit, p to pause/resume updates, ? to show the help overlay.
Topic Debug Logging
In the Topics tab, press Enter on any topic to enable runtime debug logging. All send() and recv() calls on that topic will emit live log entries showing direction, IPC latency, and message summaries (if LogSummary is implemented). Press Esc to disable logging; the zero-overhead path resumes immediately.
No code changes or recompilation required.
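The single-byte flag pattern behind this can be sketched with Python's `multiprocessing.shared_memory` (an illustration of the pattern only; the real SHM header layout is internal to HORUS, and here both sides run in one process for brevity):

```python
from multiprocessing import shared_memory

# One shared byte standing in for a topic header's verbose flag.
shm = shared_memory.SharedMemory(create=True, size=1)
shm.buf[0] = 0  # logging disabled by default
log: list = []

def send(payload: bytes) -> None:
    # Hot path: a single-byte check, effectively free while the flag is off.
    if shm.buf[0]:
        log.append(f"[debug] send {len(payload)} bytes")

send(b"odom")   # flag off: nothing logged
shm.buf[0] = 1  # the monitor flips the flag (Enter in the Topics tab)
send(b"odom")   # flag on: one debug entry
shm.buf[0] = 0  # Esc clears the flag again

print(log)  # -> ['[debug] send 4 bytes']

shm.close()
shm.unlink()
```

In the real system the monitor flips the byte through its own mapping of the topic header, so the application never needs to be told.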
Performance Overhead
The monitor is designed to be always-on during development with negligible impact on your application.
| Component | Overhead | Notes |
|---|---|---|
| SHM metric reads | Sub-microsecond | mmap pointer dereference, hits L1/L2 cache |
| /proc node scan | ~1 ms per scan | Runs at 4 Hz in the monitor process, not in your application |
| HTTP server (Axum) | Own thread | Separate OS thread, does not compete with RT node threads |
| WebSocket push | 4 Hz (250ms) | JSON serialization of node/topic snapshots, ~2-5 KB per push |
| TUI redraw | 4 Hz (250ms) | Terminal write in monitor process only, event polling at 10 Hz |
| Topic debug logging | ~100ns per send()/recv() | Only when verbose is enabled on a specific topic. Writes one log entry per call to the global log buffer |
| Parameter reads | ~50ns | Lock-free atomic reads from RuntimeParams SHM |
| Total on application | Less than 0.1% CPU | Monitor runs as a separate process. The only cost inside your application is the metric writes the scheduler already does |
The scheduler writes NodeMetrics regardless of whether the monitor is running -- this data is used internally for deadline enforcement and watchdog detection. Running horus monitor adds zero overhead to your application's hot path. The monitor process itself typically uses 1-3% of a single CPU core.
When does overhead increase?
- Topic debug logging: Enabling verbose mode on a high-frequency topic (e.g., 1 kHz IMU) adds log entries at that rate. Each entry is ~100ns, but at 1 kHz that is 100 microseconds per second -- still negligible, but visible in profiling.
- Many WebSocket clients: Each connected browser receives the full snapshot. With 10+ simultaneous browser tabs, JSON serialization time may reach ~1ms per push cycle.
- Parameter writes: Setting parameters from the web UI triggers a write to the params SHM. This is a one-time cost per edit, not a recurring overhead.
Network Access
The monitor binds to all network interfaces (0.0.0.0), so you can access it from:
- Same machine: http://localhost:3000
- Any device on the network: http://<your-ip>:3000
Always set a password when the monitor is network-accessible.
Security
The monitor supports password-based authentication for networked deployments.
Setup
On first run, set a password (or press Enter to skip authentication):
horus monitor
[SECURITY] HORUS Monitor - First Time Setup
Password: ********
Confirm password: ********
[SUCCESS] Password set successfully!
Reset password anytime:
horus monitor --reset-password
How Authentication Works
When a password is set:
- The web UI shows a login page before granting access
- All API endpoints require a valid session token (except `/api/login`)
- Sessions expire after 1 hour of inactivity
- Failed login attempts are rate-limited
When no password is set (Enter pressed at setup):
- All endpoints are accessible without authentication
- Suitable for local development only
API Authentication
# Login — returns a session token
curl -X POST http://localhost:3000/api/login \
-H "Content-Type: application/json" \
-d '{"password": "your_password"}'
# Returns: {"token": "abc123..."}
# Use token for API requests
curl http://localhost:3000/api/nodes \
-H "Authorization: Bearer abc123..."
# Logout
curl -X POST http://localhost:3000/api/logout \
-H "Authorization: Bearer abc123..."
Security Details
| Feature | Value |
|---|---|
| Password hashing | Argon2id |
| Session timeout | 1 hour inactivity |
| Rate limiting | 5 attempts per 60 seconds |
| Token size | 256-bit random (base64-encoded) |
The password hash is stored at `~/.horus/dashboard_password.hash`.
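The 256-bit token in the table corresponds to 32 bytes of CSPRNG output before encoding. A sketch of equivalent token generation (the monitor's exact encoding may differ):

```python
import base64
import secrets

raw = secrets.token_bytes(32)                   # 256 bits of CSPRNG output
token = base64.urlsafe_b64encode(raw).decode()  # base64, safe for headers/URLs

print(len(raw) * 8)  # -> 256
print(len(token))    # -> 44 (32 bytes encode to 44 base64 characters)
```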
For production deployments, consider placing a reverse proxy with TLS (e.g., nginx) in front of the monitor.
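The rate limit in the table above (5 attempts per 60 seconds) matches a standard sliding-window pattern. A minimal sketch, not the monitor's actual implementation:

```python
from collections import deque

class LoginRateLimiter:
    """Allow at most `limit` attempts within a sliding `window` (seconds)."""

    def __init__(self, limit: int = 5, window: float = 60.0) -> None:
        self.limit = limit
        self.window = window
        self.attempts: deque = deque()

    def allow(self, now: float) -> bool:
        # Drop attempts that have aged out of the window.
        while self.attempts and now - self.attempts[0] >= self.window:
            self.attempts.popleft()
        if len(self.attempts) >= self.limit:
            return False
        self.attempts.append(now)
        return True

limiter = LoginRateLimiter()
results = [limiter.allow(t) for t in (0, 1, 2, 3, 4, 5)]
print(results)            # five allowed, sixth rejected within the window
print(limiter.allow(65))  # window has slid past t=0..4, so allowed again
```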
Recovery
If locked out:
# Option 1: Reset via CLI
horus monitor --reset-password
# Option 2: Delete the password hash file
rm ~/.horus/dashboard_password.hash
horus monitor # Re-prompts for password setup
API Endpoints
The monitor exposes a REST API (authenticated when a password is set):
| Endpoint | Method | Description |
|---|---|---|
| /api/status | GET | System health status |
| /api/nodes | GET | Running nodes info |
| /api/topics | GET | Active topics |
| /api/graph | GET | Node-topic graph |
| /api/network | GET | Network connections |
| /api/logs/all | GET | All logs |
| /api/logs/node/:name | GET | Logs for a specific node |
| /api/logs/topic/:name | GET | Logs for a specific topic |
| /api/params | GET | List parameters |
| /api/params/:key | GET/POST/DELETE | Get/set/delete a parameter |
| /api/params/export | POST | Export all parameters |
| /api/params/import | POST | Import parameters |
| /api/packages/registry | GET | Search packages |
| /api/packages/install | POST | Install a package |
| /api/packages/uninstall | POST | Uninstall a package |
| /api/recordings | GET | List recordings |
| /api/login | POST | Authenticate |
| /api/logout | POST | End session |
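A small consumer sketch for `/api/nodes`. The endpoint's exact JSON schema is not specified on this page, so the field names below (`name`, `deadline_misses`) are assumptions to adjust against the real payload:

```python
import json

# In practice: fetch http://localhost:3000/api/nodes with the Bearer token.
# Here we parse a hand-written sample shaped like an assumed response.
sample = json.loads("""
[
  {"name": "imu_driver",   "deadline_misses": 0},
  {"name": "slam_engine",  "deadline_misses": 3},
  {"name": "motor_driver", "deadline_misses": 0}
]
""")

def nodes_missing_deadlines(snapshot: list) -> list:
    """Return names of nodes that have missed at least one deadline."""
    return [n["name"] for n in snapshot if n.get("deadline_misses", 0) > 0]

print(nodes_missing_deadlines(sample))  # -> ['slam_engine']
```

This kind of script is handy in CI: poll the endpoint after a smoke run and fail the job if any node missed deadlines.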
Common Scenarios
Debugging Message Flow
"My subscriber isn't getting messages"
- Open Monitor tab, switch to Graph View
- Is there an arrow from publisher -> topic -> subscriber?
- If not: verify the topic names match exactly and that both nodes are actually running
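The same Graph View check can be scripted against `/api/graph`. The response schema is not documented here, so this sketch assumes a simple edge-list shape with hypothetical field names (`publishes`, `subscribes`):

```python
# Assumed shape for an /api/graph snapshot (hypothetical field names).
graph = {
    "publishes":  [("imu_driver", "sensors.imu")],   # (node, topic) edges
    "subscribes": [("slam_engine", "sensors.imu")],  # (node, topic) edges
}

def has_path(graph: dict, publisher: str, topic: str, subscriber: str) -> bool:
    """True if publisher -> topic -> subscriber is fully wired."""
    return ((publisher, topic) in graph["publishes"]
            and (subscriber, topic) in graph["subscribes"])

print(has_path(graph, "imu_driver", "sensors.imu", "slam_engine"))  # -> True
print(has_path(graph, "imu_driver", "sensors.imu", "planner"))      # -> False
```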
"The robot is running slow"
- Check nodes list for high CPU usage
- Check tick rates — which node can't keep up?
- Use logs endpoint to check for slow tick warnings
Live Parameter Tuning
Tuning PID controller:
- Open Parameters tab
- Search for `pid`
- Edit `pid.kp` value -- the change applies instantly
- Watch robot behavior, adjust until optimal
- Export final values with the Export button
Common Errors
| Symptom | Cause | Fix |
|---|---|---|
| Monitor shows nothing | HORUS application not running | Start your app first with horus run |
| Cannot access from another device | Devices on different networks or firewall blocking | Ensure both devices are on the same network and allow the port through the firewall (e.g., sudo ufw allow 3000) |
| Port already in use | Another monitor or process on port 3000 | Use a different port: horus monitor 8080 |
| Locked out of password | Forgotten password | Run horus monitor --reset-password or delete ~/.horus/dashboard_password.hash |
| TUI rendering broken | Terminal does not support 256 colors | Use a modern terminal (kitty, alacritty, wezterm) or try TERM=xterm-256color horus monitor --tui |
| API returns 401 Unauthorized | Session expired or invalid token | Re-authenticate via /api/login endpoint |
Design Decisions
Why SHM-based, not network-based
Traditional monitoring tools (e.g., Prometheus exporters, ROS2 introspection) add network hops to collect metrics. HORUS chose shared memory because:
- **Zero overhead when not monitoring.** The scheduler writes `NodeMetrics` to SHM regardless -- it uses the same data for deadline enforcement. No extra serialization, no sockets, no packets.
- **No configuration.** The monitor auto-discovers running nodes by scanning `/proc` and the SHM topics directory. No need to configure exporters, ports, or scrape endpoints.
- **Works offline.** SHM monitoring works without any network stack, which matters on embedded systems and in containers without network access.
If you need to send metrics to an external system (Grafana, Prometheus, Datadog), use Telemetry Export -- it reads the same SHM data and forwards it over the network.
Why both Web and TUI
The monitor provides two interfaces because robots are developed and deployed in different environments:
- Web UI is for development machines with a browser available. It supports the interactive graph view, drag-and-drop parameter files, and multiple team members can open it simultaneously from different devices on the network.
- TUI is for SSH sessions into headless robots, CI environments, and embedded systems. It requires only a terminal -- no browser, no X11, no port forwarding. The TUI has feature parity with the web UI for reading data (nodes, topics, params, logs) and additionally supports topic debug logging via mmap.
Both interfaces read from the same SHM data source. You can run them simultaneously -- horus monitor in one terminal for the web UI and horus monitor --tui in another for the TUI.
Why password auth, not tokens or mTLS
The monitor uses simple password-based sessions (Argon2id hashing, 256-bit random session tokens) instead of API keys, OAuth, or mutual TLS:
- Single-user tool. The monitor is typically accessed by one developer or a small team on a local network. OAuth and mTLS add complexity with no benefit.
- No external dependencies. Password hashing is done locally with Argon2id. No identity provider, no certificate authority, no token refresh flows.
- Quick setup. First run prompts for a password. No config files, no key generation, no certificate management.
For production deployments exposed to the internet, place a reverse proxy with TLS (e.g., nginx) in front of the monitor rather than building TLS into the monitor itself.
Why read-only access to SHM
The monitor opens SHM files with read-only mmap. It cannot modify topic buffers, corrupt node state, or interfere with the scheduler. The single exception is the topic verbose flag (one byte per topic), which enables debug logging. This is intentional:
- Safety. A monitoring tool must never be able to crash or corrupt the system it observes. Read-only mmap enforces this at the OS level.
- No synchronization needed. The monitor reads atomic counters and ring buffer headers that the scheduler writes. No locks, no contention, no priority inversion risk for RT nodes.
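One common way to make such lock-free reads consistent is a seqlock-style sequence counter. Whether HORUS uses exactly this scheme is not stated here, so treat the following as a sketch of the general pattern:

```python
# Seqlock-style consistent read of writer-updated metrics (general pattern
# sketch; field names and scheme are illustrative, not HORUS internals).
seq = 0
metrics = {"tick": 0, "duration_us": 0}

def write(tick: int, duration_us: int) -> None:
    global seq
    seq += 1  # odd: update in progress
    metrics["tick"] = tick
    metrics["duration_us"] = duration_us
    seq += 1  # even: update complete

def read() -> dict:
    while True:
        before = seq
        snapshot = dict(metrics)
        # Retry if a write started or finished while we were copying.
        if before == seq and before % 2 == 0:
            return snapshot

write(142800, 400)
print(read())  # -> {'tick': 142800, 'duration_us': 400}
```

The writer never blocks and the reader never takes a lock, so a slow monitor cannot cause priority inversion in RT nodes.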
Trade-offs
| Choice | Benefit | Cost |
|---|---|---|
| SHM-based monitoring | Zero overhead, no network config, works offline | Only works on the same machine (use Telemetry Export for remote) |
| Separate monitor process | Cannot crash the application, clean resource isolation | Must be started separately (horus monitor), adds one process |
| Web + TUI dual interface | Works everywhere: browser, SSH, headless, embedded | Two codebases to maintain, feature parity requires discipline |
| Password auth (not mTLS) | Simple setup, no PKI infrastructure needed | No per-user access control, no audit trail beyond rate limiting |
| Read-only SHM access | Cannot corrupt application state, no lock contention | Cannot inject test data or modify parameters from the SHM side (uses params API instead) |
| 4 Hz refresh rate | Smooth real-time feel without CPU waste | Cannot capture sub-250ms transient events (use BlackBox or topic debug logging for those) |
| /proc scan for node discovery | Works without any registration protocol | Linux-specific, ~1 ms per scan, may show stale entries briefly after a node crash |
See Also
- CLI Reference — Full `horus monitor` command options
- Parameters Guide — Runtime parameter management in detail
- Debugging Workflows — Step-by-step diagnosis for deadline misses, panics, and slowdowns
- Telemetry Export — Export metrics to external systems (Grafana, Prometheus)
- Operations — Production monitoring and deployment