AudioFrame

Audio data from a microphone or audio source. Fixed-size Pod type for zero-copy shared memory transport. Supports mono, stereo, and multi-channel microphone arrays.

When to Use

Use AudioFrame when your robot has microphones and needs to share audio between nodes -- for example, between a microphone driver node, a speech recognition node, and an anomaly detection node.

Common use cases:

  • Voice commands -- speech-to-text for human-robot interaction
  • Anomaly detection -- motor fault detection by sound
  • Acoustic SLAM -- using sound for localization
  • Teleoperation -- two-way audio between operator and robot

ROS2 Equivalent

audio_common_msgs/AudioData -- similar concept, but HORUS uses a fixed-size Pod buffer for zero-copy SHM instead of variable-length serialized bytes.

Quick Start

Rust

use horus::prelude::*;

// Publish audio from a microphone
let topic: Topic<AudioFrame> = Topic::new("mic")?;
let samples: Vec<f32> = capture_audio(); // your mic driver
let frame = AudioFrame::mono(16000, &samples);
topic.send(frame);

// Receive and process
let frame = topic.recv().unwrap();
println!("Got {} samples at {}Hz, {:.1}ms",
    frame.num_samples, frame.sample_rate, frame.duration_ms());

Python

import horus

def process_audio(node):
    frame = node.get("mic")
    if frame:
        samples = frame.samples        # list of floats
        rate = frame.sample_rate        # e.g. 16000
        duration = frame.duration_ms    # e.g. 10.0

        # Feed to speech recognition (illustrative; adapt to your ASR library's API)
        text = whisper.transcribe(samples, sr=rate)

node = horus.Node("speech", subs=["mic"], tick=process_audio, rate=100)
horus.run(node)

Constructors

Rust

Constructor                                                 Description
AudioFrame::mono(sample_rate, &samples)                     Single-channel audio
AudioFrame::stereo(sample_rate, &samples)                   Interleaved stereo (L R L R...)
AudioFrame::multi_channel(sample_rate, channels, &samples)  Microphone arrays (4, 8, 16 mics)

Python

# Mono microphone at 16kHz
frame = horus.AudioFrame(sample_rate=16000, samples=[0.1, -0.2, 0.3])

# Stereo at 48kHz
frame = horus.AudioFrame(sample_rate=48000, channels=2, samples=interleaved)

# 4-channel mic array
frame = horus.AudioFrame(sample_rate=16000, channels=4, samples=array_data)

# With metadata
frame = horus.AudioFrame(
    sample_rate=16000,
    samples=data,
    frame_id="mic_left",
    timestamp_ns=horus.timestamp_ns()
)

Fields

Field         Type         Unit  Description
samples       [f32; 4800]  --    Audio sample buffer (Rust); Python exposes list[float] of only the valid samples. Range: [-1.0, 1.0] (F32)
num_samples   u32          --    Number of valid samples in buffer
sample_rate   u32          Hz    Sample rate (8000, 16000, 44100, 48000)
channels      u8           --    Channel count (1=mono, 2=stereo, N=mic array)
encoding      u8           --    Audio encoding (0=F32, 1=I16)
timestamp_ns  u64          ns    Capture timestamp in nanoseconds
frame_id      [u8; 32]     --    Source identifier (e.g. "mic_left")

Computed Properties

Property         Type    Description
duration_ms()    f64     Duration of this audio chunk in milliseconds
frame_count()    u32     Number of audio frames (samples per channel)
valid_samples()  &[f32]  Slice of only the valid samples (Rust)

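The computed properties follow directly from the stored fields. A minimal Python sketch of the arithmetic (the formulas only, not the horus API; `frame_count` and `duration_ms` here are standalone illustrations):

```python
def frame_count(num_samples, channels):
    # One "frame" is one sample per channel, so divide out the interleaving
    return num_samples // channels

def duration_ms(num_samples, channels, sample_rate):
    # Each frame advances time by 1/sample_rate seconds
    return frame_count(num_samples, channels) / sample_rate * 1000.0

print(duration_ms(320, 1, 16000))   # 20.0 -- 320 mono samples at 16 kHz
print(frame_count(4800, 2))         # 2400 stereo frames
```
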
Buffer Size

MAX_AUDIO_SAMPLES = 4800 -- enough for 48kHz at 100ms chunks. For common configurations:

Sample Rate   Chunk Duration  Samples Needed  Fits?
8kHz          100ms           800             Yes
16kHz         20ms            320             Yes
16kHz         100ms           1600            Yes
44.1kHz       20ms            882             Yes
48kHz         100ms           4800            Yes (max)
48kHz stereo  50ms            4800            Yes (max)

For longer chunks, send multiple frames.
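
Splitting a long recording into buffer-sized pieces is plain list slicing; a hedged sketch (`chunk_samples` is a hypothetical helper, not part of horus):

```python
MAX_AUDIO_SAMPLES = 4800

def chunk_samples(samples, max_samples=MAX_AUDIO_SAMPLES):
    # Split a long recording into buffer-sized pieces, one AudioFrame each.
    # For multi-channel audio, keep max_samples a multiple of the channel
    # count so no interleaved frame is split across two AudioFrames.
    return [samples[i:i + max_samples] for i in range(0, len(samples), max_samples)]

one_second = [0.0] * 48000          # 1 s of 48 kHz mono audio
chunks = chunk_samples(one_second)
print(len(chunks), len(chunks[0]))  # 10 4800 -- ten 100 ms frames
```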

Multi-Channel Audio

For microphone arrays, samples are interleaved: channel 0 sample 0, channel 1 sample 0, channel 0 sample 1, channel 1 sample 1, etc.

// 4-channel mic array, 16kHz, 10ms chunk = 640 samples
let samples = capture_4ch_audio(); // [ch0_s0, ch1_s0, ch2_s0, ch3_s0, ch0_s1, ...]
let frame = AudioFrame::multi_channel(16000, 4, &samples);
assert_eq!(frame.frame_count(), 160); // 640 / 4 channels
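
On the receiving side you often want per-channel streams back. De-interleaving is a stride slice; a sketch in plain Python (`deinterleave` is a hypothetical helper, not a horus API):

```python
def deinterleave(samples, channels):
    # Channel k's stream is every `channels`-th sample, starting at offset k
    return [samples[k::channels] for k in range(channels)]

interleaved = [0, 10, 20, 30,    # frame 0: ch0, ch1, ch2, ch3
               1, 11, 21, 31]    # frame 1
print(deinterleave(interleaved, 4))  # [[0, 1], [10, 11], [20, 21], [30, 31]]
```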

AudioEncoding

The encoding format for audio samples in the buffer.

Variant  Value  Description
F32      0      32-bit float, range [-1.0, 1.0] (normalized)
I16      1      16-bit signed integer, range [-32768, 32767] (PCM)

use horus::prelude::*;

// Float encoding (default, best for processing)
let frame = AudioFrame::mono(16000, &float_samples);
assert_eq!(frame.encoding, AudioEncoding::F32 as u8);

// Integer encoding (common for hardware capture)
let mut frame = AudioFrame::default();
frame.encoding = AudioEncoding::I16 as u8;
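
Converting between the two encodings is the standard PCM scaling. A sketch of one common convention (symmetric scaling by 32767 with clamping; this is general audio practice, not a documented horus helper):

```python
def f32_to_i16(x):
    # Scale normalized float to 16-bit PCM, clamping to the valid range
    return max(-32768, min(32767, int(round(x * 32767))))

def i16_to_f32(n):
    # Inverse scaling back to [-1.0, 1.0]
    return n / 32767.0

print(f32_to_i16(1.0))     # 32767
print(i16_to_f32(-32767))  # -1.0
```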

Wire Format

AudioFrame is a fixed-size Pod type (~19.2 KB). It uses the same zero-copy SHM transport as all other Pod messages -- no serialization overhead.

[f32 x 4800] samples     = 19200 bytes
[u32] num_samples        =     4 bytes
[u32] sample_rate        =     4 bytes
[u8]  channels           =     1 byte
[u8]  encoding           =     1 byte
[u8 x 2] padding         =     2 bytes
[u64] timestamp_ns       =     8 bytes
[u8 x 32] frame_id       =    32 bytes
Total                    = 19252 bytes
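
The byte arithmetic above can be cross-checked with Python's struct module. This is an illustration of the documented field order and explicit 2-byte pad, using "=" to disable implicit alignment (the actual Rust layout is defined by the horus type, not by this sketch):

```python
import struct

# 4800 f32 samples, two u32, two u8, 2 pad bytes, u64 timestamp, 32-byte frame_id
layout = "=4800fIIBB2xQ32s"
print(struct.calcsize(layout))  # 19252
```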