Tensor

A lightweight tensor descriptor for zero-copy ML data sharing across nodes and processes.

use horus::prelude::*;

Overview

Tensor is a lightweight descriptor that references data in shared memory. Only the descriptor is transmitted through topics — the actual tensor data stays in place, enabling zero-copy transport for large ML payloads.

You typically obtain a Tensor in one of two ways:

  1. Allocate from a TensorPool (via Topic<Tensor> or a manual pool) -- see TensorPool API
  2. Receive from a Topic -- another node sends it, you read the descriptor and access the backing data

Methods

| Method | Return Type | Description |
| --- | --- | --- |
| shape() | &[u64] | Tensor dimensions (e.g., [1080, 1920, 3]) |
| strides() | &[u64] | Byte strides per dimension |
| numel() | u64 | Total number of elements |
| nbytes() | u64 | Total size in bytes (numel * dtype.element_size()) |
| dtype() | TensorDtype | Element data type |
| device() | Device | Device location (CPU or CUDA) |
| is_cpu() | bool | True if data resides on CPU / shared memory |
| is_cuda() | bool | True if device descriptor is set to CUDA |
| is_contiguous() | bool | True if memory layout is C-contiguous |
| view(new_shape) | Option<Self> | Reshape without copying (fails if not contiguous or element count changes) |
| slice_first_dim(start, end) | Option<Self> | Slice along the first dimension, adjusting strides |
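The arithmetic behind strides(), numel(), and nbytes() can be sketched in pure Python. This is an illustrative model of C-contiguous layout, not the library's implementation:

```python
def c_contiguous_strides(shape, element_size):
    """Byte strides for a C-contiguous layout: the last dimension is densest."""
    strides = [0] * len(shape)
    acc = element_size
    for i in reversed(range(len(shape))):
        strides[i] = acc
        acc *= shape[i]
    return strides

def numel(shape):
    """Total number of elements: the product of all dimensions."""
    n = 1
    for d in shape:
        n *= d
    return n

# A 1080x1920 RGB image of U8 pixels (element_size = 1)
shape = [1080, 1920, 3]
print(c_contiguous_strides(shape, 1))  # [5760, 3, 1]
print(numel(shape))                    # 6220800
print(numel(shape) * 1)                # nbytes = numel * element_size
```

Advancing one row skips a full 1920 * 3 = 5760 bytes, one column skips 3 bytes, and one channel skips 1 byte, which is why the strides read [5760, 3, 1].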

Reshape and Slice

let topic: Topic<Tensor> = Topic::new("model.input")?;

if let Some(handle) = topic.recv_handle() {
    // Reshape a flat 1D tensor into a batch of images
    if let Some(reshaped) = handle.view(&[4, 3, 224, 224]) {
        println!("Batch shape: {:?}", reshaped.tensor().shape()); // [4, 3, 224, 224]
    }

    // Take the first 2 items from a batch
    if let Some(sliced) = handle.slice_first_dim(0, 2) {
        println!("Sliced shape: {:?}", sliced.tensor().shape()); // [2, 3, 224, 224]
    }
}
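The shape rules that view and slice_first_dim enforce can be modeled in a few lines of pure Python. This sketch only checks shapes (the real view also requires a contiguous layout, and slice_first_dim also adjusts the data offset):

```python
def view(shape, new_shape):
    """Reshape succeeds only when the total element count is unchanged."""
    def numel(s):
        n = 1
        for d in s:
            n *= d
        return n
    return list(new_shape) if numel(shape) == numel(new_shape) else None

def slice_first_dim(shape, start, end):
    """Slicing along dim 0 changes only shape[0]; per-dimension strides stay."""
    if not (0 <= start <= end <= shape[0]):
        return None
    return [end - start] + list(shape[1:])

flat = [4 * 3 * 224 * 224]
print(view(flat, [4, 3, 224, 224]))             # [4, 3, 224, 224]
print(view(flat, [5, 3, 224, 224]))             # None -- element count differs
print(slice_first_dim([4, 3, 224, 224], 0, 2))  # [2, 3, 224, 224]
```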

TensorDtype

Supported element types with sizes and common use cases:

| Dtype | Size | Use Case |
| --- | --- | --- |
| F32 | 4 bytes | ML training and inference |
| F64 | 8 bytes | High-precision computation |
| F16 | 2 bytes | Memory-efficient inference |
| BF16 | 2 bytes | Training on modern GPUs |
| I8 | 1 byte | Quantized inference |
| I16 | 2 bytes | Audio, sensor data |
| I32 | 4 bytes | General integer |
| I64 | 8 bytes | Large signed values |
| U8 | 1 byte | Images |
| U16 | 2 bytes | Depth sensors (mm) |
| U32 | 4 bytes | Large indices |
| U64 | 8 bytes | Counters, timestamps |
| Bool | 1 byte | Masks |
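Element size determines a tensor's total footprint via nbytes = numel * element_size. The table above can be expressed as a simple mapping to make that arithmetic concrete (an illustrative sketch, keyed by the lowercase display names):

```python
# Element sizes in bytes, keyed by the dtype's lowercase string form
ELEMENT_SIZE = {
    "float32": 4, "float64": 8, "float16": 2, "bfloat16": 2,
    "int8": 1, "int16": 2, "int32": 4, "int64": 8,
    "uint8": 1, "uint16": 2, "uint32": 4, "uint64": 8,
    "bool": 1,
}

def nbytes(shape, dtype):
    """Total byte size: product of dimensions times the element size."""
    n = 1
    for d in shape:
        n *= d
    return n * ELEMENT_SIZE[dtype]

# The same 480x640 RGB frame costs 4x more as float32 than as uint8
print(nbytes([480, 640, 3], "uint8"))    # 921600
print(nbytes([480, 640, 3], "float32"))  # 3686400
```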

TensorDtype Methods

let dtype = TensorDtype::F32;

// Size in bytes
assert_eq!(dtype.element_size(), 4);

// Display (lowercase string representation)
println!("{}", dtype); // "float32"

// Parse from string — accepts common aliases
let parsed = TensorDtype::parse("float32").unwrap(); // F32
let parsed = TensorDtype::parse("f16").unwrap();     // F16
let parsed = TensorDtype::parse("uint8").unwrap();   // U8
let parsed = TensorDtype::parse("bool").unwrap();     // Bool

Device

Pod-safe device descriptor supporting CPU and CUDA device tags. Device is metadata only — Device::cuda(N) tags a tensor with a device target but does not allocate GPU memory (GPU tensor pools are not yet implemented).

// Constructors
let cpu = Device::cpu();
let gpu0 = Device::cuda(0);  // Descriptor only — no GPU allocation

// Check device type
assert!(cpu.is_cpu());
assert!(gpu0.is_cuda());

// Display
println!("{}", gpu0); // "cuda:0"

// Parse from string
let dev = Device::parse("cpu").unwrap();
let dev = Device::parse("cuda:0").unwrap();
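The accepted device-string grammar is small: "cpu", or "cuda:" followed by a device index. A pure-Python sketch of that grammar (illustrative only; the real parsing lives in Rust's Device::parse):

```python
def parse_device(s):
    """Parse "cpu" or "cuda:N" into a (kind, index) pair, else None."""
    if s == "cpu":
        return ("cpu", 0)
    if s.startswith("cuda:"):
        idx = s[len("cuda:"):]
        if idx.isdigit():
            return ("cuda", int(idx))
    return None

print(parse_device("cpu"))     # ('cpu', 0)
print(parse_device("cuda:0"))  # ('cuda', 0)
print(parse_device("tpu"))     # None -- unsupported device kind
```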

ML Pipeline Example

A complete example showing a camera node feeding frames to an inference node via Topic<Tensor>:

use horus::prelude::*;

// Producer: camera capture node
node! {
    CameraNode {
        pub { frames: Tensor -> "camera.rgb" }
        data { frame_count: u64 = 0 }

        tick {
            // Allocate from the topic's auto-managed pool
            if let Ok(mut handle) = self.frames.alloc_tensor(
                &[480, 640, 3],
                TensorDtype::U8,
                Device::cpu(),
            ) {
                let pixels = handle.data_slice_mut().unwrap();
                // ... fill pixel data from camera driver ...

                self.frames.send_handle(&handle);
                self.frame_count += 1;
            }
        }
    }
}

// Consumer: inference node
node! {
    InferenceNode {
        sub { frames: Tensor -> "camera.rgb" }
        pub { detections: GenericMessage -> "model.detections" }

        tick {
            if let Some(handle) = self.frames.recv_handle() {
                let data = handle.data_slice().unwrap(); // Zero-copy access
                let shape = handle.shape();
                hlog!(debug, "Frame: {:?}, {} bytes", shape, handle.nbytes());

                // Run inference, publish results ...
            }
        }
    }
}

Python API

PyTensorHandle provides seamless interop with NumPy, PyTorch, and JAX.

Creating Tensors

import horus
import numpy as np

# From NumPy (zero-copy when possible)
arr = np.zeros((480, 640, 3), dtype=np.float32)
handle = horus.PyTensorHandle.from_numpy(arr)

# From PyTorch
import torch
t = torch.randn(1, 3, 224, 224)
handle = horus.PyTensorHandle.from_torch(t)

Converting Back

# To NumPy (zero-copy)
arr = handle.to_numpy()

# To PyTorch
tensor = handle.to_torch()

# To JAX
jax_arr = handle.to_jax()

Properties and Methods

| Property / Method | Description |
| --- | --- |
| shape | Tuple of dimensions |
| dtype | Data type string (e.g., "float32") |
| device | Device descriptor string (e.g., "cpu") |
| is_contiguous() | Check C-contiguous layout |
| view(new_shape) | Reshape without copying |
| slice(start, end) | Slice along first dimension |

Python ML Pipeline

import horus
import numpy as np
import torch

# Subscribe to camera frames
topic = horus.Topic("camera.rgb", horus.Tensor)

while True:
    handle = topic.recv_handle()
    if handle is not None:
        # Zero-copy to NumPy for processing
        frame = handle.to_numpy()
        print(f"Frame shape: {frame.shape}, dtype: {frame.dtype}")

        # Or pass directly to a PyTorch model
        tensor = handle.to_torch()
        # output = model(tensor.unsqueeze(0))

See Also