§5.4 — Multi-Modal Payloads v1.6

RCAN v1.5 and earlier supported only JSON message payloads, so binary data (images, audio, video) had to be transported outside the RCAN trust boundary, which broke the signed-message audit trail. v1.6 adds a media_chunks[] field to the message envelope so that binary data is hashed and bound to the Ed25519 signature.

Audit trail integrity: The signed RCAN message proves what data was sent via the hash_sha256 on each chunk. The actual bytes may be stored separately (reference mode) or inline (base64 mode). Either way, the hash in the commitment record is tamper-evident.

media_chunks[] Field Schema

The media_chunks field is an optional array on RCANMessage (top-level envelope, not inside payload). It may contain 1–N chunks.

// RCANMessage envelope extension (v1.6)
{
  "id":           "uuid",
  "type":         10,        // e.g. TRAINING_DATA
  "payload":      { ... },
  ...
  "media_chunks": [          // NEW in v1.6
    {
      "chunk_id":    "uuid-v4",          // Required: unique ID for this chunk
      "mime_type":   "image/jpeg",       // Required: RFC 2046 MIME type
      "encoding":    "base64",           // Required: "base64" | "ref"
      "hash_sha256": "sha256:4a3f...",   // Required: SHA-256 of the raw binary data
      "size_bytes":  143872,             // Required: byte count of raw binary data
      "data_b64":    "...",              // Required if encoding=="base64"
      "ref_url":     null,               // Required if encoding=="ref"; null otherwise
      "ref_expires": null                // Unix timestamp: when ref_url becomes invalid
    }
  ]
}
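
The schema above can be populated with nothing beyond the Python standard library. The following is a minimal sketch, assuming inline (base64) mode; the helper name build_inline_chunk is hypothetical and is not part of rcan-py.

```python
import base64
import hashlib
import uuid


def build_inline_chunk(raw: bytes, mime_type: str) -> dict:
    """Build one media_chunks[] entry in inline (base64) mode.

    Hypothetical helper for illustration only.
    """
    return {
        "chunk_id": str(uuid.uuid4()),
        "mime_type": mime_type,
        "encoding": "base64",
        # Hash and size are taken from the raw bytes, not the base64 text.
        "hash_sha256": "sha256:" + hashlib.sha256(raw).hexdigest(),
        "size_bytes": len(raw),
        "data_b64": base64.b64encode(raw).decode("ascii"),
        # Reference-mode fields stay null for inline chunks.
        "ref_url": None,
        "ref_expires": None,
    }
```

The resulting dict can be appended to the media_chunks array of the envelope before the message is signed.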

Field Reference

Field       | Required  | Type                | Description
------------|-----------|---------------------|------------
chunk_id    | yes       | UUID string         | Unique identifier for this media chunk. Referenced in CommitmentRecord.
mime_type   | yes       | string              | RFC 2046 MIME type (e.g. image/jpeg, audio/wav, video/mp4).
encoding    | yes       | "base64" or "ref"   | How the data is delivered: inline or by reference URL.
hash_sha256 | yes       | string (sha256:...) | SHA-256 of the raw bytes (before base64). MUST be verified by receiver before use.
size_bytes  | yes       | integer             | Size of the raw binary data in bytes.
data_b64    | if base64 | string              | Base64-encoded binary data. Only present when encoding=="base64".
ref_url     | if ref    | string (URL)        | Signed URL to fetch the data. Only present when encoding=="ref".
ref_expires | if ref    | integer             | Unix timestamp when ref_url expires. Default TTL: 300s (5 min).
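
The required and conditional fields listed above lend themselves to a small structural check. This is a sketch under the assumption that a missing field and a null field are treated alike; the function name chunk_schema_errors and the error strings are illustrative, not defined by the spec.

```python
ALWAYS_REQUIRED = ("chunk_id", "mime_type", "encoding", "hash_sha256", "size_bytes")


def chunk_schema_errors(chunk: dict) -> list:
    """Return a list of human-readable schema problems (empty if none)."""
    errors = [f"missing {name}" for name in ALWAYS_REQUIRED if chunk.get(name) is None]

    encoding = chunk.get("encoding")
    if encoding == "base64":
        if chunk.get("data_b64") is None:
            errors.append("data_b64 required when encoding is base64")
    elif encoding == "ref":
        for name in ("ref_url", "ref_expires"):
            if chunk.get(name) is None:
                errors.append(f"{name} required when encoding is ref")
    elif encoding is not None:
        errors.append(f"unknown encoding: {encoding!r}")
    return errors
```

A structural check like this does not replace hash verification; it only confirms that the fields needed for verification are present.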

Inline Mode (Base64)

Use inline mode for small binary payloads. The data is embedded directly in the RCAN message JSON as a base64 string.

Size Limit

  • Inline mode MUST NOT be used for payloads exceeding 64 KB (65,536 bytes raw data).
  • Base64 encoding adds about 33% overhead: a 64 KB binary becomes about 85.3 KB (87,384 characters) in the JSON.
  • Receivers MUST reject inline chunks where size_bytes > 65536.
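
The arithmetic behind these limits can be checked directly. The sketch below computes the padded base64 length for a given raw size and picks an encoding from the 65,536-byte threshold; the function names are hypothetical.

```python
INLINE_MAX_BYTES = 65536  # 64 KB of raw data, per the limit above


def base64_length(raw_len: int) -> int:
    """Length of the padded base64 text for raw_len input bytes."""
    # Every 3 raw bytes become 4 characters; the last group is padded.
    return 4 * ((raw_len + 2) // 3)


def choose_encoding(raw_len: int) -> str:
    """Pick "base64" for payloads within the inline limit, else "ref"."""
    return "base64" if raw_len <= INLINE_MAX_BYTES else "ref"
```

For example, base64_length(65536) evaluates to 87384, which is where the roughly one-third overhead figure comes from.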

Example: Image Status Response

{
  "id":   "550e8400-...",
  "type": 3,           // STATUS
  "payload": {
    "mode": "active",
    "camera_frame": { "chunk_id": "abc-123", "note": "see media_chunks" }
  },
  "media_chunks": [
    {
      "chunk_id":    "abc-123",
      "mime_type":   "image/jpeg",
      "encoding":    "base64",
      "hash_sha256": "sha256:9a4f8c...",
      "size_bytes":  43520,
      "data_b64":    "/9j/4AAQSkZJRg..."   // Base64-encoded JPEG
    }
  ]
}

Receiver Verification

# Receiver MUST verify the hash before processing
import base64
import hashlib

raw_bytes = base64.b64decode(chunk["data_b64"])
computed = "sha256:" + hashlib.sha256(raw_bytes).hexdigest()
# Raise explicitly: assert statements are stripped when Python runs with -O
if computed != chunk["hash_sha256"]:
    raise ValueError("Media chunk hash mismatch: reject")

Reference Mode (ref_url)

Use reference mode for large payloads. The RCAN message carries only the hash and a signed URL; the actual bytes are fetched separately by the receiver.

Signed URL Generation

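The sending robot generates a signed URL by computing an HMAC-SHA256 over the request method, path, and expiry timestamp (the chunk bytes themselves are covered by hash_sha256, not by this signature):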

// Signed URL format
GET /api/v1/media/{chunk_id}?expires={unix_ts}&sig={hmac_hex}

// HMAC key: derived from robot's Ed25519 private key (first 32 bytes)
// HMAC message: "GET:/api/v1/media/{chunk_id}:{expires}"
// Default TTL: 300 seconds (5 minutes)

// Example ref_url:
"ref_url": "https://robot-042.local/api/v1/media/abc-123?expires=1741003900&sig=4a3f8c..."

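The signed URL format can be sketched with Python's hmac module. This assumes the caller has already derived the 32-byte HMAC key as the spec describes; the derivation itself is not shown, and the function name sign_media_url and its now parameter are hypothetical.

```python
import hashlib
import hmac
import time


def sign_media_url(base_url, chunk_id, hmac_key, ttl_seconds=300, now=None):
    """Return (ref_url, ref_expires) for one chunk.

    hmac_key is assumed to be the 32-byte key derived by the robot;
    now is an optional Unix time override, useful for testing.
    """
    expires = int(time.time() if now is None else now) + ttl_seconds
    path = f"/api/v1/media/{chunk_id}"
    # HMAC message format from the spec: "GET:{path}:{expires}"
    message = f"GET:{path}:{expires}".encode("utf-8")
    sig = hmac.new(hmac_key, message, hashlib.sha256).hexdigest()
    return f"{base_url}{path}?expires={expires}&sig={sig}", expires
```

The two return values map onto the ref_url and ref_expires fields of the chunk.
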
Size Limits by Transport

Transport    | Max Size (Reference Mode) | Notes
-------------|---------------------------|------
RCAN-HTTP    | 64 MB                     | HTTPS fetch; robot must serve the media endpoint
RCAN-Compact | 512 KB                    | Media must be fetched via HTTP; ref_url in compact message
RCAN-Minimal | Not supported             | No media chunks in 32-byte minimal frames
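
A sender can consult these limits before choosing a transport. The sketch below assumes binary units (1 MB = 1,048,576 bytes), consistent with the 64 KB = 65,536 bytes figure used for inline mode; the table constant and function name are hypothetical.

```python
# Reference-mode size limits per transport; None means media is unsupported.
REF_MODE_MAX_BYTES = {
    "RCAN-HTTP": 64 * 1024 * 1024,
    "RCAN-Compact": 512 * 1024,
    "RCAN-Minimal": None,
}


def ref_chunk_allowed(transport: str, size_bytes: int) -> bool:
    """True if a reference-mode chunk of this size fits the transport."""
    limit = REF_MODE_MAX_BYTES[transport]  # KeyError for unknown transports
    return limit is not None and size_bytes <= limit
```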

Robot Media Endpoint

// Required endpoint (v1.6 multi-modal robots):
GET /api/v1/media/{chunk_id}

// Query params:
//   expires  — Unix timestamp (required when sig present)
//   sig      — HMAC-SHA256 signature (required)

// Response:
HTTP 200 OK
Content-Type: image/jpeg          // mime_type from media_chunks entry
Content-Length: 43520
X-Chunk-Hash: sha256:9a4f8c...    // SHA-256 for receiver verification

[binary data]

// Error responses:
HTTP 404  chunk_id not found
HTTP 410  URL expired (expires < now)
HTTP 403  signature invalid
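
On the serving side, the status codes above could be selected as in the following sketch. The order of checks (existence, then signature, then expiry) is an assumption, as are the function name and the in-memory store; the spec lists the error responses without prescribing an evaluation order.

```python
import hashlib
import hmac
import time


def media_request_status(store, hmac_key, chunk_id, expires, sig, now=None):
    """Return the HTTP status for GET /api/v1/media/{chunk_id}.

    store is a hypothetical mapping of chunk_id -> raw bytes.
    """
    now = time.time() if now is None else now
    if chunk_id not in store:
        return 404  # chunk_id not found
    message = f"GET:/api/v1/media/{chunk_id}:{expires}".encode("utf-8")
    expected = hmac.new(hmac_key, message, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature prefixes via timing.
    if not hmac.compare_digest(expected.encode("ascii"), sig.encode("utf-8")):
        return 403  # signature invalid
    if expires < now:
        return 410  # URL expired
    return 200
```

Checking the signature before the expiry means a request with a tampered expires value is reported as 403 rather than 410.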

CommitmentRecord Extension

Every message with media_chunks MUST extend its CommitmentRecord to include the media hashes. This ensures the audit trail proves exactly what binary data was sent.

// CommitmentRecord v1.6 extension
{
  "commitment_id":   "uuid",
  "msg_id":          "uuid",
  "timestamp":       1741000000,
  "cmd":             "sensor_snapshot",
  "operator":        "user-uuid",
  "commitment_hash": "sha256:...",    // HMAC over text payload fields
  "media_hashes":    {              // NEW in v1.6 — included in HMAC chain
    "abc-123": "sha256:9a4f8c...",    // chunk_id → hash_sha256
    "def-456": "sha256:1b2c3d..."
  },
  "chain_hash":      "sha256:..."    // HMAC over (commitment_hash + media_hashes)
}

HMAC chain integrity: The chain_hash is computed over both the text payload commitment and the media_hashes dict. Omitting media_hashes from the HMAC chain MUST be treated as a commitment integrity violation.
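
One way to bind media_hashes into chain_hash is sketched below. The spec text here does not define the byte-level serialization or the HMAC key, so both are assumptions: media_hashes is serialized as compact JSON with sorted keys and appended to commitment_hash, and the key is passed in by the caller. The function name is hypothetical.

```python
import hashlib
import hmac
import json


def compute_chain_hash(key: bytes, commitment_hash: str, media_hashes: dict) -> str:
    """HMAC-SHA256 over commitment_hash plus a canonical form of media_hashes.

    Assumed serialization: compact JSON with sorted keys, so the result
    does not depend on dict insertion order.
    """
    canonical = json.dumps(media_hashes, sort_keys=True, separators=(",", ":"))
    message = (commitment_hash + canonical).encode("utf-8")
    return "sha256:" + hmac.new(key, message, hashlib.sha256).hexdigest()
```

With this construction, changing or removing any entry in media_hashes yields a different chain_hash, which is the property the integrity rule relies on.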

Streaming Mode

For continuous video streams (e.g. a robot camera feed), use SENSOR_DATA (type 7) messages that carry the streaming fields (stream_id, chunk_index, is_final) in the payload:

// SENSOR_DATA streaming message (v1.6)
{
  "id":          "uuid-per-chunk",
  "type":        7,              // SENSOR_DATA
  "payload": {
    "sensor":    "front_camera",
    "stream_id": "uuid-for-this-stream",  // Same for all chunks in stream
    "chunk_index": 42,                    // 0-based frame counter
    "is_final":   false                   // true on last frame
  },
  "media_chunks": [
    {
      "chunk_id":    "frame-042-uuid",
      "mime_type":   "image/jpeg",
      "encoding":    "base64",
      "hash_sha256": "sha256:...",
      "size_bytes":  38912,
      "data_b64":    "..."
    }
  ]
}

Stream Lifecycle

  • The stream_id is assigned by the sender at stream start and MUST be the same across all chunks.
  • Receivers buffer frames using chunk_index for ordering.
  • The frame with is_final: true signals stream end. Receivers MUST NOT expect further frames for this stream_id.
  • Dropped frames are logged but do not invalidate the stream — video streams are QoS 0 (fire-and-forget).
  • Each frame produces its own CommitmentRecord with its own media_hashes. The audit trail is per-frame, not per-stream.
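
The lifecycle rules above can be illustrated with a small receiver-side buffer. This is a sketch, assuming one receiver object per stream_id and that hash verification happens elsewhere; the class and method names are hypothetical.

```python
class StreamReceiver:
    """Collects frames for a single stream_id and reports gaps."""

    def __init__(self, stream_id: str) -> None:
        self.stream_id = stream_id
        self.frames = {}         # chunk_index -> message
        self.final_index = None  # set once the is_final frame arrives

    def accept(self, message: dict) -> bool:
        """Buffer a frame; return False if it is ignored."""
        payload = message["payload"]
        if payload["stream_id"] != self.stream_id:
            return False
        index = payload["chunk_index"]
        # No frames are expected beyond the one marked is_final.
        if self.final_index is not None and index > self.final_index:
            return False
        self.frames[index] = message
        if payload["is_final"]:
            self.final_index = index
        return True

    def ordered_frames(self) -> list:
        """Frames sorted by chunk_index, regardless of arrival order."""
        return [self.frames[i] for i in sorted(self.frames)]

    def dropped_indices(self) -> list:
        """Missing indices, known only after the final frame is seen."""
        if self.final_index is None:
            return []
        return [i for i in range(self.final_index + 1) if i not in self.frames]
```

Dropped indices are reported for logging only; consistent with QoS 0, the stream remains usable when frames are missing.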

TRAINING_DATA — JSON-Only Deprecated

Deprecation notice: As of v1.6, sending a TRAINING_DATA (type 10) message with image, video, or audio data encoded inside the JSON payload field (e.g., as a base64 string within the payload dict) is deprecated. All image/video/audio training data MUST be delivered via media_chunks[].

Why?

  • JSON-embedded binary is outside the CommitmentRecord's media_hashes HMAC chain → audit trail does not prove what data was collected.
  • EU AI Act Article 10 requires verifiable provenance for training data. media_chunks provides this via hash binding.
  • Chunked media supports the training consent token flow (see §17 Training Data Consent) with per-chunk consent attribution.

v1.6 TRAINING_DATA Required Format

// v1.6 TRAINING_DATA — MUST use media_chunks for binary data
{
  "id":     "uuid",
  "type":   10,     // TRAINING_DATA
  "payload": {
    "dataset_id":    "dataset-uuid",
    "consent_token": "training-consent-token-uuid",   // Required per §17
    "data_categories": ["video", "audio"],
    "collection_context": "warehouse-sweep-2026-03-16"
    // Do NOT embed image/video/audio data here
  },
  "media_chunks": [                                   // Required for binary data
    {
      "chunk_id":    "video-frame-001",
      "mime_type":   "video/mp4",
      "encoding":    "ref",
      "hash_sha256": "sha256:...",
      "size_bytes":  2097152,
      "ref_url":     "https://robot.local/api/v1/media/video-frame-001?...",
      "ref_expires": 1741003900
    }
  ]
}

Migration Timeline

Version          | JSON-in-payload binary                                    | media_chunks binary
-----------------|-----------------------------------------------------------|--------------------
v1.5 and earlier | Allowed (no standard alternative)                         | Not available
v1.6             | Deprecated; triggers WARNING audit event                  | ✅ Required for image/video/audio
v1.7 (planned)   | Rejected; COMMAND_NACK with TRAINING_DATA_NO_MEDIA_CHUNKS | ✅ Required
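
A receiver could encode this timeline as a simple policy function. The sketch below assumes the caller has already determined whether the payload embeds binary data (how to detect that is outside this section) and that the planned v1.7 behavior ships as listed; the function name and return strings are hypothetical.

```python
def training_data_policy(version: tuple, payload_has_binary: bool) -> str:
    """Map (protocol version, payload-embedded binary?) to an action.

    version is a (major, minor) tuple such as (1, 6).
    """
    if not payload_has_binary:
        return "accept"
    if version >= (1, 7):
        # Planned behavior: reject with COMMAND_NACK.
        return "nack:TRAINING_DATA_NO_MEDIA_CHUNKS"
    if version == (1, 6):
        # Deprecated: still accepted, but a WARNING audit event is logged.
        return "accept_with_warning"
    return "accept"  # v1.5 and earlier had no standard alternative
```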

rcan-py Implementation

from rcan.multimodal import MediaChunk, encode_inline, encode_ref
from rcan.message import RCANMessage, MessageType

# Inline mode (small image)
with open("snapshot.jpg", "rb") as f:
    image_bytes = f.read()

chunk = encode_inline(
    data=image_bytes,
    mime_type="image/jpeg",
)
# chunk.chunk_id, chunk.hash_sha256 set automatically

msg = RCANMessage(
    type=MessageType.SENSOR_DATA,
    payload={"sensor": "front_camera"},
    media_chunks=[chunk],
    # ... (remaining envelope fields elided)
)

# Reference mode (large video)
chunk_ref = encode_ref(
    chunk_id="video-001",
    mime_type="video/mp4",
    size_bytes=10_000_000,
    hash_sha256=precomputed_hash,
    signing_key=robot_private_key,
    ttl_seconds=300,
    base_url="https://robot.local",
)
# chunk_ref.ref_url is signed and ready

# Verify received chunk
import base64
from rcan.multimodal import verify_chunk_hash
raw = base64.b64decode(received_chunk.data_b64)
verify_chunk_hash(raw, received_chunk.hash_sha256)  # raises MediaHashError on mismatch

Implementation Notes

  • Hash before base64: Always compute hash_sha256 over the raw bytes, not the base64 string. Receivers must decode base64 before verifying.
  • ref_url over plaintext HTTP: Robots on isolated local networks MAY serve ref_url over plain HTTP (http://robot.local/...) but MUST still include the HMAC signature query parameter. HTTPS is strongly recommended for any network-reachable deployment.
  • CommitmentRecord ordering: media_hashes MUST be included in the HMAC chain before chain_hash is computed. Implementations that compute chain_hash without media_hashes are non-conformant from v1.6.
  • Large chunk limits: Reference-mode payloads larger than 64 MB over RCAN-HTTP MUST be split into multiple chunks (max 64 MB each). Use a shared stream_id and sequential chunk_index.
  • RCAN-Compact and media: Over RCAN-Compact, only encoding: "ref" is supported (ref_url compressed in CBOR). Inline base64 in RCAN-Compact is not permitted.
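
The splitting rule for large payloads can be sketched as follows. Each piece gets its own hash over its raw bytes, and all pieces share one stream_id with sequential chunk_index values. The function name, the "data" key, and the reuse of is_final to mark the last piece are assumptions for illustration.

```python
import hashlib
import uuid

MAX_REF_CHUNK_BYTES = 64 * 1024 * 1024  # 64 MB, assuming binary units


def split_payload(raw: bytes, max_chunk_bytes: int = MAX_REF_CHUNK_BYTES) -> list:
    """Split raw bytes into pieces no larger than max_chunk_bytes."""
    if max_chunk_bytes <= 0:
        raise ValueError("max_chunk_bytes must be positive")
    stream_id = str(uuid.uuid4())  # shared by every piece
    pieces = [raw[i:i + max_chunk_bytes] for i in range(0, len(raw), max_chunk_bytes)]
    parts = []
    for index, piece in enumerate(pieces):
        parts.append({
            "stream_id": stream_id,
            "chunk_index": index,
            "is_final": index == len(pieces) - 1,
            "size_bytes": len(piece),
            # Hash the raw piece, never its base64 form.
            "hash_sha256": "sha256:" + hashlib.sha256(piece).hexdigest(),
            "data": piece,
        })
    return parts
```

Passing a small max_chunk_bytes, for example split_payload(b"abcdefghij", 4), is a convenient way to exercise the logic without allocating 64 MB buffers.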

See Also