§5.4 — Multi-Modal Payloads v1.6
Earlier RCAN protocol drafts had JSON-only message payloads. Binary data (images, audio, video) therefore had to be transported outside the RCAN trust boundary, which broke the signed-message audit trail. v1.6 adds a media_chunks[] field to the message envelope so that binary data is hashed and bound to the Ed25519 signature.
The signature covers hash_sha256 on each chunk. The actual bytes may be stored separately (reference mode) or inline (base64 mode). Either way, the hash in the commitment record is tamper-evident.
media_chunks[] Field Schema
The media_chunks field is an optional array on RCANMessage (top-level envelope, not inside payload). It may contain 1–N chunks.
// RCANMessage envelope extension (v1.6)
{
"id": "uuid",
"type": 10, // e.g. TRAINING_DATA
"payload": { ... },
...
"media_chunks": [ // NEW in v1.6
{
"chunk_id": "uuid-v4", // Required: unique ID for this chunk
"mime_type": "image/jpeg", // Required: RFC 2046 MIME type
"encoding": "base64", // Required: "base64" | "ref"
"hash_sha256": "sha256:4a3f...", // Required: SHA-256 of the raw binary data
"size_bytes": 143872, // Required: byte count of raw binary data
"data_b64": "...", // Required if encoding=="base64"
"ref_url": null, // Required if encoding=="ref"; null otherwise
"ref_expires": null // Unix timestamp: when ref_url becomes invalid
}
]
}
Field Reference
| Field | Required | Type | Description |
|---|---|---|---|
| chunk_id | ✅ | UUID string | Unique identifier for this media chunk. Referenced in CommitmentRecord. |
| mime_type | ✅ | string | RFC 2046 MIME type (e.g. image/jpeg, audio/wav, video/mp4). |
| encoding | ✅ | "base64" or "ref" | How the data is delivered: inline or by reference URL. |
| hash_sha256 | ✅ | string (sha256:...) | SHA-256 of the raw bytes (before base64). MUST be verified by receiver before use. |
| size_bytes | ✅ | integer | Size of the raw binary data in bytes. |
| data_b64 | if base64 | string | Base64-encoded binary data. Only present when encoding=="base64". |
| ref_url | if ref | string (URL) | Signed URL to fetch the data. Only present when encoding=="ref". |
| ref_expires | if ref | integer | Unix timestamp when ref_url expires. Default TTL: 300s (5 min). |
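The conditional requirements in the table can be checked mechanically before any hash verification. The following is a minimal sketch under stated assumptions, not part of rcan-py: the function name validate_chunk and its error strings are hypothetical, and it checks field presence only.

```python
import base64
import hashlib

# Fields every chunk must carry, per the field reference table.
REQUIRED = ("chunk_id", "mime_type", "encoding", "hash_sha256", "size_bytes")


def validate_chunk(chunk: dict) -> list:
    """Return a list of schema problems; an empty list means none were found.

    Hypothetical helper: checks field presence, not hash correctness.
    """
    problems = [f"missing {name}" for name in REQUIRED if chunk.get(name) is None]
    encoding = chunk.get("encoding")
    if encoding == "base64":
        if not chunk.get("data_b64"):
            problems.append("data_b64 required when encoding is base64")
    elif encoding == "ref":
        if not chunk.get("ref_url"):
            problems.append("ref_url required when encoding is ref")
    elif encoding is not None:
        problems.append(f"unknown encoding {encoding!r}")
    digest = chunk.get("hash_sha256")
    if isinstance(digest, str) and not digest.startswith("sha256:"):
        problems.append("hash_sha256 must carry the sha256: prefix")
    return problems


raw = b"example bytes"
good = {
    "chunk_id": "abc-123",
    "mime_type": "image/jpeg",
    "encoding": "base64",
    "hash_sha256": "sha256:" + hashlib.sha256(raw).hexdigest(),
    "size_bytes": len(raw),
    "data_b64": base64.b64encode(raw).decode("ascii"),
}
print(validate_chunk(good))  # no problems expected for this chunk
```

Returning a list rather than raising on the first problem lets a receiver log every defect in one audit event.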
Inline Mode (Base64)
Use inline mode for small binary payloads. The data is embedded directly in the RCAN message JSON as a base64 string.
Size Limit
- Inline mode MUST NOT be used for payloads exceeding 64 KB (65,536 bytes raw data).
- Base64 encoding overhead: ~33%; a 64 KB binary becomes 87,384 characters (~85 KB) in the JSON.
- Receivers MUST reject inline chunks where size_bytes > 65536.
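The overhead figure follows from base64 emitting 4 output characters for every 3 input bytes. Below is a short sketch of that arithmetic together with a receiver-side limit check. The helper names are hypothetical, and the comparison of size_bytes against the decoded length is an extra sanity check assumed here, not something the limits above require.

```python
import base64

INLINE_LIMIT = 65536  # 64 KB of raw data, per the size limit above


def encoded_length(raw_size: int) -> int:
    """Length of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def check_inline_size(chunk: dict) -> None:
    """Raise ValueError if an inline chunk violates the size limit."""
    if chunk["size_bytes"] > INLINE_LIMIT:
        raise ValueError("inline chunk exceeds 65536 bytes")
    # Assumed extra check: size_bytes is sender-supplied, so compare it
    # with the length of the data that was actually delivered.
    raw = base64.b64decode(chunk["data_b64"])
    if len(raw) != chunk["size_bytes"]:
        raise ValueError("size_bytes does not match decoded data")


print(encoded_length(INLINE_LIMIT))  # 87384
```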
Example: Image Status Response
{
"id": "550e8400-...",
"type": 3, // STATUS
"payload": {
"mode": "active",
"camera_frame": { "chunk_id": "abc-123", "note": "see media_chunks" }
},
"media_chunks": [
{
"chunk_id": "abc-123",
"mime_type": "image/jpeg",
"encoding": "base64",
"hash_sha256": "sha256:9a4f8c...",
"size_bytes": 43520,
"data_b64": "/9j/4AAQSkZJRg..." // Base64-encoded JPEG
}
]
}
Receiver Verification
# Receiver MUST verify the hash before processing
import base64
import hashlib
raw_bytes = base64.b64decode(chunk["data_b64"])
computed = "sha256:" + hashlib.sha256(raw_bytes).hexdigest()
# Raise instead of assert: assert statements are stripped under python -O
if computed != chunk["hash_sha256"]:
    raise ValueError("Media chunk hash mismatch: reject")
Reference Mode (ref_url)
Use reference mode for large payloads. The RCAN message carries only the hash and a signed URL; the actual bytes are fetched separately by the receiver.
Signed URL Generation
The sending robot generates a signed URL by computing HMAC-SHA256 over the request method, path, and expiry time (not over the chunk bytes):
// Signed URL format
GET /api/v1/media/{chunk_id}?expires={unix_ts}&sig={hmac_hex}
// HMAC key: derived from robot's Ed25519 private key (first 32 bytes)
// HMAC message: "GET:/api/v1/media/{chunk_id}:{expires}"
// Default TTL: 300 seconds (5 minutes)
// Example ref_url:
"ref_url": "https://robot-042.local/api/v1/media/abc-123?expires=1741003900&sig=4a3f8c..."
Size Limits by Transport
| Transport | Max Size (Reference Mode) | Notes |
|---|---|---|
| RCAN-HTTP | 64 MB | HTTPS fetch; robot must serve the media endpoint |
| RCAN-Compact | 512 KB | Media must be fetched via HTTP; ref_url in compact message |
| RCAN-Minimal | Not supported | No media chunks in 32-byte minimal frames |
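The signed URL format described under Signed URL Generation can be sketched with the standard hmac module. This is an illustration under assumptions: the function names are hypothetical, the 32-byte key is passed in directly rather than derived from an Ed25519 private key, and the check order (signature first, then expiry) is a design choice made here, not something the specification states.

```python
import hashlib
import hmac
import time

MEDIA_PATH = "/api/v1/media/{chunk_id}"


def _signature(key: bytes, chunk_id: str, expires: int) -> str:
    # HMAC message format from the spec: "GET:<path>:<expires>"
    message = f"GET:{MEDIA_PATH.format(chunk_id=chunk_id)}:{expires}"
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_media_url(base_url: str, chunk_id: str, key: bytes, ttl: int = 300, now=None):
    """Return (ref_url, ref_expires) for a chunk. Hypothetical helper."""
    expires = int(time.time() if now is None else now) + ttl
    sig = _signature(key, chunk_id, expires)
    path = MEDIA_PATH.format(chunk_id=chunk_id)
    return f"{base_url}{path}?expires={expires}&sig={sig}", expires


def check_media_request(chunk_id: str, expires: int, sig: str, key: bytes, now=None) -> int:
    """Return the HTTP status a media endpoint might answer with (200, 403 or 410)."""
    expected = _signature(key, chunk_id, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), sig.encode("utf-8")):
        return 403  # signature invalid
    if expires < (time.time() if now is None else now):
        return 410  # URL expired
    return 200


key = bytes(32)  # placeholder key for illustration only
url, expires = sign_media_url("https://robot-042.local", "abc-123", key, now=1741003600)
print(url)
```

Checking the signature before the expiry means an unauthenticated caller learns nothing about whether a given expires value is still valid.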
Robot Media Endpoint
// Required endpoint (v1.6 multi-modal robots):
GET /api/v1/media/{chunk_id}
// Query params:
// expires — Unix timestamp (required when sig present)
// sig — HMAC-SHA256 signature (required)
// Response:
HTTP 200 OK
Content-Type: image/jpeg // mime_type from media_chunks entry
Content-Length: 43520
X-Chunk-Hash: sha256:9a4f8c... // SHA-256 for receiver verification
[binary data]
// Error responses:
HTTP 404 chunk_id not found
HTTP 410 URL expired (expires < now)
HTTP 403 signature invalid
CommitmentRecord Extension
Every message with media_chunks MUST extend its CommitmentRecord to include the media hashes. This ensures the audit trail proves exactly what binary data was sent.
// CommitmentRecord v1.6 extension
{
"commitment_id": "uuid",
"msg_id": "uuid",
"timestamp": 1741000000,
"cmd": "sensor_snapshot",
"operator": "user-uuid",
"commitment_hash": "sha256:...", // HMAC over text payload fields
"media_hashes": { // NEW in v1.6 — included in HMAC chain
"abc-123": "sha256:9a4f8c...", // chunk_id → hash_sha256
"def-456": "sha256:1b2c3d..."
},
"chain_hash": "sha256:..." // HMAC over (commitment_hash + media_hashes)
}
chain_hash is computed over both the text payload commitment and the media_hashes dict. Omitting media_hashes from the HMAC chain MUST be treated as a commitment integrity violation.
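One way to compute chain_hash is sketched below. The specification excerpt says only that the HMAC covers commitment_hash and media_hashes, so the serialization (compact JSON with sorted keys, appended to commitment_hash), the HMAC key, and the function name compute_chain_hash are all assumptions made for illustration.

```python
import hashlib
import hmac
import json


def compute_chain_hash(key: bytes, commitment_hash: str, media_hashes: dict) -> str:
    """HMAC-SHA256 over commitment_hash plus a canonical encoding of media_hashes.

    Assumption: the canonical form is compact JSON with sorted keys; the
    source does not specify the serialization.
    """
    canonical = json.dumps(media_hashes, sort_keys=True, separators=(",", ":"))
    message = (commitment_hash + canonical).encode("utf-8")
    return "sha256:" + hmac.new(key, message, hashlib.sha256).hexdigest()


key = b"demo-key"  # placeholder; real key management is out of scope here
record = {
    "commitment_hash": "sha256:" + "00" * 32,
    "media_hashes": {"def-456": "sha256:" + "22" * 32, "abc-123": "sha256:" + "11" * 32},
}
record["chain_hash"] = compute_chain_hash(key, record["commitment_hash"], record["media_hashes"])

# Changing any media hash changes chain_hash, so a verifier can detect tampering.
tampered = dict(record["media_hashes"], **{"abc-123": "sha256:" + "ff" * 32})
assert compute_chain_hash(key, record["commitment_hash"], tampered) != record["chain_hash"]
```

Sorting the keys makes the result independent of dict insertion order, which matters when sender and verifier build media_hashes separately.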
Streaming Mode
For continuous video streams (e.g. a robot camera feed), use SENSOR_DATA (type 7) with the streaming fields stream_id, chunk_index, and is_final in the payload:
// SENSOR_DATA streaming message (v1.6)
{
"id": "uuid-per-chunk",
"type": 7, // SENSOR_DATA
"payload": {
"sensor": "front_camera",
"stream_id": "uuid-for-this-stream", // Same for all chunks in stream
"chunk_index": 42, // 0-based frame counter
"is_final": false // true on last frame
},
"media_chunks": [
{
"chunk_id": "frame-042-uuid",
"mime_type": "image/jpeg",
"encoding": "base64",
"hash_sha256": "sha256:...",
"size_bytes": 38912,
"data_b64": "..."
}
]
}
Stream Lifecycle
- The stream_id is assigned by the sender at stream start and MUST be the same across all chunks.
- Receivers buffer frames using chunk_index for ordering.
- The frame with is_final: true signals stream end. Receivers MUST NOT expect further frames for this stream_id.
- Dropped frames are logged but do not invalidate the stream; video streams are QoS 0 (fire-and-forget).
- Each frame produces its own CommitmentRecord with its own media_hashes. The audit trail is per-frame, not per-stream.
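The lifecycle rules above can be sketched as a small receiver-side buffer. The class name StreamBuffer and its methods are hypothetical, and raising an error for a frame indexed past the final one is an assumption; an implementation could equally ignore such frames.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StreamBuffer:
    """Collects frames for one stream_id and orders them by chunk_index."""
    stream_id: str
    frames: dict = field(default_factory=dict)  # chunk_index -> media chunk
    final_index: Optional[int] = None

    def add(self, message: dict) -> None:
        payload = message["payload"]
        if payload["stream_id"] != self.stream_id:
            raise ValueError("frame belongs to a different stream")
        index = payload["chunk_index"]
        if self.final_index is not None and index > self.final_index:
            # Assumed policy: no frames are expected past the is_final frame.
            raise ValueError("frame received after end of stream")
        self.frames[index] = message["media_chunks"][0]
        if payload["is_final"]:
            self.final_index = index

    def ordered(self) -> list:
        return [self.frames[i] for i in sorted(self.frames)]

    def dropped(self) -> list:
        """Indices missing below the highest index seen; logged, not fatal (QoS 0)."""
        if not self.frames:
            return []
        return [i for i in range(max(self.frames) + 1) if i not in self.frames]


buf = StreamBuffer("s-1")
for idx, final in [(0, False), (2, False), (3, True)]:
    buf.add({
        "payload": {"stream_id": "s-1", "chunk_index": idx, "is_final": final},
        "media_chunks": [{"chunk_id": f"frame-{idx}"}],
    })
print(buf.dropped())  # [1]
```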
TRAINING_DATA — JSON-Only Deprecated
Sending a TRAINING_DATA (type 10) message with image, video, or audio data encoded inside the JSON payload field (e.g., as a base64 string within the payload dict) is deprecated. All image/video/audio training data MUST be delivered via media_chunks[].
Why?
- JSON-embedded binary is outside the CommitmentRecord's media_hashes HMAC chain, so the audit trail does not prove what data was collected.
- EU AI Act Article 10 requires verifiable provenance for training data. media_chunks provides this via hash binding.
- Chunked media supports the training consent token flow (see §17 Training Data Consent) with per-chunk consent attribution.
v1.6 TRAINING_DATA Required Format
// v1.6 TRAINING_DATA — MUST use media_chunks for binary data
{
"id": "uuid",
"type": 10, // TRAINING_DATA
"payload": {
"dataset_id": "dataset-uuid",
"consent_token": "training-consent-token-uuid", // Required per §17
"data_categories": ["video", "audio"],
"collection_context": "warehouse-sweep-2026-03-16"
// Do NOT embed image/video/audio data here
},
"media_chunks": [ // Required for binary data
{
"chunk_id": "video-frame-001",
"mime_type": "video/mp4",
"encoding": "ref",
"hash_sha256": "sha256:...",
"size_bytes": 2097152,
"ref_url": "https://robot.local/api/v1/media/video-frame-001?...",
"ref_expires": 1741003900
}
]
}
Migration Timeline
| Version | JSON-in-payload binary | media_chunks binary |
|---|---|---|
| v1.5 and earlier | Allowed (no standard alternative) | Not available |
| v1.6 | Deprecated — triggers WARNING audit event | ✅ Required for image/video/audio |
| v1.7 (planned) | Rejected — COMMAND_NACK with TRAINING_DATA_NO_MEDIA_CHUNKS | ✅ Required |
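The timeline can be expressed as a version-gated policy. The sketch below is illustrative only: the function names and the "allow", "warn", and "reject" labels are hypothetical, and the v1.7 behaviour is still marked as planned in the table.

```python
def parse_version(text: str) -> tuple:
    """Turn "v1.6" or "1.6" into (1, 6) so versions compare numerically."""
    major, minor = text.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def embedded_binary_policy(version: tuple) -> str:
    """Map a protocol version (major, minor) to the handling of JSON-embedded binary."""
    if version < (1, 6):
        return "allow"   # no standard alternative existed
    if version < (1, 7):
        return "warn"    # deprecated: emit a WARNING audit event
    return "reject"      # COMMAND_NACK with TRAINING_DATA_NO_MEDIA_CHUNKS


for v in ("v1.5", "v1.6", "v1.7", "v1.10"):
    print(v, embedded_binary_policy(parse_version(v)))
```

Comparing integer tuples rather than strings keeps a later version such as v1.10 ordered after v1.7.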
rcan-py Implementation
import base64
from rcan.multimodal import MediaChunk, encode_inline, encode_ref, verify_chunk_hash
from rcan.message import RCANMessage, MessageType
# Inline mode (small image)
with open("snapshot.jpg", "rb") as f:
    image_bytes = f.read()
chunk = encode_inline(
    data=image_bytes,
    mime_type="image/jpeg",
)
# chunk.chunk_id, chunk.hash_sha256 set automatically
msg = RCANMessage(
    type=MessageType.SENSOR_DATA,
    payload={"sensor": "front_camera"},
    media_chunks=[chunk],
    # ... remaining envelope fields
)
# Reference mode (large video)
# precomputed_hash and robot_private_key are supplied by the caller
chunk_ref = encode_ref(
    chunk_id="video-001",
    mime_type="video/mp4",
    size_bytes=10_000_000,
    hash_sha256=precomputed_hash,
    signing_key=robot_private_key,
    ttl_seconds=300,
    base_url="https://robot.local",
)
# chunk_ref.ref_url is signed and ready
# Verify received chunk
raw = base64.b64decode(received_chunk.data_b64)
verify_chunk_hash(raw, received_chunk.hash_sha256)  # raises MediaHashError on mismatch
Implementation Notes
- Hash before base64: Always compute hash_sha256 over the raw bytes, not the base64 string. Receivers must decode base64 before verifying.
- ref_url over plaintext HTTP: Robots on isolated local networks MAY serve ref_url over plain HTTP (http://robot.local/...) but MUST still include the HMAC signature query parameter. HTTPS is strongly recommended for any network-reachable deployment.
- CommitmentRecord ordering: media_hashes MUST be included in the HMAC chain before chain_hash is computed. Implementations that compute chain_hash without media_hashes are non-conformant from v1.6.
- Large chunk limits: Reference-mode chunks larger than 64 MB over RCAN-HTTP MUST be split into multiple chunks (max 64 MB each). Use a shared stream_id and sequential chunk_index.
- RCAN-Compact and media: Over RCAN-Compact, only encoding: "ref" is supported (ref_url compressed in CBOR). Inline base64 in RCAN-Compact is not permitted.
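The large-chunk rule can be sketched as a splitter that hashes each piece over its raw bytes. The function name split_payload and the flat dict layout are assumptions for illustration; in a real message the stream_id and chunk_index would sit wherever the envelope schema places them.

```python
import hashlib
import uuid

MAX_REF_CHUNK = 64 * 1024 * 1024  # 64 MB RCAN-HTTP limit from the transport table


def split_payload(data: bytes, max_size: int = MAX_REF_CHUNK) -> list:
    """Split data into pieces of at most max_size bytes, each with its own hash."""
    stream_id = str(uuid.uuid4())  # shared by every piece
    pieces = []
    for index, start in enumerate(range(0, len(data), max_size)):
        part = data[start:start + max_size]
        pieces.append({
            "stream_id": stream_id,
            "chunk_index": index,  # sequential, 0-based
            "chunk_id": str(uuid.uuid4()),
            # Hash the raw bytes, never a base64 rendering of them.
            "hash_sha256": "sha256:" + hashlib.sha256(part).hexdigest(),
            "size_bytes": len(part),
            "data": part,
        })
    return pieces


# A tiny max_size keeps the demonstration small.
pieces = split_payload(b"x" * 10, max_size=4)
print([p["size_bytes"] for p in pieces])  # [4, 4, 2]
```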
See Also
- §17 Training Data Consent: consent_token required on TRAINING_DATA with media_chunks
- §19 Constrained Transport: size limits by encoding tier
- Safety & P66 Conformance: audit trail integrity requirements
- Message Type Reference: SENSOR_DATA (type 7), TRAINING_DATA (type 10)