§5.3 — Quality of Service v1.5
Earlier RCAN protocol drafts had no delivery guarantee semantics. A SAFETY_ESTOP sent over congested WiFi could be silently dropped. Subsequent revisions add three QoS levels with mandatory ESTOP delivery confirmation.
QoS Levels
| Level | Name | Semantics | Use For |
|---|---|---|---|
| 0 | fire-and-forget | No ack required. Message may be dropped without notification. | TELEOP streams, STATUS telemetry, HEARTBEAT |
| 1 | at-least-once | Sender retries until COMMAND_ACK received. May deliver duplicates. | COMMAND, CONFIG, ALERT, most control messages |
| 2 | exactly-once | Two-phase commit: COMMAND_ACK then COMMAND_COMMIT. No duplicates. | ESTOP (mandatory), CONSENT messages |
Required QoS by Message Type
| Message Type | Min QoS | Notes |
|---|---|---|
| SAFETY (type 6) — ESTOP | 2 | Mandatory; P66 invariant |
| SAFETY (type 6) — RESUME | 1 | At-least-once |
| ALERT (type 13) | 1 | Owner must receive alerts |
| CONFIG (type 5) | 1 | Config must be acknowledged |
| ROBOT_REVOCATION (type 19) | 1 | Broadcast with retry |
| TELEOP (type 14) | 0 | Must be fire-and-forget; stale frames are useless |
| HEARTBEAT (type 4) | 0 | Fire-and-forget |
| STATUS (type 3) | 0 | Best-effort telemetry |
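
The table above can be turned into a simple validation step on the sending side. The following is a minimal sketch, not part of the specification: the `check_qos` helper, the `QoS` enum, and the way the SAFETY subtype is passed as a string are all assumptions made for illustration.

```python
from enum import IntEnum


class QoS(IntEnum):
    FIRE_AND_FORGET = 0
    AT_LEAST_ONCE = 1
    EXACTLY_ONCE = 2


# Minimum QoS keyed by (message type, subtype), transcribed from the table.
# Representing the SAFETY subtype as a string is an assumption of this sketch.
MIN_QOS = {
    (6, "ESTOP"): QoS.EXACTLY_ONCE,
    (6, "RESUME"): QoS.AT_LEAST_ONCE,
    (13, None): QoS.AT_LEAST_ONCE,   # ALERT
    (5, None): QoS.AT_LEAST_ONCE,    # CONFIG
    (19, None): QoS.AT_LEAST_ONCE,   # ROBOT_REVOCATION
    (14, None): QoS.FIRE_AND_FORGET, # TELEOP
    (4, None): QoS.FIRE_AND_FORGET,  # HEARTBEAT
    (3, None): QoS.FIRE_AND_FORGET,  # STATUS
}

TELEOP = 14


def check_qos(msg_type, qos, subtype=None):
    """Return True if `qos` is acceptable for the given message type."""
    required = MIN_QOS[(msg_type, subtype)]  # KeyError for unlisted types
    if msg_type == TELEOP:
        # The table says TELEOP must be fire-and-forget, not merely "at least 0".
        return qos == QoS.FIRE_AND_FORGET
    return qos >= required
```

For example, `check_qos(6, 1, "ESTOP")` is rejected because ESTOP requires QoS 2, while `check_qos(3, 1)` is accepted because STATUS only sets a minimum.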
QoS 1: At-Least-Once
Sender Behavior
- Send message with `qos: 1`
- Start ACK timer: `ack_timeout_ms` (default 500ms)
- If COMMAND_ACK not received within timeout: retry
- Retry with exponential backoff: 100ms → 200ms → 400ms
- Max retries: `max_retries` (default 3)
- After max retries: declare delivery failure
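
The sender steps above can be sketched as a small retry loop. This is an illustrative sketch only: `send`, `wait_for_ack`, and `sleep_ms` are hypothetical callables injected by the caller, and the assumption that the backoff delay is applied after each ACK timeout (rather than replacing it) is this sketch's reading, since the text does not spell out how the two timers interact.

```python
class DeliveryFailure(Exception):
    """Raised when no COMMAND_ACK arrives after all retries."""


def backoff_schedule(base_ms=100, max_retries=3):
    # Doubles on each retry: 100 -> 200 -> 400 with the defaults.
    return [base_ms * 2 ** i for i in range(max_retries)]


def send_at_least_once(send, wait_for_ack, sleep_ms,
                       ack_timeout_ms=500, max_retries=3, backoff_base_ms=100):
    """Send once, then retry with backoff until an ACK is seen.

    send()               -> transmits the message (same msg_id every time)
    wait_for_ack(ms)     -> True if COMMAND_ACK arrived within `ms`
    sleep_ms(ms)         -> waits for the backoff delay
    Returns the number of retries that were needed.
    """
    send()
    if wait_for_ack(ack_timeout_ms):
        return 0
    delays = backoff_schedule(backoff_base_ms, max_retries)
    for attempt, delay in enumerate(delays, start=1):
        sleep_ms(delay)
        send()
        if wait_for_ack(ack_timeout_ms):
            return attempt
    raise DeliveryFailure(f"no COMMAND_ACK after {max_retries} retries")
```

Injecting the transport and clock keeps the loop testable without real sockets or sleeps; a production sender would likely also need to handle COMMAND_NACK and cancellation.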
Receiver Behavior
- Process the message
- Send COMMAND_ACK (type 17) within `ack_timeout_ms` (500ms)
- Use replay cache to deduplicate retries (same msg_id)
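
A receiver following these steps might look roughly like the sketch below. The class name, the `handler` and `send_ack` callbacks, and the shape of the ACK message are hypothetical; only the `id`, `type`, and `reply_to` field names are taken from the message example in this section. Re-sending the ACK for a duplicate is an assumption here, chosen so that a sender whose first ACK was lost can stop retrying.

```python
COMMAND_ACK = 17


class AtLeastOnceReceiver:
    def __init__(self, handler, send_ack):
        self._handler = handler      # processes a message exactly once
        self._send_ack = send_ack    # transmits a COMMAND_ACK
        # Stand-in for the replay cache; unbounded here for brevity,
        # whereas a real cache would expire old msg_ids.
        self._seen = set()

    def on_message(self, msg):
        """Process `msg` unless its id was seen before; always ACK.

        Returns True if the message was processed, False if it was a
        retry that was deduplicated.
        """
        msg_id = msg["id"]
        first_time = msg_id not in self._seen
        if first_time:
            self._seen.add(msg_id)
            self._handler(msg)
        self._send_ack({"type": COMMAND_ACK, "reply_to": msg_id})
        return first_time
```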
Safety Failure Mode
QoS 2: Exactly-Once (Two-Phase Commit)
Used for ESTOP and other messages where duplicate processing must be prevented.
    Sender                                 Receiver
      |                                         |
      |-- COMMAND (qos:2, msg_id:X) ----------->|
      |                                         |-- Process message immediately
      |                                         |   (ESTOP: halt now, don't wait for COMMIT)
      |<-- COMMAND_ACK (reply_to:X) ------------|
      |                                         |
      |-- COMMAND_COMMIT (reply_to:X) --------->|
      |                                         |-- Commit; remove from exactly-once cache
      |<-- RESPONSE (reply_to:commit_id) -------|
      |                                         |
ESTOP-Specific Behavior
- ESTOP message received with `qos: 2`
- Robot halts immediately
- Robot sends COMMAND_ACK within 500ms
- Sender receives COMMAND_ACK; ESTOP is confirmed delivered
- Sender sends COMMAND_COMMIT
- Robot removes ESTOP from exactly-once cache
If the sender does not receive COMMAND_ACK, it retries the ESTOP. The robot's replay prevention and exactly-once cache ensure that a repeated ESTOP is a no-op (the robot is already halted), while the sender still gets its ACK confirmation.
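
From the sender's side, the ESTOP exchange can be sketched as follows. This is a hedged illustration: the `transport` object with `send` and `wait_for_ack` methods is hypothetical, the ESTOP payload shape is invented for the example, and backoff between retries is omitted to keep the sketch short.

```python
import uuid

SAFETY = 6
COMMAND_COMMIT = 18


def send_estop(transport, max_retries=3, ack_timeout_ms=500):
    """Send ESTOP at QoS 2; return True once it is ACKed and committed."""
    estop = {
        "id": str(uuid.uuid4()),
        "type": SAFETY,
        "qos": 2,
        "payload": {"command": "ESTOP"},  # payload shape is an assumption
    }
    for _ in range(1 + max_retries):
        # Every retry reuses the same id, so the robot can treat it as a no-op.
        transport.send(estop)
        if transport.wait_for_ack(estop["id"], ack_timeout_ms):
            # ACK received: the ESTOP is confirmed, so close the exchange.
            transport.send({
                "id": str(uuid.uuid4()),
                "type": COMMAND_COMMIT,
                "reply_to": estop["id"],
            })
            return True
    return False
```

Reusing the original message id on each retry is what lets the receiver's cache recognize the duplicate and simply resend its ACK.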
COMMAND_NACK (Type 31)
When a receiver cannot process a message and must inform the sender, it sends COMMAND_NACK:
    {
      "id": "uuid-v4",
      "type": 31,
      "reply_to": "original-msg-id",
      "payload": {
        "reason": "REPLAY_DETECTED",
        "detail": "msg_id abc123 was already processed"
      }
    }

COMMAND_NACK is fire-and-forget (qos: 0). Senders receiving NACK MUST NOT retry the original message.
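
One way to honor the no-retry rule is to drop the original message from the sender's retry bookkeeping as soon as a NACK arrives. The sketch below assumes a hypothetical `PendingSends` tracker and `make_nack` helper; the message fields mirror the example above.

```python
import uuid

COMMAND_NACK = 31


def make_nack(original_id, reason, detail):
    """Build a COMMAND_NACK shaped like the example message."""
    return {
        "id": str(uuid.uuid4()),
        "type": COMMAND_NACK,
        "reply_to": original_id,
        "payload": {"reason": reason, "detail": detail},
    }


class PendingSends:
    """Tracks messages that are still eligible for retry (sender side)."""

    def __init__(self):
        self._pending = {}

    def track(self, msg):
        self._pending[msg["id"]] = msg

    def on_nack(self, nack):
        # A NACK ends the exchange: forget the original so it is never retried.
        return self._pending.pop(nack["reply_to"], None)

    def should_retry(self, msg_id):
        return msg_id in self._pending
```

Because the NACK itself is fire-and-forget, the sender does not acknowledge it; it only stops retrying.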
Sequence Numbers for Exactly-Once
For exactly-once delivery, receivers maintain an exactly-once cache keyed on msg_id. The cache entry has three states:
- PENDING: ACK sent, waiting for COMMIT
- COMMITTED: COMMIT received; entry retained for `replay_window_s`
- Evicted after `replay_window_s` elapses
If a message arrives with a msg_id in state PENDING or COMMITTED, resend the ACK and return without reprocessing.
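
The cache lifecycle described above can be modeled as a small state machine. This is a sketch under stated assumptions: the `ExactlyOnceCache` class, its method names, the injectable `clock`, and the 30-second default for `replay_window_s` are all hypothetical, since the section does not give a default window.

```python
import time
from enum import Enum


class State(Enum):
    PENDING = "PENDING"      # ACK sent, waiting for COMMIT
    COMMITTED = "COMMITTED"  # COMMIT received, retained for the replay window


class ExactlyOnceCache:
    def __init__(self, replay_window_s=30.0, clock=time.monotonic):
        self._window = replay_window_s  # default value is an assumption
        self._clock = clock
        self._entries = {}              # msg_id -> (State, committed_at)

    def _evict_expired(self):
        now = self._clock()
        expired = [m for m, (s, t) in self._entries.items()
                   if s is State.COMMITTED and now - t >= self._window]
        for msg_id in expired:
            del self._entries[msg_id]

    def on_command(self, msg_id):
        """Return True if the message should be processed.

        False means the msg_id is PENDING or COMMITTED: the caller
        resends the ACK and returns without reprocessing.
        """
        self._evict_expired()
        if msg_id in self._entries:
            return False
        self._entries[msg_id] = (State.PENDING, None)
        return True

    def on_commit(self, msg_id):
        """Move PENDING to COMMITTED; return False for unknown ids."""
        self._evict_expired()
        entry = self._entries.get(msg_id)
        if entry is None:
            return False
        if entry[0] is State.PENDING:
            self._entries[msg_id] = (State.COMMITTED, self._clock())
        return True

    def state(self, msg_id):
        self._evict_expired()
        entry = self._entries.get(msg_id)
        return entry[0] if entry else None
```

Passing the clock in as a parameter makes the eviction behavior easy to exercise in tests without waiting for the window to elapse.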
Configuration
    qos:
      ack_timeout_ms: 500    # Time to wait for COMMAND_ACK before retry
      max_retries: 3         # Max retry attempts (QoS 1 and 2)
      backoff_base_ms: 100   # Initial backoff; doubles each retry

See Also
- Replay Attack Prevention — msg_id cache interaction with QoS dedup
- Safety Conformance — ESTOP invariants
- COMMAND_ACK (17), COMMAND_COMMIT (18), COMMAND_NACK (31)