Developer protocol
Receive room audio with plain HTTP.
Hummic records meeting audio locally and delivers it to a user-owned endpoint. File mode, which ships in the current firmware, uploads complete recordings over HTTP multipart. Agent mode, currently a preview, streams live audio over WebSocket. A planned resumable upload path is designed around tus.
Current firmware
File mode: one recording, one upload.
Implement one endpoint that accepts multipart/form-data. If the response status is 2xx, the device marks the recording as uploaded. Authentication can be a bearer token configured in the setup portal.
- Method: POST
- Body: multipart/form-data
- Audio: Ogg Opus, 16 kHz, 2 channels
- Auth: Authorization: Bearer <token>
- Retry: 2xx succeeds, 401/403 fails, 408/429/5xx retries
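A receiver has to decide whether the bearer token on each request is the one configured in the setup portal. The following is a minimal sketch of that check, not the reference receiver's code; the function name `is_authorized` is hypothetical, and a receiver would answer 401 or 403 when it returns False.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Return True only for a header of the form 'Bearer <expected_token>'."""
    if not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest runs in constant time, so the comparison does not
    # reveal how many leading characters of the token matched.
    return hmac.compare_digest(
        presented.strip().encode("utf-8"), expected_token.encode("utf-8")
    )
```

Because 401 and 403 are permanent failures for the device, a wrong token on the receiver side fails the recording rather than retrying it.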
Status
What ships today, what is planned.
- File upload: Shipping. Current firmware contract (P0). The Quickstart below runs against it.
- Resumable (tus): Planned. P1 direction for long recordings and unstable WiFi.
- Agent mode (WS): Preview. Designed, not in the current firmware.
File mode contract
Minimum compatible receiver.
POST /upload
Authorization: Bearer hummic-dev-token
X-Hummic-Device-Id: meeting-room-01
X-Hummic-Recording-Id: R00000001
Content-Type: multipart/form-data
file=@R00000001.ogg
device_id=meeting-room-01
recording_id=R00000001
started_at=boot+12s
ended_at=boot+42s
format=opus
sample_rate=16000
channels=2
segment_index=0
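segment_count=1
- device_id: matches the X-Hummic-Device-Id header.
- recording_id: matches the X-Hummic-Recording-Id header.
- format: opus for new firmware.
- sample_rate: 16000.
- channels: 2.
- segment_index: 0.
- segment_count: 1.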
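Once the multipart body has been decoded into a dict of form fields, a receiver may want to check the metadata before storing the file. The sketch below is one way to do that under stated assumptions: the contract does not say which fields are mandatory, so `REQUIRED_FIELDS` is a guess, and the `invalid_field` error code and the rule that `segment_index` must be below `segment_count` are hypothetical. Only the `missing_field` body shape comes from this page.

```python
# Assumption: the page does not list which fields are mandatory.
REQUIRED_FIELDS = ("device_id", "recording_id")
INTEGER_FIELDS = ("sample_rate", "channels", "segment_index", "segment_count")


def validate_metadata(fields):
    """Return (http_status, json_body) for a dict of decoded form fields."""
    for name in REQUIRED_FIELDS:
        if not fields.get(name):
            return 422, {"ok": False, "error": "missing_field", "field": name}

    numbers = {}
    for name in INTEGER_FIELDS:
        if name in fields:
            try:
                numbers[name] = int(fields[name])
            except ValueError:
                # "invalid_field" is a hypothetical error code.
                return 422, {"ok": False, "error": "invalid_field", "field": name}

    # Assumed rule: a segment index must fall inside the declared count.
    index = numbers.get("segment_index", 0)
    count = numbers.get("segment_count", 1)
    if not 0 <= index < count:
        return 422, {"ok": False, "error": "invalid_field", "field": "segment_index"}

    return 200, {"ok": True, "recording_id": fields["recording_id"]}
```

Returning 422 here means the device marks the recording failed and does not retry, so a receiver should only reject fields it is sure are wrong.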
Agent mode contract
Stream live audio over WebSocket.
Preview: Agent mode is designed but is not in the current firmware. File mode above is what ships today, and the frames below are the target contract. Agent mode opens a single WebSocket and exchanges JSON control frames plus binary audio frames within one session. The device sends audio and button events; the agent runtime returns short text replies shown on the device screen. Audio frames are raw binary; all other messages are JSON with a type discriminator.
GET /agent/live
Upgrade: websocket
Authorization: Bearer hummic-dev-token
X-Hummic-Device-Id: meeting-room-01
# 1. device opens session
client -> {"type":"session_start","device_id":"meeting-room-01",
"format":"pcm_s16le","sample_rate":16000,"channels":1}
agent -> {"type":"session_state","state":"ready","session_id":"s-8f1c"}
# 2. device streams audio (binary frames) + control
client => <binary audio_chunk: 20ms PCM>
client -> {"type":"button_event","action":"press"}
# 3. agent replies (shown on screen)
agent -> {"type":"text_reply","text":"Logged the action item.","final":true}
agent -> {"type":"display_hint","line1":"Listening","line2":"Room 01"}
# 4. close
client -> {"type":"session_end"}
agent -> {"type":"session_state","state":"closed"}
- session_start: format, sample_rate, channels.
- button_event: press / long_press / release.
- text_reply: final:false for streamed partials.
- session_state: ready / busy / closed (+ session_id).
- session_end: acknowledged with session_state: closed.
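The split between binary audio frames and JSON control frames can be sketched as a single dispatch function on the agent side. This is an illustration of the preview contract, not shipped code: it assumes the WebSocket library hands text frames over as `str` and binary frames as `bytes`, the `session` dict and `handle_frame` name are hypothetical, and unknown message types are silently ignored as a design guess.

```python
import json

SAMPLE_RATE = 16000      # from the session_start example
CHANNELS = 1             # from the session_start example
BYTES_PER_SAMPLE = 2     # pcm_s16le is 16-bit
CHUNK_MS = 20
# 16000 samples/s * 0.020 s * 1 channel * 2 bytes = 640 bytes per chunk
CHUNK_BYTES = SAMPLE_RATE * CHUNK_MS // 1000 * CHANNELS * BYTES_PER_SAMPLE


def handle_frame(frame, session):
    """Route one frame; return a list of JSON strings to send back."""
    if isinstance(frame, (bytes, bytearray)):
        # Binary frames are raw audio; append to the session buffer.
        session["audio"].extend(frame)
        return []

    message = json.loads(frame)
    kind = message.get("type")
    if kind == "session_start":
        session["state"] = "ready"
        reply = {"type": "session_state", "state": "ready",
                 "session_id": session["id"]}
        return [json.dumps(reply)]
    if kind == "button_event":
        session["buttons"].append(message.get("action"))
        return []
    if kind == "session_end":
        session["state"] = "closed"
        return [json.dumps({"type": "session_state", "state": "closed"})]
    return []  # assumption: unknown types are ignored rather than rejected
```

Keeping the dispatch free of any socket calls makes it easy to exercise with recorded frames before a device that supports Agent mode is available.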
Quickstart
Receive your first recording in five minutes.
# 1. get the zero-dependency reference receiver (stdlib only, Python 3.8+)
curl -O https://hummic.ai/examples/upload-server/server.py
# 2. run it (pick any token — this becomes the device's bearer token)
python3 server.py --host 0.0.0.0 --port 8789 --token hummic-dev-token
# 3. confirm the endpoint is alive
curl -fsS http://127.0.0.1:8789/health
# -> {"ok":true}
Now prove the upload contract end to end with a throwaway file. The device sends exactly this request:
# make a test file (any bytes work — the receiver does not validate audio)
head -c 65536 /dev/urandom > R00000001.ogg
# or a real Opus file matching the contract (16 kHz, 2 channels):
# ffmpeg -f lavfi -i sine=frequency=440:duration=5 -ar 16000 -ac 2 -c:a libopus R00000001.ogg
curl --fail --max-time 30 --retry 3 --retry-delay 2 \
-H "Authorization: Bearer hummic-dev-token" \
-H "X-Hummic-Device-Id: meeting-room-01" \
-H "X-Hummic-Recording-Id: R00000001" \
-F "file=@R00000001.ogg" \
-F "device_id=meeting-room-01" \
-F "recording_id=R00000001" \
-F "format=opus" -F "sample_rate=16000" -F "channels=2" \
http://127.0.0.1:8789/upload
# -> {"ok":true,"upload_id":"R00000001"}
On the device, set Upload URL to http://<computer-ip>:8789/upload, Authorization Token to hummic-dev-token, and Device Name to something stable like meeting-room-01. The --max-time and --retry flags mirror the device's own timeout and backoff.
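For test harnesses that cannot shell out to curl, the same request can be sketched with the Python standard library. This is an approximation of the curl command above, not the device's implementation: the helper names are hypothetical, and the `audio/ogg` part content type is an assumption, since this page does not state what the device sends for the file part.

```python
import urllib.request
import uuid


def encode_multipart(fields, file_name, file_bytes, boundary=None):
    """Build a multipart/form-data body; return (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        chunks.append(part.encode("utf-8"))
    file_head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: audio/ogg\r\n\r\n"  # assumed part type
    )
    chunks.append(file_head.encode("utf-8") + file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_upload_request(url, token, device_id, recording_id, file_bytes):
    """Return an unsent urllib Request mirroring the curl example."""
    fields = {
        "device_id": device_id,
        "recording_id": recording_id,
        "format": "opus",
        "sample_rate": "16000",
        "channels": "2",
    }
    body, content_type = encode_multipart(fields, f"{recording_id}.ogg", file_bytes)
    headers = {
        "Authorization": f"Bearer {token}",
        "X-Hummic-Device-Id": device_id,
        "X-Hummic-Recording-Id": recording_id,
        "Content-Type": content_type,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the result to `urllib.request.urlopen(request, timeout=30)` sends it; a non-2xx status surfaces as `urllib.error.HTTPError`, which a harness can map onto the retry rules in the next section.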
Prefer to generate from the contract? openapi.yaml imports into Postman, or scaffold a client: npx @openapitools/openapi-generator-cli generate -i https://hummic.ai/openapi.yaml -g python
Errors & retry
Status codes, response bodies, backoff.
The device classifies the HTTP status into three outcomes: success, permanent failure, and retryable. Return a JSON body so failures are diagnosable, but the device only requires the status code.
| Status | Outcome | Device behavior |
|---|---|---|
| 200 / 201 / 204 | Success | Recording marked uploaded; removed from queue. |
| 400 / 422 | Permanent failure | Marked failed; not retried. Fix payload/endpoint. |
| 401 / 403 | Permanent failure | Marked failed; not retried. Check bearer token. |
| 408 / 429 | Retryable | Backoff and retry; honors Retry-After if present. |
| 500 / 502 / 503 / 504 | Retryable | Backoff and retry until queue limit reached. |
| timeout / no route | Retryable | Cache locally; retry when WiFi recovers. |
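The table above can be sketched as a small classifier, useful when writing a test harness that imitates the device. The function name and the string labels are hypothetical. The sketch treats every 2xx and 5xx code like the listed ones, following the summary at the top of this page, and returns "unspecified" for codes the table does not cover rather than guessing the firmware's behavior.

```python
def classify_status(status):
    """Map an HTTP status (or None for timeout / no route) to an outcome."""
    if status is None:
        return "retryable"          # timeout / no route
    if 200 <= status <= 299:
        return "success"
    if status in (408, 429) or 500 <= status <= 599:
        return "retryable"
    if status in (400, 401, 403, 422):
        return "permanent"
    return "unspecified"            # not covered by the table
```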
# success response body (optional, ignored except status)
HTTP/1.1 200 OK
Content-Type: application/json
{"ok": true, "recording_id": "R00000001", "stored_as": "s3://bucket/R00000001.ogg"}
# failure response body (recommended for diagnosis)
HTTP/1.1 422 Unprocessable Entity
Content-Type: application/json
{"ok": false, "error": "missing_field", "field": "recording_id"}
Backoff: retryable failures use exponential backoff starting at 2s, doubling to a 60s cap, with jitter. A Retry-After header (seconds or HTTP-date) overrides the computed delay. Recordings stay cached on device until a 2xx or a permanent failure clears them.
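The backoff rule can be sketched as follows. This is an illustration, not the firmware's algorithm: the page does not specify the jitter shape, so the sketch assumes the delay is drawn from the upper half of the computed window, and the function names and the `rng` parameter (injected so the result is testable) are hypothetical.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

BASE_DELAY = 2.0   # seconds, first retry window
MAX_DELAY = 60.0   # seconds, cap


def parse_retry_after(value, now=None):
    """Return seconds to wait for a Retry-After value, or None if unparseable."""
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def next_delay(attempt, retry_after=None, rng=random.random):
    """Delay before retry number `attempt` (0-based)."""
    if retry_after is not None:
        override = parse_retry_after(retry_after)
        if override is not None:
            return override  # Retry-After overrides the computed delay
    # Doubling window: 2, 4, 8, ... capped at 60 seconds.
    window = min(MAX_DELAY, BASE_DELAY * 2 ** min(attempt, 16))
    # Assumed jitter: uniform in [window / 2, window].
    return window / 2 + rng() * window / 2
```

A receiver that needs breathing room can therefore answer 429 or 503 with `Retry-After: 30` instead of relying on the computed window.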
Standards direction
Use the boring standard first. Upgrade when uploads need it.
P0 file upload
multipart/form-data is the current firmware contract and is specified by RFC 7578. It is the simplest way to accept a file plus metadata in every common web framework.
P1 resumable upload
tus is the preferred open protocol for resumable uploads over HTTP. It is the natural next step for long recordings, unstable WiFi, and large files.
Object storage
S3-compatible multipart upload or presigned URLs are useful when recordings should go directly into object storage instead of through an app server.
Live agents
WebSocket carries Agent mode, which is in preview and not in the current firmware; WebRTC is reserved for lower-latency live audio. Neither is needed for the first reliable file-upload path.