# Snapcast binary protocol
Each message sent with the Snapcast binary protocol is split into two parts:
- A base message that provides general information such as the time sent/received, the type of the message, the message size, etc.
- A typed message that carries the rest of the information

The protocol uses little-endian byte order.
## Client joining process
When a client joins a server, the following exchanges happen:
1. Client opens a TCP socket to the server (default port is 1704)
1. Client sends a [Hello](#hello) message
1. Server sends a [Server Settings](#server-settings) message
1. Server sends a [Codec Header](#codec-header) message
    1. Until the server sends this, the client shouldn't play any [Wire Chunk](#wire-chunk) messages
1. The server will now send [Wire Chunk](#wire-chunk) messages, which can be fed to the audio decoder.
1. When it comes time for the client to disconnect, the socket can just be closed.
1. Client periodically sends a [Time](#time) message, carrying a sent timestamp `t_client-sent`
    1. Receives a Time response containing the client-to-server time delta `latency_c2s = t_server-recv - t_client-sent + t_network-latency` and the server sent timestamp `t_server-sent`
    1. Calculates `latency_s2c = t_client-recv - t_server-sent + t_network-latency`
    1. Calculates the time difference between server and client as `(latency_c2s - latency_s2c) / 2`, eliminating the network latency (assumed to be symmetric)
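
The time synchronization arithmetic above can be sketched as follows. This is a minimal illustration, not Snapcast's implementation: the function name and the choice of integer microseconds are assumptions, and the sample numbers are made up (server clock 5 s ahead of the client, 10 ms one-way network latency). Both measured deltas already contain the network latency, which cancels in the difference.

```python
def estimate_time_diff(latency_c2s_us: int, t_server_sent_us: int, t_client_recv_us: int) -> int:
    """Estimate the server-minus-client clock offset in microseconds."""
    # Measured on the client: negated clock offset plus network latency.
    latency_s2c_us = t_client_recv_us - t_server_sent_us
    # The (assumed symmetric) network latency cancels out here.
    return (latency_c2s_us - latency_s2c_us) // 2


# Hypothetical exchange: the client sends at 100.000 s (client clock), the server
# receives at 105.010 s and replies at 105.020 s (server clock), and the client
# receives the reply at 100.030 s (client clock).
diff_us = estimate_time_diff(
    latency_c2s_us=5_010_000,
    t_server_sent_us=105_020_000,
    t_client_recv_us=100_030_000,
)
print(diff_us)
```

A positive result means the server clock is ahead of the client clock.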
## Messages
| Typed Message ID | Name | Dir | Notes |
|------------------|--------------------------------------|------|---------------------------------------------------------------------------|
| 0 | [Base](#base) | | The beginning of every message containing data about the typed message |
| 1 | [Codec Header](#codec-header) | S->C | The codec-specific data to put at the start of a stream to allow decoding |
| 2 | [Wire Chunk](#wire-chunk) | S->C | A part of an audio stream |
| 3 | [Server Settings](#server-settings) | S->C | Settings set from the server like volume, latency, etc |
| 4 | [Time](#time) | C->S<br>S->C | Used for synchronizing time with the server |
| 5 | [Hello](#hello) | C->S | Sent by the client when connecting with the server |
| 7 | [Client Info](#client-info) | C->S | Update the server when relevant information changes (e.g. client volume) |
| 8 | [Error](#error) | S->C | Error response, used e.g. for missing authentication |
### Base
| Field | Type | Description |
|-----------------------|--------|---------------------------------------------------------------------------------------------------|
| type | uint16 | Should be one of the typed message IDs |
| id | uint16 | Used in requests to identify the message (not always used) |
| refersTo | uint16 | Used in responses to identify which request message ID this is responding to |
| sent.sec | int32 | The second value of the timestamp when this message was sent. Filled in by the sender. |
| sent.usec | int32 | The microsecond value of the timestamp when this message was sent. Filled in by the sender. |
| received.sec | int32 | The second value of the timestamp when this message was received. Filled in by the receiver. |
| received.usec | int32 | The microsecond value of the timestamp when this message was received. Filled in by the receiver. |
| size | uint32 | Total number of bytes of the following typed message |
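
As a rough sketch, the Base layout in the table maps onto the Python `struct` format `<HHHiiiiI` (little-endian, no padding, 26 bytes). The names `BaseMessage`, `pack_base`, and `unpack_base` are illustrative and not part of Snapcast.

```python
import struct
from typing import NamedTuple

BASE_FORMAT = "<HHHiiiiI"                 # 3 x uint16, 4 x int32, 1 x uint32
BASE_SIZE = struct.calcsize(BASE_FORMAT)  # 26 bytes


class BaseMessage(NamedTuple):
    msg_type: int       # `type` in the table
    msg_id: int         # `id` in the table
    refers_to: int
    sent_sec: int
    sent_usec: int
    received_sec: int
    received_usec: int
    size: int           # byte length of the typed message that follows


def pack_base(msg: BaseMessage) -> bytes:
    """Serialize a base message into its 26-byte wire form."""
    return struct.pack(BASE_FORMAT, *msg)


def unpack_base(data: bytes) -> BaseMessage:
    """Parse the first 26 bytes of `data` as a base message."""
    return BaseMessage(*struct.unpack_from(BASE_FORMAT, data, 0))


header = BaseMessage(msg_type=5, msg_id=1, refers_to=0, sent_sec=12, sent_usec=500_000,
                     received_sec=0, received_usec=0, size=42)
wire = pack_base(header)
print(len(wire), unpack_base(wire).size)
```

A reader would typically receive exactly `BASE_SIZE` bytes, unpack them, and then read `size` more bytes for the typed message.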
### Codec Header
| Field | Type | Description |
|------------|---------|-------------------------------------------------------------|
| codec_size | uint32 | Length of the codec string (not including a null character) |
| codec | char[] | String describing the codec (not null terminated) |
| size | uint32 | Size of the following payload |
| payload | char[] | Buffer of data containing the codec header |

The payload depends on the codec in use:
- FLAC: the FLAC audio file header, as described [here](https://www.the-roberts-family.net/metadata/flac.html#:~:text=Overall%20Structure&text=It%20has%20four%20parts%3A%20a,and%20the%20actual%20audio%20data.). The decoder must be initialized with this header.
- Ogg: the Vorbis stream header, as described [here](https://xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-610004.2). The decoder must be initialized with this header.
- PCM: a RIFF WAVE header, as described [here](https://de.wikipedia.org/wiki/RIFF_WAVE). PCM is not encoded, but the decoder must know the sample rate, bit depth and number of channels, which are encoded in the header.
- Opus: a dummy header is sent, containing a 4 byte ID (0x4F505553, ASCII for "OPUS"), 4 byte sample rate, 2 byte bit depth, 2 byte channel count (all little endian)
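
A minimal sketch of reading a Codec Header body and, for Opus, the dummy header described above. The function names are hypothetical, and decoding the codec string as ASCII is an assumption. The Opus ID is read as a little-endian `uint32` and compared against `0x4F505553`, following the "all little endian" note.

```python
import struct

OPUS_ID = 0x4F505553  # value given in the Opus bullet above


def parse_codec_header(body: bytes) -> tuple[str, bytes]:
    """Split a Codec Header body into (codec name, codec payload)."""
    (codec_size,) = struct.unpack_from("<I", body, 0)
    codec = body[4:4 + codec_size].decode("ascii")  # assumption: ASCII codec names
    (payload_size,) = struct.unpack_from("<I", body, 4 + codec_size)
    start = 8 + codec_size
    return codec, body[start:start + payload_size]


def parse_opus_header(payload: bytes) -> dict:
    """Decode the 12-byte Opus dummy header: ID, sample rate, bit depth, channels."""
    opus_id, rate, bits, channels = struct.unpack_from("<IIHH", payload, 0)
    if opus_id != OPUS_ID:
        raise ValueError("not an Opus dummy header")
    return {"sample_rate": rate, "bit_depth": bits, "channels": channels}


# Hand-built example body: codec "opus" with 48 kHz, 16 bit, stereo.
opus_payload = struct.pack("<IIHH", OPUS_ID, 48000, 16, 2)
body = struct.pack("<I", 4) + b"opus" + struct.pack("<I", len(opus_payload)) + opus_payload
codec, payload = parse_codec_header(body)
print(codec, parse_opus_header(payload))
```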
### Wire Chunk
| Field | Type | Description |
|----------------|---------|---------------------------------------------------------------------------------------|
| timestamp.sec | int32 | The second value of the timestamp when this part of the stream was recorded |
| timestamp.usec | int32 | The microsecond value of the timestamp when this part of the stream was recorded |
| size | uint32 | Size of the following payload |
| payload | char[] | Buffer of data containing the encoded PCM data (a decodable chunk per message) |
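
The Wire Chunk layout can be read in the same way. The sketch below uses a hypothetical function name and returns the timestamp as integer microseconds, which is a convenience chosen for this example rather than something the protocol prescribes.

```python
import struct


def parse_wire_chunk(body: bytes) -> tuple[int, bytes]:
    """Return (timestamp in microseconds, encoded audio payload)."""
    sec, usec, size = struct.unpack_from("<iiI", body, 0)
    payload = body[12:12 + size]
    if len(payload) != size:
        raise ValueError("truncated wire chunk")
    return sec * 1_000_000 + usec, payload


# Hand-built example: a chunk recorded at 3.25 s carrying four payload bytes.
chunk = struct.pack("<iiI", 3, 250_000, 4) + b"\x01\x02\x03\x04"
timestamp_us, audio = parse_wire_chunk(chunk)
print(timestamp_us, audio)
```

The payload is then passed to the decoder that was initialized from the Codec Header.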
### Server Settings
| Field | Type | Description |
|---------|--------|----------------------------------------------------------|
| size | uint32 | Size of the following JSON string |
| payload | char[] | JSON string containing the message (not null terminated) |

Sample JSON payload (whitespace added for readability):
```json
{
    "bufferMs": 1000,
    "latency": 0,
    "muted": false,
    "volume": 100
}
```
- `volume` ranges from 0 to 100, inclusive
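
A length-prefixed JSON body such as Server Settings could be decoded as sketched below. The helper name is hypothetical, and UTF-8 is assumed as the JSON text encoding.

```python
import json
import struct


def parse_json_body(body: bytes) -> dict:
    """Decode a typed message body made of a uint32 size followed by a JSON string."""
    (size,) = struct.unpack_from("<I", body, 0)
    return json.loads(body[4:4 + size].decode("utf-8"))  # assumption: UTF-8 text


# Hand-built example body using the sample payload above.
text = b'{"bufferMs": 1000, "latency": 0, "muted": false, "volume": 100}'
settings = parse_json_body(struct.pack("<I", len(text)) + text)
print(settings["volume"], settings["muted"])
```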
### Time
| Field | Type | Description |
|----------------|---------|------------------------------------------------------------------------|
| latency.sec | int32 | The second value of the latency between the server and the client |
| latency.usec | int32 | The microsecond value of the latency between the server and the client |
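
The sketch below assembles a complete Time request (Base followed by the Time body) and converts a Time body to microseconds. The helper names are illustrative, and sending a zeroed latency in the request is an assumption made for this example.

```python
import struct

TIME_TYPE = 4              # typed message ID of Time
BASE_FORMAT = "<HHHiiiiI"  # Base fields, little-endian
TIME_FORMAT = "<ii"        # latency.sec, latency.usec


def build_time_request(msg_id: int, sent_sec: int, sent_usec: int) -> bytes:
    """Build Base + Time body for a client request."""
    base = struct.pack(BASE_FORMAT, TIME_TYPE, msg_id, 0, sent_sec, sent_usec,
                       0, 0, struct.calcsize(TIME_FORMAT))
    # Assumption: the client leaves latency at zero; the server fills it in its reply.
    return base + struct.pack(TIME_FORMAT, 0, 0)


def parse_time_body(body: bytes) -> int:
    """Return the latency carried in a Time body, in microseconds."""
    sec, usec = struct.unpack_from(TIME_FORMAT, body, 0)
    return sec * 1_000_000 + usec


request = build_time_request(msg_id=1, sent_sec=12, sent_usec=500_000)
print(len(request))
```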
### Hello
| Field | Type | Description |
|---------|--------|----------------------------------------------------------|
| size | uint32 | Size of the following JSON string |
| payload | char[] | JSON string containing the message (not null terminated) |

Sample JSON payload (whitespace added for readability):
```json
{
    "Arch": "x86_64",
    "Auth": {
        "param": "YmFkYWl4OnBhc3N3ZA==",
        "scheme": "Basic"
    },
    "ClientName": "Snapclient",
    "HostName": "my_hostname",
    "ID": "00:11:22:33:44:55",
    "Instance": 1,
    "MAC": "00:11:22:33:44:55",
    "OS": "Arch Linux",
    "SnapStreamProtocolVersion": 2,
    "Version": "0.32.0"
}
```
The field `Auth` is optional and only used if authentication and authorization are enabled on the server.
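
A Hello body could be assembled as sketched below. The helper names are hypothetical. The sample `param` above decodes to `badaix:passwd`, so this sketch assumes the `Basic` scheme carries base64 of `user:password`, as in HTTP Basic authentication.

```python
import base64
import json
import struct


def basic_auth_param(user: str, password: str) -> str:
    """Assumption: `param` is base64("user:password"), as in HTTP Basic auth."""
    return base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")


def build_hello_body(hello: dict) -> bytes:
    """Serialize the Hello JSON with its uint32 size prefix."""
    payload = json.dumps(hello).encode("utf-8")
    return struct.pack("<I", len(payload)) + payload


hello = {
    "Arch": "x86_64",
    "Auth": {"param": basic_auth_param("badaix", "passwd"), "scheme": "Basic"},
    "ClientName": "Snapclient",
    "HostName": "my_hostname",
    "ID": "00:11:22:33:44:55",
    "Instance": 1,
    "MAC": "00:11:22:33:44:55",
    "OS": "Arch Linux",
    "SnapStreamProtocolVersion": 2,
    "Version": "0.32.0",
}
body = build_hello_body(hello)
print(hello["Auth"]["param"])
```

The resulting bytes form the typed message; a Base message with `type` 5 and `size` set to `len(body)` would precede them on the wire.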
### Client Info
| Field | Type | Description |
|---------|--------|----------------------------------------------------------|
| size | uint32 | Size of the following JSON string |
| payload | char[] | JSON string containing the message (not null terminated) |

Sample JSON payload (whitespace added for readability):
```json
{
    "volume": 100,
    "muted": false
}
```
- `volume` ranges from 0 to 100, inclusive
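
A small sketch of building a Client Info body while enforcing the documented volume range. The function name is hypothetical, and rejecting out-of-range values on the client side is a design choice of this example, not a documented protocol requirement.

```python
import json
import struct


def build_client_info_body(volume: int, muted: bool) -> bytes:
    """Serialize a Client Info JSON payload with its uint32 size prefix."""
    if not 0 <= volume <= 100:
        raise ValueError("volume must be between 0 and 100 inclusive")
    payload = json.dumps({"volume": volume, "muted": muted}).encode("utf-8")
    return struct.pack("<I", len(payload)) + payload


info_body = build_client_info_body(volume=100, muted=False)
print(info_body[4:].decode("utf-8"))
```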
### Error
| Field | Type | Description |
|---------|--------|----------------------------------------------------------|
| code | uint32 | Error code |
| size | uint32 | Size of the following error string |
| error | char[] | String containing the error (not null terminated) |
| size | uint32 | Size of the following error message |
| error | char[] | String containing error details (not null terminated) |
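
An Error body holds a code followed by two length-prefixed strings, which could be read as sketched below. The function name, the UTF-8 decoding, and the sample code and strings are all assumptions made for illustration.

```python
import struct


def parse_error_body(body: bytes) -> tuple[int, str, str]:
    """Return (code, error string, error details) from an Error body."""
    (code,) = struct.unpack_from("<I", body, 0)
    offset = 4
    strings = []
    for _ in range(2):  # the error string, then the error details
        (size,) = struct.unpack_from("<I", body, offset)
        offset += 4
        strings.append(body[offset:offset + size].decode("utf-8"))  # assumption: UTF-8
        offset += size
    return code, strings[0], strings[1]


# Hand-built example with made-up values.
err, details = b"Unauthorized", b"missing credentials"
sample = (struct.pack("<I", 401)
          + struct.pack("<I", len(err)) + err
          + struct.pack("<I", len(details)) + details)
print(parse_error_body(sample))
```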