# Low Level Information for the SRT project
## Introduction
This document contains information on various topics related to
the SRT source code, including descriptions of some cross-source analysis that would
not be obvious to a source code reviewer. It is not complete documentation of
anything; rather, it is a collection of various kinds of information gathered during
development and even reverse engineering.
## Mutex locking
This analysis is the result of finding a lot of cascaded mutex locking in the
SRT code. A more detailed analysis is still required as to which mutex is
supposed to protect which data.
Here is the info collected so far:
### Data structures
The overall structure of the object database, involving sockets and groups,
is as follows (in pseudo-language):
```
CUDTUnited (singleton) {
    CONTAINER<CUDTSocket> m_Sockets;
    CONTAINER<CUDTSocket> m_ClosedSockets;
    CONTAINER<CUDTGroup> m_Groups;
    CONTAINER<CUDTGroup> m_ClosedGroups;
}

CUDTGroup {
    type SocketData { CUDTSocket* ps; SRTSOCKET id; int state; ... }
    CONTAINER<SocketData> m_Group;
}
```
Dead sockets (either closed manually or broken after losing the connection) are
first moved from `m_Sockets` to `m_ClosedSockets`. The GC thread takes care of
deleting them physically, after making sure that none of their internal
facilities still contain any data of interest.
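
To illustrate this (with hypothetical names only; this is not the actual SRT code), the GC pass over `m_ClosedSockets` can be pictured roughly like this:

```cpp
#include <map>

// Hypothetical sketch of the GC pass described above. The real SRT GC
// thread performs more elaborate checks before physical deletion.
struct ClosedSocketSketch
{
    bool busy = false; // any internal facility still holding data of interest?
    bool isClearToDelete() const { return !busy; }
};

// Stands in for CUDTUnited::m_ClosedSockets (keyed by socket id).
std::map<int, ClosedSocketSketch*> closed_sockets;

void gcPass()
{
    for (auto it = closed_sockets.begin(); it != closed_sockets.end();)
    {
        if (it->second->isClearToDelete())
        {
            delete it->second;             // physical deletion
            it = closed_sockets.erase(it); // remove from m_ClosedSockets
        }
        else
            ++it;
    }
}
```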
Groups may only be closed manually. A closed group is moved
to `m_ClosedGroups`, and the GC thread takes care of deleting it, but only
once its usage counter reaches 0. Every call to an API function (as well as
the TSBPD thread) increases the usage counter at the beginning and decreases it
upon exit. A group may be closed in one thread while still being used in
another; in that case the group persists until the current API function
using it exits and decreases the usage counter back to 0.
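
A minimal sketch of this usage-counter mechanism (hypothetical names; the real code integrates this with its own synchronization primitives):

```cpp
#include <atomic>

// Hypothetical RAII guard mirroring the usage counter described above:
// every API function (and the TSBPD thread) constructs one on entry,
// so the GC thread may only delete the group once the counter is 0.
class GroupUsageGuard
{
public:
    explicit GroupUsageGuard(std::atomic<int>& counter)
        : m_counter(counter)
    {
        ++m_counter; // entering: keep the group alive
    }
    ~GroupUsageGuard()
    {
        --m_counter; // leaving: group becomes deletable once this reaches 0
    }

private:
    std::atomic<int>& m_counter;
};
```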
Containers and contents guarded by mutexes:

`CUDTUnited::m_GlobControlLock` - guards all containers in CUDTUnited.

`CUDTSocket::m_ControlLock` - guards internal operations performed on a particular
socket, whose existence is assumed. (A socket always exists physically until it is
deleted while residing in `m_ClosedSockets`, and while it is in `m_ClosedSockets` it
will not be deleted until it is free from any ongoing operation; at the same time,
a socket moved to `m_ClosedSockets` is treated as nonexistent by any newly called
API function, even though it still physically exists.)

`CUDTGroup::m_GroupLock` - guards the `m_Group` container inside a group, which
collects the member sockets.
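
For instance, a container lookup must be done under `m_GlobControlLock`, while subsequent per-socket operations take that socket's `m_ControlLock`. A simplified sketch (illustrative names, not the actual implementation):

```cpp
#include <map>
#include <mutex>

struct SocketEntry
{
    std::mutex control_lock; // stands in for CUDTSocket::m_ControlLock
    // ... socket state ...
};

std::mutex glob_control_lock;        // stands in for CUDTUnited::m_GlobControlLock
std::map<int, SocketEntry*> sockets; // stands in for CUDTUnited::m_Sockets

SocketEntry* locateSocketSketch(int id)
{
    // The container itself may only be searched under m_GlobControlLock.
    std::lock_guard<std::mutex> g(glob_control_lock);
    auto it = sockets.find(id);
    return it == sockets.end() ? nullptr : it->second;
}

void operateOnSocket(int id)
{
    if (SocketEntry* s = locateSocketSketch(id))
    {
        // Per-socket operations are then guarded by the socket's own lock.
        std::lock_guard<std::mutex> g(s->control_lock);
        // ... internal operation on the socket ...
    }
}
```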
Unfortunately, there are many situations where multiple locks have to be held
at the same time. The hierarchy of mutexes shown below must therefore be
preserved everywhere in the code.

As the mutexes cannot be ordered into one unambiguous sequence, they are
presented as two trees, with some possible branches inside. A mutex marked
with (T) is terminal, that is, no other locks may be acquired in a section
where this mutex is held.
### Mutex ordering information
Note that the list isn't necessarily complete, but it should contain all
mutexes for which the locking order must be preserved.
```
- CUDTSocket::m_ControlLock
    - CUDT::m_ConnectionLock
        - CRendezvousQueue::m_RIDVectorLock
        - CUDTUnited::m_GlobControlLock
            - CUDTGroup::m_GroupLock
        - CUDT::m_RecvAckLock || CEPoll::m_EPollLock(T)

----------------

- CUDTUnited::m_GlobControlLock
    - CUDTGroup::m_GroupLock || CSndUList::m_ListLock(T)
        - CUDT::m_ConnectionLock
            - CRendezvousQueue::m_RIDVectorLock
        - CUDT::m_SendLock
            - CUDT::m_RecvLock
                - CUDT::m_RecvBufferLock
                    - CUDT::m_RecvAckLock || CUDT::m_SendBlockLock

------------------
```
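
As an illustration of the discipline this hierarchy implies, here is a sketch using plain `std::mutex` placeholders (not the actual SRT types): any code that needs both `m_GlobControlLock` and `m_GroupLock` must take them in exactly this order.

```cpp
#include <mutex>

// Placeholders for CUDTUnited::m_GlobControlLock and CUDTGroup::m_GroupLock.
std::mutex glob_control_lock;
std::mutex group_lock;

void updateGroupMembership()
{
    // Correct: the higher mutex first, per the second tree above.
    std::lock_guard<std::mutex> g1(glob_control_lock);
    std::lock_guard<std::mutex> g2(group_lock);
    // ... modify the group containers ...
}

// Taking group_lock first and then glob_control_lock anywhere else in the
// code would create a cycle and hence a potential deadlock.
```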
ANALYSIS ON: `m_ConnectionLock`

**NOTE**: A deadlock may potentially be reported due to lock
inversion between `CRcvQueue::m_LSLock` and `CUDT::m_ConnectionLock`.
This report is a false positive.
To follow the description below, you need to know exactly which objects
are taken into account. The CRcvQueue object is an underlying
object of CMultiplexer, which is shared between sockets; in particular, it
is always shared between a listener socket and an accepted
socket spawned from it. There are then two scenarios here that seem
to be in conflict:
1. Setting a new listener on the multiplexer by calling `srt_listen()`,
or removing the listener from the multiplexer by calling `srt_close()`
on the listener socket. This calls `CUDT::setListenState`, which is executed under
a lock on `m_ConnectionLock` of the socket being set up as the listener.
Then, inside `CRcvQueue::setListener`, it locks `m_LSLock`.

2. Processing a new connection from an incoming connection request
in `CUDT::processConnectRequest`. This is called under the lock
of `m_LSLock`, and it then calls `CUDT::open` and `CUDT::acceptAndRespond`,
which lock `m_ConnectionLock`.
However, the methods that potentially lock `m_ConnectionLock` are
called for the sake of the **newly created socket**, which is about to
be reported as the accepted one from `srt_accept()`, **not the listener
socket**.
That said, there is a theoretical situation in which these two could collide:
the mutexes belong to two different objects, and the accepted socket always
shares the receiver queue with the listener socket. If `srt_listen()` is called
on an accepted socket (which will always fail and makes no sense, but still
must be somehow recognized and rejected by the internals), it goes through the
sequence of first `m_ConnectionLock` and then `m_LSLock`, while for the same
object the earlier order was reversed: `m_LSLock` was locked before the
socket's creation in response to the connection handler, and `m_ConnectionLock`
was then applied to it.
These two situations, however, cannot occur simultaneously. The activities
during connection processing that apply `m_ConnectionLock` under a lock of
`m_LSLock` are performed while the socket is just being created and its
connection is still being processed. Calling `srt_listen()` on it, just as well
as `srt_close()`, or even breaking it, only becomes possible once this activity
has finished and no more locks are applied on `m_ConnectionLock`.
The only theoretical possibility remains if you try to call `srt_listen()` on
the socket being accepted from within the listener callback. The only result
is that this function call hangs, unable to lock `m_LSLock`.
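
For clarity, here is a minimal sketch (illustrative names only, using plain `std::mutex`) of why the inversion report is a false positive: the two `m_ConnectionLock` instances belong to different sockets, so the apparent A-B / B-A pattern never closes a cycle on the same pair of mutexes.

```cpp
#include <mutex>

struct SocketSketch
{
    std::mutex connection_lock; // stands in for CUDT::m_ConnectionLock
};

std::mutex ls_lock; // stands in for CRcvQueue::m_LSLock (shared per multiplexer)

// Scenario 1: srt_listen() on the listener socket.
void setListenStateSketch(SocketSketch& listener)
{
    std::lock_guard<std::mutex> c(listener.connection_lock); // first
    std::lock_guard<std::mutex> l(ls_lock);                  // then m_LSLock
}

// Scenario 2: processing an incoming connection request.
void processConnectRequestSketch(SocketSketch& accepted)
{
    std::lock_guard<std::mutex> l(ls_lock);                  // first m_LSLock
    std::lock_guard<std::mutex> c(accepted.connection_lock); // DIFFERENT socket
}
```

The following listing summarizes the locking performed along the main call flows: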
```
-- CUDT::startConnect flow

CUDTUnited::connectIn --> [LOCKED s->m_ControlLock]
    CUDT::open --> [MAYBE_LOCKED m_ConnectionLock, if bind() not called]
        CUDT::clearData --> [LOCKED m_StatsLock]
        CUDTUnited::updateMux --> [LOCKED m_GlobControlLock]
    {
        [SCOPE UNLOCK s->m_ControlLock, if blocking mode]
        CUDT::startConnect --> [LOCKED m_ConnectionLock]
            CRcvQueue::registerConnector
                CRendezvousQueue::insert --> [LOCKED CRendezvousQueue::m_RIDVectorLock]
    }
END.

-- CUDT::groupConnect flow

CUDT::groupConnect (no locks)
    CUDT::setOpt [LOCKS m_ConnectionLock, m_SendLock, m_RecvLock]
    {
        [LOCKS m_GlobControlLock]
        CUDTGroup::add [LOCKS m_GroupLock]
    }
    CUDT::connectIn --> continue with the startConnect flow

-- CUDTUnited::listen (API function)

CUDTUnited::listen
    CUDTUnited::locateSocket [LOCKS m_GlobControlLock]
    {
        [SCOPE LOCK s->m_ControlLock]
        CUDT::setListenState --> [LOCKED m_ConnectionLock]
            CRcvQueue::setListener --> [LOCKED m_LSLock]
    }

-- CUDT::processAsyncConnectRequest

CRcvQueue::worker ->
    ...
    CRcvQueue::worker_TryAsyncRend_OrStore
        CUDT::processAsyncConnectResponse --> [LOCKED m_ConnectionLock]
            CUDT::processConnectResponse
                CUDT::postConnect
                    CUDT::interpretSrtHandshake ->
                        [IF group extension found]
                        CUDT::interpretGroup
                        {
                            [SCOPE LOCK m_GlobControlLock]
                            [IF Responder]
                            {
                                CUDT::makeMePeerOf
                                    [LOCKS m_GroupLock]
                                    CUDTGroup::syncWithSocket
                                CUDTGroup::find --> [LOCKED m_GroupLock]
                            }
                            debugGroup --> [LOCKED m_GroupLock]
                        }

-- CUDT::acceptAndRespond

CRcvQueue::worker_ProcessConnectionRequest
{
    [SCOPE LOCK m_LSLock]
    CUDT::processConnectRequest
        CUDTUnited::newConnection
            locateSocket --> [LOCKED m_GlobControlLock]
            locatePeer --> [LOCKED m_GlobControlLock]
            [IF failure, LOCK m_AcceptLock]
            generateSocketID --> [LOCKED m_IDLock]
            CUDT::open --> [LOCKED m_ConnectionLock]
            CUDT::updateListenerMux --> [LOCKED m_GlobControlLock]
            CUDT::acceptAndRespond --> [LOCKED m_ConnectionLock]
                CUDT::interpretSrtHandshake ->
                    [IF group extension found]
                    CUDT::interpretGroup
                    {
                        [SCOPE LOCK m_GlobControlLock]
                        [IF Responder]
                        {
                            CUDT::makeMePeerOf
                                [LOCKS m_GroupLock]
                                CUDTGroup::syncWithSocket
                            CUDTGroup::find --> [LOCKED m_GroupLock]
                        }
                        debugGroup --> [LOCKED m_GroupLock]
                    }
                {
                    [SCOPE LOCK m_GlobControlLock]
                    CUDT::synchronizeWithGroup --> [LOCKED m_GroupLock]
                }
                CRcvQueue::setNewEntry --> [LOCKED CRcvQueue::m_IDLock]
            {
                [SCOPE LOCK m_GlobControlLock]
                {
                    [SCOPE LOCK m_GroupLock]
                }
            }
            {
                [SCOPE LOCK m_AcceptLock]
                CEPoll::update_events
            }
            [IF Rollback]
            CUDT::closeInternal
                [LOCKING m_EPollLock]
                {
                    [SCOPE LOCK m_ConnectionLock]
                    [SCOPE LOCK m_SendLock]
                    [SCOPE LOCK m_RecvLock]
                    [LOCKING m_RcvBufferLock]
                }
                {
                    [SCOPE LOCK m_GlobControlLock]
                    CUDT::removeFromGroup --> [LOCKED m_GroupLock]
                }
            CEPoll::update_events
}

-- CUDT::bstats: TRT-LOCKED m_ConnectionLock

-- CUDT::packData

CSndQueue::worker
    CSndUList::pop --> [LOCKED m_ListLock]
        CUDT::packData
```