=====================================
Server Discovery And Monitoring Tests
=====================================
.. contents::
----
The YAML and JSON files in this directory tree are platform-independent tests
that drivers can use to prove their conformance to the
Server Discovery And Monitoring Spec.
Additional prose tests, which cannot be represented as spec tests, are
described below and MUST be implemented.
Version
-------
Files in the "specifications" repository have no version scheme. They are not
tied to a MongoDB server version.
Format
------
Each YAML file has the following keys:

- description: A textual description of the test.

- uri: A connection string.

- phases: An array of "phase" objects.
  A phase of the test optionally sends inputs to the client,
  then tests the client's resulting TopologyDescription.

Each phase object has the following keys:

- description: (optional) A textual description of this phase.

- responses: (optional) An array of "response" objects. If not provided,
  the test runner should construct the client and perform assertions specified
  in the outcome object without processing any responses.

- applicationErrors: (optional) An array of "applicationError" objects.

- outcome: An "outcome" object representing the TopologyDescription.
A response is a pair of values:

- The source, for example "a:27017".
  This is the address the client sent the "hello" or legacy hello command to.

- A hello or legacy hello response, for example
  ``{ok: 1, helloOk: true, isWritablePrimary: true}``.
  If the response includes an electionId it is shown in extended JSON like
  ``{"$oid": "000000000000000000000002"}``.

The empty response ``{}`` indicates a network error
when attempting to call "hello" or legacy hello.
An "applicationError" object has the following keys:

- address: The source address, for example "a:27017".

- generation: (optional) The error's generation number, for example ``1``.
  When absent this value defaults to the pool's current generation number.

- maxWireVersion: The ``maxWireVersion`` of the connection the error occurs
  on, for example ``9``. Added to support testing the behavior of
  "not writable primary" errors on <4.2 and >=4.2 servers.

- when: A string describing when this mock error should occur. Supported
  values are:

  - "beforeHandshakeCompletes": Simulate this mock error as if it occurred
    during a new connection's handshake for an application operation.

  - "afterHandshakeCompletes": Simulate this mock error as if it occurred
    on an established connection for an application operation (i.e. after
    the connection pool check out succeeds).

- type: The type of error to mock. Supported values are:

  - "command": A command error. Always accompanied with a "response".

  - "network": A non-timeout network error.

  - "timeout": A network timeout error.

- response: (optional) A command error response, for example
  ``{ok: 0, errmsg: "not primary"}``. Present if and only if ``type`` is
  "command". Note the server only returns "not primary" if the "hello" command
  has been run on this connection. Otherwise the legacy error message is
  returned.
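Putting these keys together, an ``applicationErrors`` entry in a phase might
look like the following (illustrative values only, not taken from a real test
file):

```yaml
applicationErrors:
  - address: "a:27017"
    maxWireVersion: 9
    when: "afterHandshakeCompletes"
    type: "command"
    response:
      ok: 0
      errmsg: "not primary"
```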
In non-monitoring tests, an "outcome" represents the correct
TopologyDescription that results from processing the responses in the phases
so far. It has the following keys:
- topologyType: A string like "ReplicaSetNoPrimary".
- setName: A string with the expected replica set name, or null.
- servers: An object whose keys are addresses like "a:27017", and whose values
are "server" objects.
- logicalSessionTimeoutMinutes: null or an integer.
- maxSetVersion: absent or an integer.
- maxElectionId: absent or a BSON ObjectId.
- compatible: absent or a bool.
A "server" object represents a correct ServerDescription within the client's
current TopologyDescription. It has the following keys:
- type: A ServerType name, like "RSSecondary".
- setName: A string with the expected replica set name, or null.
- setVersion: absent or an integer.
- electionId: absent, null, or an ObjectId.
- logicalSessionTimeoutMinutes: absent, null, or an integer.
- minWireVersion: absent or an integer.
- maxWireVersion: absent or an integer.
- topologyVersion: absent, null, or a topologyVersion document.
- pool: (optional) A "pool" object.
A "pool" object represents a correct connection pool for a given server.
It has the following keys:
- generation: This server's expected pool generation, like ``0``.
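The distinction between "absent" and null matters when comparing an outcome
to the driver's state: absent keys are simply not checked. A minimal sketch of
that comparison, with the TopologyDescription modeled as a plain dict (a real
driver would adapt this to its own description types):

```python
def assert_outcome_matches(description, outcome):
    # Keys absent from the expected outcome are not checked at all; a key
    # whose expected value is null (None) must be None/unset in the driver.
    for key, expected in outcome.items():
        actual = description.get(key)
        assert actual == expected, "%s: %r != %r" % (key, actual, expected)


description = {"topologyType": "ReplicaSetNoPrimary", "setName": "rs",
               "maxSetVersion": 2}
# "maxSetVersion" is absent from the expected outcome, so it is not checked:
assert_outcome_matches(description, {"topologyType": "ReplicaSetNoPrimary",
                                     "setName": "rs"})
```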
In monitoring tests, an "outcome" contains a list of SDAM events that should
have been published by the client as a result of processing hello or legacy hello
responses in the current phase. Any SDAM events published by the client during its
construction (that is, prior to processing any of the responses) should be
combined with the events published during processing of hello or legacy hello
responses of the first phase of the test. A test MAY explicitly verify events
published during client construction by providing an empty responses array for the
first phase.
Use as unittests
----------------
Mocking
~~~~~~~
Drivers should be able to test their server discovery and monitoring logic
without any network I/O, by parsing hello (or legacy hello) responses and
application errors from the test file and passing them into the driver code.
Parts of the client and monitoring code may need to be mocked or subclassed
to achieve this.
`A reference implementation for PyMongo 3.10.1 is available here
<https://github.com/mongodb/mongo-python-driver/blob/3.10.1/test/test_discovery_and_monitoring.py>`_.
Initialization
~~~~~~~~~~~~~~
For each file, create a fresh client object initialized with the file's "uri".
All files in the "single" directory include a connection string with one host
and no "replicaSet" option.
Set the client's initial TopologyType to Single, however that is achieved using the client's API.
(The spec says "The user MUST be able to set the initial TopologyType to Single"
without specifying how.)
All files in the "sharded" directory include a connection string with multiple hosts
and no "replicaSet" option.
Set the client's initial TopologyType to Unknown or Sharded, depending on the client's API.
All files in the "rs" directory include a connection string with a "replicaSet" option.
Set the client's initial TopologyType to ReplicaSetNoPrimary.
(For most clients, parsing a connection string with a "replicaSet" option
automatically sets the TopologyType to ReplicaSetNoPrimary.)
Set up a listener to collect SDAM events published by the client, including
events published during client construction.
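The directory rules above can be condensed into a small helper. This is a
hypothetical sketch (the function name and ``prefers_sharded`` flag are
illustrative, not part of any driver API):

```python
def initial_topology_type(directory, prefers_sharded=True):
    """Pick the initial TopologyType for a test file from its directory."""
    if directory == "single":
        return "Single"
    if directory == "sharded":
        # Clients may start at Unknown or Sharded, depending on their API.
        return "Sharded" if prefers_sharded else "Unknown"
    if directory == "rs":
        # Parsing a "replicaSet" connection string option usually sets this.
        return "ReplicaSetNoPrimary"
    raise ValueError("unknown test directory: %r" % directory)
```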
Test Phases
~~~~~~~~~~~
For each phase in the file:

#. Parse the "responses" array. Pass in the responses in order to the driver
   code. If a response is the empty object ``{}``, simulate a network error.

#. Parse the "applicationErrors" array. For each element, simulate the given
   error as if it occurred while running an application operation. Note that
   it is sufficient to construct a mock error and call the procedure which
   updates the topology, e.g.
   ``topology.handleApplicationError(address, generation, maxWireVersion, error)``.
For non-monitoring tests,
once all responses are processed, assert that the phase's "outcome" object
is equivalent to the driver's current TopologyDescription.
For monitoring tests, once all responses are processed, assert that the
events collected so far by the SDAM event listener are equivalent to the
events specified in the phase.
Some fields such as "logicalSessionTimeoutMinutes", "compatible", and
"topologyVersion" were added later and haven't been added to all test files.
If these fields are present, test that they are equivalent to the fields of
the driver's current TopologyDescription or ServerDescription.
For monitoring tests, clear the list of events collected so far.
Continue until all phases have been executed.
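The phase loop above can be sketched as follows. ``FakeTopology`` stands in
for a driver's real topology type, and all method names here are illustrative
rather than a required API:

```python
class FakeTopology:
    """Minimal stand-in recording which update hooks were invoked."""

    def __init__(self, uri):
        self.uri = uri
        self.log = []

    def on_hello_response(self, address, reply):
        self.log.append(("response", address))

    def handle_network_error(self, address):
        self.log.append(("network", address))

    def handle_application_error(self, address, generation,
                                 max_wire_version, error):
        self.log.append(("app-error", address))


def run_phases(topology, phases, check_outcome):
    for phase in phases:
        # An empty reply document ({}) means a network error occurred.
        for address, reply in phase.get("responses", []):
            if reply:
                topology.on_hello_response(address, reply)
            else:
                topology.handle_network_error(address)
        for err in phase.get("applicationErrors", []):
            topology.handle_application_error(
                err["address"], err.get("generation"),
                err["maxWireVersion"], err)
        check_outcome(topology, phase["outcome"])


topology = FakeTopology("mongodb://a")
phases = [{
    "responses": [["a:27017", {"ok": 1, "isWritablePrimary": True}],
                  ["a:27017", {}]],
    "applicationErrors": [{"address": "a:27017", "maxWireVersion": 9,
                           "when": "afterHandshakeCompletes",
                           "type": "network"}],
    "outcome": {},
}]
run_phases(topology, phases, lambda topo, outcome: None)
```

A real runner would replace the final lambda with the outcome assertions
described above.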
Integration Tests
-----------------
Integration tests are provided in the "integration" directory.
Test Format
~~~~~~~~~~~
The same as the `Transactions Spec Test format
</source/transactions/tests/README.rst#test-format>`_ with the following
additions:

- The ``runOn`` requirement gains a new field:

  - ``authEnabled`` (optional): If True, skip this test if auth is not
    enabled. If False, skip this test if auth is enabled. If this field is
    omitted, this test can be run on clusters with or without auth.
Special Test Operations
~~~~~~~~~~~~~~~~~~~~~~~
Certain operations that appear in the "operations" array do not correspond to
API methods but instead represent special test operations. Such operations are
defined on the "testRunner" object and are documented in the
`Transactions Spec Test
</source/transactions/tests/README.rst#special-test-operations>`_.
Additional, SDAM test specific operations are documented here:
configureFailPoint
''''''''''''''''''
The "configureFailPoint" operation instructs the test runner to configure
the given server failpoint on the "admin" database. The runner MUST disable
this failpoint at the end of the test. For example::

    - name: configureFailPoint
      object: testRunner
      arguments:
        failPoint:
          configureFailPoint: failCommand
          mode: { times: 1 }
          data:
            failCommands: ["insert"]
            closeConnection: true

Tests that use the "configureFailPoint" operation do not include
``configureFailPoint`` commands in their command expectations. Drivers MUST
ensure that ``configureFailPoint`` commands do not appear in the list of logged
commands, either by manually filtering it from the list of observed commands or
by using a different MongoClient to execute ``configureFailPoint``.
Note that, similar to the ``tests.failPoint`` field described in the
`Transactions Spec Test format
</source/transactions/tests/README.rst#test-format>`_, tests with
``useMultipleMongoses: true`` will not contain a ``configureFailPoint``
operation.
wait
''''
The "wait" operation instructs the test runner to sleep for "ms"
milliseconds. For example::

    - name: wait
      object: testRunner
      arguments:
        ms: 1000

waitForEvent
''''''''''''
The "waitForEvent" operation instructs the test runner to wait until the test's
MongoClient has published a specific event a given number of times. For
example, the following instructs the test runner to wait for at least one
PoolClearedEvent to be published::

    - name: waitForEvent
      object: testRunner
      arguments:
        event: PoolClearedEvent
        count: 1

Note that "count" includes events that were published while running previous
operations.
If the "waitForEvent" operation is not satisfied after 10 seconds, the
operation is considered an error.
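One way to implement "waitForEvent" is to poll the collected events until the
count is reached or the 10-second limit expires. This sketch assumes a
hypothetical ``listener.events`` list of published event objects:

```python
import time


def wait_for_event(listener, event_name, count, timeout=10.0):
    """Poll until `event_name` has been published at least `count` times
    (events from earlier operations count too); fail after `timeout` s."""
    deadline = time.monotonic() + timeout
    while True:
        seen = sum(1 for e in listener.events
                   if type(e).__name__ == event_name)
        if seen >= count:
            return seen
        if time.monotonic() >= deadline:
            raise AssertionError(
                "timed out waiting for %d %s" % (count, event_name))
        time.sleep(0.01)


# Tiny demo with stand-in event/listener types:
class PoolClearedEvent:
    pass


class _Listener:
    def __init__(self):
        self.events = []


listener = _Listener()
listener.events.append(PoolClearedEvent())
seen = wait_for_event(listener, "PoolClearedEvent", 1, timeout=1.0)
```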
ServerMarkedUnknownEvent
````````````````````````
The ServerMarkedUnknownEvent may appear as an event in `waitForEvent`_ and
`assertEventCount`_. This event is defined as ServerDescriptionChangedEvent
where newDescription.type is ``Unknown``.
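Since ServerMarkedUnknownEvent is derived rather than published directly, a
test runner needs a predicate to recognize it. A sketch with stand-in event
types (a driver would use its real SDAM event classes):

```python
class ServerDescription:
    def __init__(self, server_type):
        self.server_type = server_type


class ServerDescriptionChangedEvent:
    def __init__(self, new_description):
        self.new_description = new_description


def is_server_marked_unknown(event):
    # A "ServerMarkedUnknownEvent" is a ServerDescriptionChangedEvent whose
    # new description has type Unknown.
    return (isinstance(event, ServerDescriptionChangedEvent)
            and event.new_description.server_type == "Unknown")
```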
assertEventCount
''''''''''''''''
The "assertEventCount" operation instructs the test runner to assert the test's
MongoClient has published a specific event a given number of times. For
example, the following instructs the test runner to assert that a single
PoolClearedEvent was published::

    - name: assertEventCount
      object: testRunner
      arguments:
        event: PoolClearedEvent
        count: 1

recordPrimary
'''''''''''''
The "recordPrimary" operation instructs the test runner to record the current
primary of the test's MongoClient. For example::

    - name: recordPrimary
      object: testRunner

runAdminCommand
'''''''''''''''
The "runAdminCommand" operation instructs the test runner to run the given
command on the admin database. Drivers MUST run this command on a different
MongoClient from the one used for test operations. For example::

    - name: runAdminCommand
      object: testRunner
      command_name: replSetFreeze
      arguments:
        command:
          replSetFreeze: 0
        readPreference:
          mode: Secondary

waitForPrimaryChange
''''''''''''''''''''
The "waitForPrimaryChange" operation instructs the test runner to wait up to
"timeoutMS" milliseconds for the MongoClient to discover a new primary server.
The new primary should be different from the one recorded by "recordPrimary".
For example::

    - name: waitForPrimaryChange
      object: testRunner
      arguments:
        timeoutMS: 15000

To implement this, drivers can subscribe to ServerDescriptionChangedEvents
and wait for an event where newDescription.type is ``RSPrimary`` and the
address is different from the one previously recorded by "recordPrimary".
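That subscription approach can be sketched as follows; the hook and attribute
names here are illustrative, not a real driver API:

```python
import threading


class PrimaryChangeWaiter:
    """Sketch of "waitForPrimaryChange" via a server-description listener."""

    def __init__(self, old_primary_address):
        self.old_primary = old_primary_address
        self.found = threading.Event()

    def on_server_description_changed(self, address, new_type):
        # A *new* primary: type RSPrimary at a different address.
        if new_type == "RSPrimary" and address != self.old_primary:
            self.found.set()

    def wait(self, timeout_ms):
        if not self.found.wait(timeout_ms / 1000.0):
            raise AssertionError("primary did not change in time")


waiter = PrimaryChangeWaiter("a:27017")
waiter.on_server_description_changed("a:27017", "RSPrimary")  # old: ignored
waiter.on_server_description_changed("b:27017", "RSPrimary")  # new primary
waiter.wait(timeout_ms=100)
```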
startThread
'''''''''''
The "startThread" operation instructs the test runner to start a new thread
with the provided "name". The `runOnThread`_ and `waitForThread`_ operations
reference a thread by its "name". For example::

    - name: startThread
      object: testRunner
      arguments:
        name: thread1

runOnThread
'''''''''''
The "runOnThread" operation instructs the test runner to schedule an operation
to be run on the given thread. runOnThread MUST NOT wait for the scheduled
operation to complete. For example::

    - name: runOnThread
      object: testRunner
      arguments:
        name: thread1
        operation:
          name: insertOne
          object: collection
          arguments:
            document:
              _id: 2
          error: true

waitForThread
'''''''''''''
The "waitForThread" operation instructs the test runner to stop the given
thread, wait for it to complete, and assert that the thread exited without
any errors. For example::

    - name: waitForThread
      object: testRunner
      arguments:
        name: thread1

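The three thread operations fit naturally onto a worker thread that drains a
queue of scheduled operations. A self-contained sketch (class and method
names are illustrative):

```python
import queue
import threading


class TestThread:
    """startThread creates one of these; runOnThread enqueues operations
    without waiting; waitForThread stops the worker and checks for errors."""

    def __init__(self, name):
        self.name = name
        self.error = None
        self._ops = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            op = self._ops.get()
            if op is None:  # stop sentinel enqueued by wait_for_thread()
                return
            try:
                op()
            except Exception as exc:
                if self.error is None:
                    self.error = exc

    def run_on_thread(self, op):
        self._ops.put(op)  # MUST NOT wait for the operation to complete

    def wait_for_thread(self):
        self._ops.put(None)
        self._thread.join()
        assert self.error is None, \
            "thread %s failed: %r" % (self.name, self.error)


results = []
t = TestThread("thread1")
t.run_on_thread(lambda: results.append(1))
t.wait_for_thread()
```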
Prose Tests
-----------
The following prose tests cannot be represented as spec tests and MUST be
implemented.
Streaming protocol Tests
~~~~~~~~~~~~~~~~~~~~~~~~
Drivers that implement the streaming protocol (multi-threaded or
asynchronous drivers) must implement the following tests. Each test should be
run against a standalone, replica set, and sharded cluster unless otherwise
noted.
Some of these cases should already be tested with the old protocol; in
that case just verify the test cases succeed with the new protocol.
1. Configure the client with heartbeatFrequencyMS set to 500,
   overriding the default of 10000. Assert the client processes
   hello and legacy hello replies more frequently (approximately every
   500ms).
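One way to assert the ~500ms frequency is to timestamp each heartbeat and
inspect the gaps between consecutive ones. This sketch assumes a hypothetical
heartbeat-succeeded hook; a driver would wire it to its real SDAM heartbeat
events:

```python
import time


class HeartbeatTimer:
    """Record a timestamp per heartbeat and compute inter-heartbeat gaps."""

    def __init__(self):
        self.times = []

    def on_heartbeat_succeeded(self):
        self.times.append(time.monotonic())

    def gaps(self):
        return [b - a for a, b in zip(self.times, self.times[1:])]


timer = HeartbeatTimer()
# Pretend timestamps for demonstration; a real test would let the client
# run for a while and then assert each gap is roughly 0.5 seconds.
timer.times = [0.0, 0.5, 1.0]
```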
RTT Tests
~~~~~~~~~
Run the following test(s) on MongoDB 4.4+.
1. Test that RTT is continuously updated.

#. Create a client with ``heartbeatFrequencyMS=500``,
   ``appName=streamingRttTest``, and subscribe to server events.

#. Run a find command to wait for the server to be discovered.

#. Sleep for 2 seconds. This must be long enough for multiple heartbeats
   to succeed.

#. Assert that each ``ServerDescriptionChangedEvent`` includes a non-zero
   RTT.
#. Configure the following failpoint to block hello or legacy hello commands
   for 500ms, which should add extra latency to each RTT check::

       db.adminCommand({
           configureFailPoint: "failCommand",
           mode: {times: 1000},
           data: {
               failCommands: ["hello"], // or the legacy hello command
               blockConnection: true,
               blockTimeMS: 500,
               appName: "streamingRttTest",
           },
       });
#. Wait for the server's RTT to exceed 250ms. Eventually the average RTT
   should also exceed 500ms, but we use 250ms to speed up the test. Note
   that the `Server Description Equality`_ rule means that
   ServerDescriptionChangedEvents will not be published. This test may
   need to use a driver-specific helper to obtain the latest RTT instead.
   If the RTT does not exceed 250ms after 10 seconds, consider the test
   failed.
#. Disable the failpoint::

       db.adminCommand({
           configureFailPoint: "failCommand",
           mode: "off",
       });

.. Section for links.
.. _Server Description Equality: /source/server-discovery-and-monitoring/server-discovery-and-monitoring.rst#server-description-equality