[](https://github.com/sergv/atomic-counter/actions/workflows/haskell-ci.yaml)
# Synopsis
Mutable cells that hold an integer value and can be safely modified from
multiple threads. Only a few operations are supported: read, write, +, -, and
bitwise and, or, xor and nand.
A good use case is a shared counter that multiple threads increment.
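As a minimal sketch of that use case, the snippet below has several threads bump one shared counter. It assumes the package exposes a module `Control.Concurrent.Counter` with `new`, `add` and `get`, and that `add` returns the previous value; check the package's Haddocks for the exact API. The helper name `countHits` is hypothetical.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM, replicateM_)
-- Assumed module and function names, see the note above.
import qualified Control.Concurrent.Counter as Counter

-- | Hypothetical helper: spawn @threads@ workers that each add 1 to a
-- shared counter @iters@ times, wait for all of them, and read the total.
countHits :: Int -> Int -> IO Int
countHits threads iters = do
  counter <- Counter.new 0
  dones <- replicateM threads $ do
    done <- newEmptyMVar
    _ <- forkIO $ do
      -- The value returned by 'add' is not needed here and is discarded.
      replicateM_ iters (Counter.add counter 1)
      putMVar done ()
    pure done
  -- Block until every worker has signalled completion.
  mapM_ takeMVar dones
  Counter.get counter

main :: IO ()
main = countHits 4 1000 >>= print
```

Because every increment is an atomic operation, the final value should equal `threads * iters` regardless of how the threads interleave.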
Operations translate to atomic CPU instructions, which involve a memory
barrier. For the limited set of operations this package provides, in a
concurrent setting they tend to be faster than atomic modify operations
on `IORef`s and other shared variable types. The value inside the `Counter`
type is unboxed, which contributes to outperforming all other shared
variables because they box the value they store. For integers boxing is
especially costly.
For the single-threaded use case this package will likely not outperform
vanilla `IORef`s or `STRef`s. It depends on whether GHC adds a memory
barrier for them or not, cf.
https://gitlab.haskell.org/ghc/ghc/-/issues/22764; if GHC does end up
adding the same memory barrier, then this package and `IORef`
should perform equally. All in all, the operations this package provides
always incur a memory barrier, which serves no purpose in
single-threaded settings, so for that use case this package is probably
not the best choice. The intended use of this package is in the concurrent
setting, where it does seem like a clear winner (please see the benchmarks
below).
Still, even in a single-threaded scenario the `Counter` can serve as an
efficient mutable integer cell that does not box the integer.
# Benchmark
### Summary
Depending on the number of threads, `Counter` from this package can be
up to 10 times faster than the next best option, `TVar` from the `stm` package.
### Details
The benchmark spawns N threads, each of which increments the
same counter by 1 for a number of iterations.
Test setup: Intel i5-9600K CPU @ 3.70GHz, 6 cores, no hyperthreading, GHC 9.6.1.
NB: `IORef inconsistent` is the benchmark that just reads and writes the
`IORef` with the fastest functions that `IORef` supports. It gives
wrong results when multiple threads access the `IORef` this way, since
these reads and writes are not synchronized. It’s included only for
speed comparison purposes. All the others properly synchronize
increments between the threads. The current package is shown as `Counter`.
```
$ cabal run bench -- -j1 --timeout 300 --stdev 2
All
Correctness: OK (13.97s)
+++ OK, passed 10000 tests.
Read/write contention with 10 iterations and 1 threads
Counter: OK (14.23s)
839 ns ± 33 ns
IORef inconsistent: OK (1.08s)
1.03 μs ± 36 ns, 1.23x
IORef atomic: OK (1.30s)
1.25 μs ± 40 ns, 1.48x
MVar: OK (10.53s)
1.24 μs ± 13 ns, 1.48x
TMVar: OK (0.65s)
1.18 μs ± 30 ns, 1.41x
TVar: OK (0.33s)
1.13 μs ± 45 ns, 1.35x
Addr: OK (3.37s)
818 ns ± 28 ns, 0.97x
Read/write contention with 100 iterations and 1 threads
Counter: OK (1.47s)
1.35 μs ± 24 ns
IORef inconsistent: OK (17.27s)
2.09 μs ± 47 ns, 1.55x
IORef atomic: OK (0.45s)
3.22 μs ± 122 ns, 2.38x
MVar: OK (3.15s)
2.97 μs ± 30 ns, 2.20x
TMVar: OK (1.24s)
4.65 μs ± 64 ns, 3.44x
TVar: OK (2.43s)
4.61 μs ± 158 ns, 3.41x
Addr: OK (1.16s)
1.09 μs ± 22 ns, 0.80x
Read/write contention with 1000 iterations and 1 threads
Counter: OK (0.47s)
6.48 μs ± 235 ns
IORef inconsistent: OK (0.78s)
2.76 μs ± 88 ns, 0.43x
IORef atomic: OK (0.60s)
17.5 μs ± 399 ns, 2.69x
MVar: OK (2.09s)
15.8 μs ± 208 ns, 2.43x
TMVar: OK (0.40s)
48.3 μs ± 1.9 μs, 7.46x
TVar: OK (0.55s)
32.4 μs ± 713 ns, 5.00x
Addr: OK (0.34s)
4.73 μs ± 189 ns, 0.73x
Read/write contention with 10000 iterations and 1 threads
Counter: OK (0.50s)
58.4 μs ± 2.1 μs
IORef inconsistent: OK (0.42s)
23.8 μs ± 829 ns, 0.41x
IORef atomic: OK (0.73s)
168 μs ± 3.8 μs, 2.87x
MVar: OK (0.66s)
157 μs ± 4.6 μs, 2.69x
TMVar: OK (2.09s)
512 μs ± 7.2 μs, 8.78x
TVar: OK (1.39s)
321 μs ± 6.7 μs, 5.50x
Addr: OK (47.12s)
43.8 μs ± 562 ns, 0.75x
Read/write contention with 10 iterations and 2 threads
Counter: OK (1.10s)
7.85 μs ± 144 ns
IORef inconsistent: OK (1.93s)
7.65 μs ± 201 ns, 0.97x
IORef atomic: OK (0.54s)
8.16 μs ± 235 ns, 1.04x
MVar: OK (7.81s)
7.81 μs ± 72 ns, 0.99x
TMVar: OK (0.49s)
7.98 μs ± 292 ns, 1.02x
TVar: OK (1.91s)
7.98 μs ± 181 ns, 1.02x
Addr: OK (2.01s)
7.64 μs ± 213 ns, 0.97x
Read/write contention with 100 iterations and 2 threads
Counter: OK (1.00s)
8.50 μs ± 290 ns
IORef inconsistent: OK (8.00s)
8.13 μs ± 133 ns, 0.96x
IORef atomic: OK (2.80s)
13.3 μs ± 358 ns, 1.56x
MVar: OK (4.64s)
19.2 μs ± 473 ns, 2.25x
TMVar: OK (5.52s)
32.3 μs ± 1.1 μs, 3.80x
TVar: OK (83.27s)
30.5 μs ± 279 ns, 3.59x
Addr: OK (4.38s)
8.71 μs ± 333 ns, 1.03x
Read/write contention with 1000 iterations and 2 threads
Counter: OK (28.57s)
43.7 μs ± 106 ns
IORef inconsistent: OK (0.91s)
17.2 μs ± 669 ns, 0.39x
IORef atomic: OK (130.48s)
223 μs ± 10 μs, 5.09x
MVar: OK (1.19s)
5.11 ms ± 173 μs, 116.88x
TMVar: OK (0.33s)
295 μs ± 7.7 μs, 6.75x
TVar: OK (1.44s)
337 μs ± 10 μs, 7.71x
Addr: OK (0.47s)
49.3 μs ± 915 ns, 1.13x
Read/write contention with 10000 iterations and 2 threads
Counter: OK (0.51s)
466 μs ± 17 μs
IORef inconsistent: OK (13.49s)
190 μs ± 1.2 μs, 0.41x
IORef atomic: OK (71.77s)
1.80 ms ± 33 μs, 3.86x
MVar: OK (7.23s)
60.1 ms ± 505 μs, 128.83x
TMVar: OK (1.50s)
2.39 ms ± 39 μs, 5.13x
TVar: OK (108.24s)
3.13 ms ± 405 μs, 6.70x
Addr: OK (0.29s)
531 μs ± 20 μs, 1.14x
Read/write contention with 10 iterations and 4 threads
Counter: OK (2.33s)
15.5 μs ± 384 ns
IORef inconsistent: OK (2.41s)
15.6 μs ± 52 ns, 1.01x
IORef atomic: OK (1.56s)
19.5 μs ± 487 ns, 1.26x
MVar: OK (1.51s)
28.9 μs ± 647 ns, 1.87x
TMVar: OK (2.88s)
19.9 μs ± 439 ns, 1.29x
TVar: OK (0.38s)
18.2 μs ± 332 ns, 1.18x
Addr: OK (0.59s)
15.3 μs ± 477 ns, 0.99x
Read/write contention with 100 iterations and 4 threads
Counter: OK (1.63s)
25.3 μs ± 723 ns
IORef inconsistent: OK (0.35s)
17.6 μs ± 451 ns, 0.69x
IORef atomic: OK (0.46s)
154 μs ± 2.1 μs, 6.08x
MVar: OK (0.62s)
1.20 ms ± 38 μs, 47.27x
TMVar: OK (0.12s)
191 μs ± 7.4 μs, 7.54x
TVar: OK (0.14s)
221 μs ± 6.6 μs, 8.71x
Addr: OK (0.75s)
21.0 μs ± 228 ns, 0.83x
Read/write contention with 1000 iterations and 4 threads
Counter: OK (0.17s)
242 μs ± 6.7 μs
IORef inconsistent: OK (0.15s)
103 μs ± 2.7 μs, 0.43x
IORef atomic: OK (2.28s)
1.71 ms ± 36 μs, 7.07x
MVar: OK (0.41s)
12.9 ms ± 338 μs, 53.16x
TMVar: OK (0.29s)
2.08 ms ± 65 μs, 8.61x
TVar: OK (0.17s)
2.23 ms ± 73 μs, 9.21x
Addr: OK (0.58s)
234 μs ± 1.9 μs, 0.97x
Read/write contention with 10000 iterations and 4 threads
Counter: OK (0.17s)
2.24 ms ± 60 μs
IORef inconsistent: OK (9.34s)
1.00 ms ± 31 μs, 0.45x
IORef atomic: OK (70.87s)
22.0 ms ± 820 μs, 9.80x
MVar: OK (0.74s)
126 ms ± 1.9 ms, 56.07x
TMVar: OK (0.33s)
19.0 ms ± 670 μs, 8.48x
TVar: OK (0.09s)
20.8 ms ± 732 μs, 9.27x
Addr: OK (0.18s)
2.45 ms ± 97 μs, 1.09x
Read/write contention with 10 iterations and 6 threads
Counter: OK (6.77s)
23.9 μs ± 813 ns
IORef inconsistent: OK (0.42s)
23.4 μs ± 712 ns, 0.98x
IORef atomic: OK (0.70s)
39.4 μs ± 747 ns, 1.65x
MVar: OK (1.32s)
50.0 μs ± 1.0 μs, 2.09x
TMVar: OK (0.52s)
35.3 μs ± 449 ns, 1.47x
TVar: OK (3.93s)
32.6 μs ± 93 ns, 1.37x
Addr: OK (0.42s)
23.4 μs ± 381 ns, 0.98x
Read/write contention with 100 iterations and 6 threads
Counter: OK (1.20s)
47.8 μs ± 866 ns
IORef inconsistent: OK (0.24s)
26.9 μs ± 983 ns, 0.56x
IORef atomic: OK (1.46s)
483 μs ± 17 μs, 10.10x
MVar: OK (2.01s)
1.86 ms ± 13 μs, 38.97x
TMVar: OK (0.49s)
519 μs ± 4.5 μs, 10.86x
TVar: OK (14.78s)
521 μs ± 14 μs, 10.91x
Addr: OK (0.56s)
43.1 μs ± 1.6 μs, 0.90x
Read/write contention with 1000 iterations and 6 threads
Counter: OK (13.77s)
459 μs ± 8.8 μs
IORef inconsistent: OK (0.23s)
207 μs ± 4.1 μs, 0.45x
IORef atomic: OK (2.10s)
3.48 ms ± 70 μs, 7.58x
MVar: OK (0.64s)
19.3 ms ± 555 μs, 42.03x
TMVar: OK (1.07s)
5.55 ms ± 161 μs, 12.08x
TVar: OK (0.27s)
4.85 ms ± 189 μs, 10.56x
Addr: OK (27.53s)
461 μs ± 4.8 μs, 1.00x
Read/write contention with 10000 iterations and 6 threads
Counter: OK (0.52s)
4.88 ms ± 142 μs
IORef inconsistent: OK (0.25s)
2.25 ms ± 24 μs, 0.46x
IORef atomic: OK (66.59s)
75.5 ms ± 11 ms, 15.46x
MVar: OK (2.85s)
193 ms ± 4.2 ms, 39.54x
TMVar: OK (0.16s)
54.9 ms ± 1.6 ms, 11.25x
TVar: OK (10.04s)
47.1 ms ± 339 μs, 9.64x
Addr: OK (0.52s)
4.94 ms ± 89 μs, 1.01x
Read/write contention with 10 iterations and 8 threads
Counter: OK (1.84s)
27.3 μs ± 1.0 μs
IORef inconsistent: OK (1.98s)
27.5 μs ± 1.0 μs, 1.01x
IORef atomic: OK (0.91s)
56.1 μs ± 787 ns, 2.06x
MVar: OK (0.69s)
94.9 μs ± 3.2 μs, 3.48x
TMVar: OK (1.35s)
56.1 μs ± 1.5 μs, 2.06x
TVar: OK (0.36s)
51.7 μs ± 1.5 μs, 1.89x
Addr: OK (0.24s)
26.3 μs ± 672 ns, 0.96x
Read/write contention with 100 iterations and 8 threads
Counter: OK (1.41s)
63.1 μs ± 1.2 μs
IORef inconsistent: OK (8.38s)
33.3 μs ± 503 ns, 0.53x
IORef atomic: OK (0.59s)
709 μs ± 21 μs, 11.25x
MVar: OK (1.21s)
2.31 ms ± 49 μs, 36.65x
TMVar: OK (4.67s)
667 μs ± 18 μs, 10.58x
TVar: OK (4.81s)
690 μs ± 6.5 μs, 10.94x
Addr: OK (0.69s)
59.6 μs ± 1.7 μs, 0.94x
Read/write contention with 1000 iterations and 8 threads
Counter: OK (0.16s)
583 μs ± 13 μs
IORef inconsistent: OK (1.13s)
304 μs ± 3.5 μs, 0.52x
IORef atomic: OK (93.87s)
7.12 ms ± 158 μs, 12.22x
MVar: OK (1.54s)
24.4 ms ± 541 μs, 41.83x
TMVar: OK (0.09s)
6.32 ms ± 241 μs, 10.84x
TVar: OK (0.67s)
6.36 ms ± 221 μs, 10.92x
Addr: OK (0.56s)
540 μs ± 20 μs, 0.93x
Read/write contention with 10000 iterations and 8 threads
Counter: OK (5.13s)
5.67 ms ± 68 μs
IORef inconsistent: OK (0.16s)
3.14 ms ± 69 μs, 0.55x
IORef atomic: OK (35.04s)
148 ms ± 319 μs, 26.14x
MVar: OK (1.58s)
230 ms ± 8.7 ms, 40.61x
TMVar: OK (26.99s)
65.9 ms ± 4.1 ms, 11.62x
TVar: OK (3.41s)
66.6 ms ± 779 μs, 11.75x
Addr: OK (0.16s)
5.84 ms ± 164 μs, 1.03x
Read/write contention with 10 iterations and 12 threads
Counter: OK (0.54s)
30.5 μs ± 452 ns
IORef inconsistent: OK (0.29s)
32.4 μs ± 820 ns, 1.06x
IORef atomic: OK (7.93s)
70.3 μs ± 436 ns, 2.30x
MVar: OK (0.44s)
232 μs ± 8.7 μs, 7.59x
TMVar: OK (13.74s)
80.4 μs ± 2.3 μs, 2.64x
TVar: OK (1.73s)
79.2 μs ± 2.7 μs, 2.60x
Addr: OK (0.26s)
29.6 μs ± 1.0 μs, 0.97x
Read/write contention with 100 iterations and 12 threads
Counter: OK (0.49s)
98.0 μs ± 1.1 μs
IORef inconsistent: OK (0.66s)
48.2 μs ± 561 ns, 0.49x
IORef atomic: OK (0.88s)
1.07 ms ± 39 μs, 10.89x
MVar: OK (0.86s)
3.18 ms ± 61 μs, 32.48x
TMVar: OK (0.21s)
1.00 ms ± 21 μs, 10.19x
TVar: OK (0.12s)
1.06 ms ± 29 μs, 10.77x
Addr: OK (0.13s)
94.0 μs ± 2.7 μs, 0.96x
Read/write contention with 1000 iterations and 12 threads
Counter: OK (6.78s)
955 μs ± 5.9 μs
IORef inconsistent: OK (0.22s)
466 μs ± 16 μs, 0.49x
IORef atomic: OK (2.46s)
11.1 ms ± 109 μs, 11.62x
MVar: OK (1.14s)
34.3 ms ± 923 μs, 35.98x
TMVar: OK (8.60s)
10.7 ms ± 226 μs, 11.22x
TVar: OK (4.18s)
9.82 ms ± 42 μs, 10.29x
Addr: OK (0.42s)
1.00 ms ± 40 μs, 1.05x
Read/write contention with 10000 iterations and 12 threads
Counter: OK (1.99s)
9.45 ms ± 116 μs
IORef inconsistent: OK (7.78s)
4.74 ms ± 7.4 μs, 0.50x
IORef atomic: OK (82.99s)
304 ms ± 66 ms, 32.15x
MVar: OK (11.29s)
343 ms ± 8.3 ms, 36.34x
TMVar: OK (0.33s)
116 ms ± 1.9 ms, 12.23x
TVar: OK (10.63s)
105 ms ± 1.5 ms, 11.15x
Addr: OK (3.97s)
9.47 ms ± 361 μs, 1.00x
Read/write contention with 10 iterations and 16 threads
Counter: OK (0.20s)
42.1 μs ± 1.3 μs
IORef inconsistent: OK (0.18s)
40.5 μs ± 1.4 μs, 0.96x
IORef atomic: OK (1.08s)
153 μs ± 662 ns, 3.63x
MVar: OK (1.56s)
408 μs ± 9.1 μs, 9.69x
TMVar: OK (2.43s)
128 μs ± 1.9 μs, 3.04x
TVar: OK (0.16s)
135 μs ± 5.0 μs, 3.19x
Addr: OK (0.32s)
40.6 μs ± 940 ns, 0.96x
Read/write contention with 100 iterations and 16 threads
Counter: OK (0.32s)
131 μs ± 2.1 μs
IORef inconsistent: OK (1.68s)
75.4 μs ± 1.6 μs, 0.58x
IORef atomic: OK (4.05s)
1.27 ms ± 45 μs, 9.70x
MVar: OK (4.53s)
4.39 ms ± 85 μs, 33.50x
TMVar: OK (0.16s)
1.48 ms ± 40 μs, 11.30x
TVar: OK (1.20s)
1.45 ms ± 5.5 μs, 11.06x
Addr: OK (4.85s)
131 μs ± 1.3 μs, 1.00x
Read/write contention with 1000 iterations and 16 threads
Counter: OK (0.15s)
1.05 ms ± 32 μs
IORef inconsistent: OK (0.28s)
632 μs ± 12 μs, 0.60x
IORef atomic: OK (53.85s)
16.3 ms ± 1.5 ms, 15.58x
MVar: OK (2.98s)
46.6 ms ± 672 μs, 44.57x
TMVar: OK (0.09s)
15.2 ms ± 472 μs, 14.54x
TVar: OK (0.36s)
14.6 ms ± 354 μs, 13.96x
Addr: OK (0.15s)
1.38 ms ± 47 μs, 1.32x
Read/write contention with 10000 iterations and 16 threads
Counter: OK (0.17s)
13.3 ms ± 301 μs
IORef inconsistent: OK (0.16s)
5.73 ms ± 119 μs, 0.43x
IORef atomic: OK (117.97s)
464 ms ± 38 ms, 34.81x
MVar: OK (14.95s)
481 ms ± 4.5 ms, 36.11x
TMVar: OK (1.80s)
155 ms ± 2.7 ms, 11.65x
TVar: OK (7.20s)
146 ms ± 663 μs, 10.93x
Addr: OK (5.34s)
12.8 ms ± 187 μs, 0.96x
Read/write contention with 10 iterations and 20 threads
Counter: OK (0.19s)
49.3 μs ± 1.9 μs
IORef inconsistent: OK (0.20s)
48.2 μs ± 1.7 μs, 0.98x
IORef atomic: OK (5.11s)
188 μs ± 5.2 μs, 3.82x
MVar: OK (4.08s)
536 μs ± 5.4 μs, 10.86x
TMVar: OK (1.44s)
159 μs ± 531 ns, 3.23x
TVar: OK (0.37s)
176 μs ± 3.0 μs, 3.57x
Addr: OK (0.82s)
50.6 μs ± 1.0 μs, 1.03x
Read/write contention with 100 iterations and 20 threads
Counter: OK (0.10s)
171 μs ± 5.7 μs
IORef inconsistent: OK (0.25s)
90.0 μs ± 2.1 μs, 0.52x
IORef atomic: OK (9.07s)
1.52 ms ± 59 μs, 8.87x
MVar: OK (2.90s)
5.61 ms ± 159 μs, 32.73x
TMVar: OK (1.57s)
1.85 ms ± 54 μs, 10.77x
TVar: OK (0.19s)
1.88 ms ± 28 μs, 10.98x
Addr: OK (0.18s)
169 μs ± 3.8 μs, 0.98x
Read/write contention with 1000 iterations and 20 threads
Counter: OK (0.74s)
1.68 ms ± 65 μs
IORef inconsistent: OK (0.18s)
881 μs ± 24 μs, 0.52x
IORef atomic: OK (73.70s)
21.3 ms ± 600 μs, 12.66x
MVar: OK (0.94s)
61.0 ms ± 1.4 ms, 36.31x
TMVar: OK (7.19s)
18.1 ms ± 472 μs, 10.75x
TVar: OK (7.26s)
18.1 ms ± 494 μs, 10.79x
Addr: OK (0.09s)
1.74 ms ± 43 μs, 1.04x
Read/write contention with 10000 iterations and 20 threads
Counter: OK (6.72s)
16.2 ms ± 174 μs
IORef inconsistent: OK (6.36s)
7.73 ms ± 247 μs, 0.48x
IORef atomic: OK (21.41s)
641 ms ± 264 μs, 39.55x
MVar: OK (38.82s)
609 ms ± 21 ms, 37.53x
TMVar: OK (4.56s)
192 ms ± 648 μs, 11.82x
TVar: OK (1.13s)
193 ms ± 7.5 ms, 11.91x
Addr: OK (0.43s)
16.8 ms ± 581 μs, 1.03x
Read/write contention with 10 iterations and 32 threads
Counter: OK (1.01s)
73.7 μs ± 1.3 μs
IORef inconsistent: OK (0.51s)
72.0 μs ± 777 ns, 0.98x
IORef atomic: OK (0.24s)
178 μs ± 3.8 μs, 2.41x
MVar: OK (0.91s)
919 μs ± 17 μs, 12.47x
TMVar: OK (2.27s)
277 μs ± 4.4 μs, 3.76x
TVar: OK (0.15s)
299 μs ± 10 μs, 4.06x
Addr: OK (0.50s)
71.8 μs ± 2.6 μs, 0.97x
Read/write contention with 100 iterations and 32 threads
Counter: OK (0.29s)
280 μs ± 6.4 μs
IORef inconsistent: OK (1.44s)
150 μs ± 3.2 μs, 0.53x
IORef atomic: OK (0.25s)
1.74 ms ± 56 μs, 6.22x
MVar: OK (1.19s)
9.06 ms ± 269 μs, 32.31x
TMVar: OK (4.80s)
2.98 ms ± 36 μs, 10.64x
TVar: OK (19.94s)
3.10 ms ± 27 μs, 11.04x
Addr: OK (0.55s)
272 μs ± 9.4 μs, 0.97x
Read/write contention with 1000 iterations and 32 threads
Counter: OK (2.27s)
2.69 ms ± 48 μs
IORef inconsistent: OK (2.18s)
1.27 ms ± 23 μs, 0.47x
IORef atomic: OK (62.21s)
35.2 ms ± 4.3 ms, 13.08x
MVar: OK (5.90s)
89.9 ms ± 233 μs, 33.44x
TMVar: OK (2.96s)
30.7 ms ± 470 μs, 11.40x
TVar: OK (5.98s)
30.4 ms ± 124 μs, 11.30x
Addr: OK (0.29s)
2.75 ms ± 93 μs, 1.02x
Read/write contention with 10000 iterations and 32 threads
Counter: OK (0.08s)
28.8 ms ± 1.0 ms
IORef inconsistent: OK (0.32s)
12.9 ms ± 351 μs, 0.45x
IORef atomic: OK (82.13s)
1.180 s ± 65 ms, 41.02x
MVar: OK (128.00s)
982 ms ± 14 ms, 34.15x
TMVar: OK (0.17s)
299 ms ± 7.6 ms, 10.41x
TVar: OK (7.51s)
316 ms ± 12 ms, 11.00x
Addr: OK (1.36s)
27.0 ms ± 871 μs, 0.94x
Read/write contention with 10 iterations and 64 threads
Counter: OK (0.84s)
132 μs ± 3.3 μs
IORef inconsistent: OK (0.44s)
132 μs ± 2.4 μs, 1.00x
IORef atomic: OK (0.18s)
323 μs ± 6.5 μs, 2.45x
MVar: OK (0.47s)
1.89 ms ± 74 μs, 14.33x
TMVar: OK (4.18s)
557 μs ± 1.7 μs, 4.22x
TVar: OK (2.19s)
575 μs ± 1.6 μs, 4.36x
Addr: OK (0.19s)
125 μs ± 2.8 μs, 0.95x
Read/write contention with 100 iterations and 64 threads
Counter: OK (0.28s)
568 μs ± 13 μs
IORef inconsistent: OK (0.32s)
284 μs ± 2.9 μs, 0.50x
IORef atomic: OK (0.21s)
2.62 ms ± 95 μs, 4.62x
MVar: OK (2.37s)
18.1 ms ± 688 μs, 31.81x
TMVar: OK (0.15s)
6.24 ms ± 173 μs, 10.98x
TVar: OK (5.00s)
6.30 ms ± 162 μs, 11.08x
Addr: OK (2.06s)
555 μs ± 7.4 μs, 0.98x
Read/write contention with 1000 iterations and 64 threads
Counter: OK (0.59s)
5.80 ms ± 111 μs
IORef inconsistent: OK (2.10s)
2.49 ms ± 96 μs, 0.43x
IORef atomic: OK (2.36s)
75.8 ms ± 396 μs, 13.06x
MVar: OK (2.88s)
188 ms ± 4.8 ms, 32.41x
TMVar: OK (5.95s)
61.7 ms ± 176 μs, 10.63x
TVar: OK (1.52s)
63.8 ms ± 2.4 ms, 11.00x
Addr: OK (0.28s)
5.52 ms ± 167 μs, 0.95x
Read/write contention with 10000 iterations and 64 threads
Counter: OK (2.74s)
56.6 ms ± 2.2 ms
IORef inconsistent: OK (10.26s)
26.0 ms ± 746 μs, 0.46x
IORef atomic: OK (100.70s)
3.018 s ± 328 ms, 53.35x
MVar: OK (30.56s)
1.994 s ± 16 ms, 35.25x
TMVar: OK (0.36s)
665 ms ± 3.4 ms, 11.76x
TVar: OK (1.79s)
653 ms ± 2.4 ms, 11.55x
Addr: OK (11.06s)
57.0 ms ± 546 μs, 1.01x
Read/write contention with 10 iterations and 128 threads
Counter: OK (0.67s)
240 μs ± 5.9 μs
IORef inconsistent: OK (1.43s)
247 μs ± 8.5 μs, 1.03x
IORef atomic: OK (1.38s)
642 μs ± 6.2 μs, 2.68x
MVar: OK (1.92s)
3.83 ms ± 34 μs, 15.94x
TMVar: OK (16.12s)
1.11 ms ± 23 μs, 4.63x
TVar: OK (4.27s)
1.15 ms ± 36 μs, 4.78x
Addr: OK (0.16s)
230 μs ± 6.9 μs, 0.96x
Read/write contention with 100 iterations and 128 threads
Counter: OK (0.26s)
1.15 ms ± 42 μs
IORef inconsistent: OK (0.31s)
560 μs ± 18 μs, 0.49x
IORef atomic: OK (0.31s)
7.80 ms ± 274 μs, 6.81x
MVar: OK (0.55s)
36.2 ms ± 1.2 ms, 31.54x
TMVar: OK (19.40s)
12.4 ms ± 20 μs, 10.78x
TVar: OK (0.62s)
13.1 ms ± 411 μs, 11.39x
Addr: OK (0.13s)
1.12 ms ± 24 μs, 0.98x
Read/write contention with 1000 iterations and 128 threads
Counter: OK (0.15s)
8.93 ms ± 264 μs
IORef inconsistent: OK (0.52s)
5.28 ms ± 104 μs, 0.59x
IORef atomic: OK (80.71s)
310 ms ± 57 ms, 34.68x
MVar: OK (5.86s)
378 ms ± 3.1 ms, 42.38x
TMVar: OK (1.51s)
131 ms ± 3.5 ms, 14.62x
TVar: OK (3.07s)
129 ms ± 981 μs, 14.44x
Addr: OK (4.47s)
11.3 ms ± 139 μs, 1.27x
Read/write contention with 10000 iterations and 128 threads
Counter: OK (1.37s)
112 ms ± 3.3 ms
IORef inconsistent: OK (5.13s)
53.3 ms ± 571 μs, 0.47x
IORef atomic: OK (11.14s)
6.952 s ± 54 ms, 61.84x
MVar: OK (12.70s)
4.075 s ± 154 ms, 36.25x
TMVar: OK (3.51s)
1.268 s ± 5.7 ms, 11.27x
TVar: OK (0.71s)
1.306 s ± 50 ms, 11.62x
Addr: OK (11.18s)
117 ms ± 1.7 ms, 1.04x
All 309 tests passed (2397.33s)
```
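The difference between the `IORef inconsistent` and `IORef atomic` rows can be illustrated with a small sketch that uses only `base`. This is not the package's benchmark code; the names `contend`, `incInconsistent` and `incAtomic` are hypothetical, and the thread and iteration counts are arbitrary.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM, replicateM_)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef, writeIORef)

-- | Spawn @threads@ workers that each run @step@ on a shared 'IORef'
-- @iters@ times, wait for all of them, and return the final value.
contend :: (IORef Int -> IO ()) -> Int -> Int -> IO Int
contend step threads iters = do
  ref <- newIORef 0
  dones <- replicateM threads $ do
    done <- newEmptyMVar
    _ <- forkIO $ do
      replicateM_ iters (step ref)
      putMVar done ()
    pure done
  mapM_ takeMVar dones
  readIORef ref

-- | Plain read followed by a plain write. Another thread may write in
-- between, so increments can be lost.
incInconsistent :: IORef Int -> IO ()
incInconsistent ref = do
  n <- readIORef ref
  writeIORef ref $! n + 1

-- | Atomic read-modify-write: every increment is applied.
incAtomic :: IORef Int -> IO ()
incAtomic ref = atomicModifyIORef' ref (\n -> (n + 1, ()))

main :: IO ()
main = do
  racy <- contend incInconsistent 4 10000
  safe <- contend incAtomic 4 10000
  putStrLn ("IORef inconsistent: " ++ show racy)
  putStrLn ("IORef atomic:       " ++ show safe)
```

The atomic variant always ends at `threads * iters`. The inconsistent variant may end lower, most visibly when the program is built with `-threaded` and run on several capabilities (`+RTS -N`).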
# Memory overhead
In pure Haskell (i.e. with the `no-cmm` flag enabled) each unlifted value of
type `Counter` is a singleton mutable array from GHC primitives under
the hood. Thus it occupies at least `platform integer size` + `array
size` + `header` bytes, which should typically be at least 3 machine
words. Lifted values may occupy more depending on optimizations.
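To make the layout concrete, here is a rough sketch of a counter built on a one-element mutable byte array with GHC primops. It is an illustration only and not the package's source: the type `ArrayCounter` and its functions are hypothetical, and the use of `MutableByteArray#` specifically is an assumption.

```haskell
{-# LANGUAGE MagicHash #-}
{-# LANGUAGE UnboxedTuples #-}

import Foreign.Storable (sizeOf)
import GHC.Exts
  ( Int (I#), MutableByteArray#, RealWorld
  , atomicReadIntArray#, fetchAddIntArray#, newByteArray#, writeIntArray# )
import GHC.IO (IO (IO))

-- | Hypothetical stand-in: a byte array holding exactly one machine
-- integer. The array itself is a header word, a size word and one
-- payload word; this lifted wrapper adds its own overhead on top.
data ArrayCounter = ArrayCounter (MutableByteArray# RealWorld)

-- | Allocate a one-integer array and store the initial value in it.
newArrayCounter :: Int -> IO ArrayCounter
newArrayCounter (I# initial) =
  case sizeOf (0 :: Int) of
    I# bytes -> IO $ \s0 ->
      case newByteArray# bytes s0 of
        (# s1, arr #) ->
          case writeIntArray# arr 0# initial s1 of
            s2 -> (# s2, ArrayCounter arr #)

-- | Atomically add @delta@ and return the value held before the addition.
addArrayCounter :: ArrayCounter -> Int -> IO Int
addArrayCounter (ArrayCounter arr) (I# delta) = IO $ \s0 ->
  case fetchAddIntArray# arr 0# delta s0 of
    (# s1, old #) -> (# s1, I# old #)

-- | Atomically read the current value.
getArrayCounter :: ArrayCounter -> IO Int
getArrayCounter (ArrayCounter arr) = IO $ \s0 ->
  case atomicReadIntArray# arr 0# s0 of
    (# s1, n #) -> (# s1, I# n #)

main :: IO ()
main = do
  c <- newArrayCounter 10
  old <- addArrayCounter c 5
  now <- getArrayCounter c
  print (old, now) -- prints (10,15)
```

The size word is the "trivial size" that an array of fixed length one still has to store.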
By default CMM is enabled, which should save one word of overhead
because there is no array any more and hence no need to store
the trivial size.