Scalability Benchmark

N producer threads push a 4-byte integer into one same queue, N consumer threads pop the integers from the queue. All producers posts 1,000,000 messages in total. Total time to send and receive all the messages is measured. The benchmark is run for from 1 producer and 1 consumer up to (total-number-of-cpus / 2) producers/consumers to measure the scalabilty of different queues.

Latency Benchmark

One thread posts a 4-byte integer to another thread through one queue and waits for a reply from another queue (2 queues in total). The benchmark measures the total time of 100,000 ping-pongs, best of 10 runs. Contention is minimal here (1-producer-1-consumer, 1 element in the queue) to be able to achieve and measure the lowest latency. Reports the average round-trip time.

Systems details

Intel i9-9900KS system

Intel Xeon Gold 6132 system

AMD Ryzen 9 5950X system

Source Code

github.com/max0x7ba/atomic_queue