Throughput Benchmarks

These benchmarks measure message publishing throughput under various load conditions.

Metrics

Metric	Description	Unit
`messages_per_second`	Successfully published messages per second	msg/s
`latency_p50`	Median publishing time per message	ms
`latency_p95`	95th percentile latency	ms
`latency_p99`	99th percentile latency (tail)	ms
`errors`	Publishing error count	count
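As an illustration of how these metrics relate to raw measurements, the sketch below aggregates a list of per-publish latencies and a wall-clock duration into the fields from the table. The names `ThroughputSummary`, `percentile`, and `summarize` are hypothetical, and the nearest-rank percentile method is an assumption; the benchmark suite may compute percentiles differently.

```python
import math
from dataclasses import dataclass


@dataclass
class ThroughputSummary:
    messages_per_second: float  # msg/s
    latency_p50: float          # ms
    latency_p95: float          # ms
    latency_p99: float          # ms
    errors: int                 # count


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(latencies_ms, duration_s, errors=0):
    """Aggregate latencies (ms) of successful publishes over duration_s seconds."""
    ordered = sorted(latencies_ms)
    return ThroughputSummary(
        messages_per_second=len(ordered) / duration_s,
        latency_p50=percentile(ordered, 50),
        latency_p95=percentile(ordered, 95),
        latency_p99=percentile(ordered, 99),
        errors=errors,
    )


# Synthetic input: 100 publishes with latencies 1.0 .. 100.0 ms in 0.5 s.
summary = summarize([float(i) for i in range(1, 101)], duration_s=0.5)
print(summary)
```

Only successful publishes contribute a latency sample here, which matches the definition of `messages_per_second` in the table; failed publishes are counted separately in `errors`.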

Tests

test_publish_throughput

Basic throughput test. Each thread publishes msg_per_thread messages.

@pytest.mark.parametrize("n_threads,msg_per_thread", [
    (10, 100),   # Quick smoke test: 1K messages
    (100, 100),  # Base case: 10K messages
])
def test_publish_throughput(...)

Methodology:

Create n_threads threads.
Each thread acquires a channel from the pool.
All threads wait on a shared Barrier.
Once the barrier releases, all threads start publishing simultaneously.
The duration of each publish call is measured.

Key aspects:

Message size: 100 bytes
Messages published without confirmation (fire-and-forget)
Channel returned to pool after completion
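The methodology above can be sketched with the standard library alone. This is a minimal illustration, not the actual test body: `run_publish_benchmark` is a hypothetical helper, the channel pool is replaced by a plain `publish` callable, and the barrier is assumed to be `threading.Barrier`.

```python
import threading
import time


def run_publish_benchmark(publish, n_threads, msg_per_thread, payload=b"x" * 100):
    """Call publish(payload) from n_threads threads that start together.

    Returns (latencies_ms, duration_s, errors).
    """
    # One extra party so the main thread can timestamp the common start.
    barrier = threading.Barrier(n_threads + 1)
    latencies_ms = []
    errors = 0
    lock = threading.Lock()

    def worker():
        nonlocal errors
        local, failed = [], 0
        barrier.wait()  # block until every thread is ready
        for _ in range(msg_per_thread):
            start = time.perf_counter()
            try:
                publish(payload)  # fire-and-forget: no confirmation awaited
                local.append((time.perf_counter() - start) * 1000)
            except Exception:
                failed += 1
        with lock:  # merge once per thread, outside the timed loop
            latencies_ms.extend(local)
            errors += failed

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    barrier.wait()
    started = time.perf_counter()  # approximate common start time
    for t in threads:
        t.join()
    duration_s = time.perf_counter() - started
    return latencies_ms, duration_s, errors


# Stand-in publisher: list.append instead of a real AMQP channel.
sent = []
latencies, duration, errors = run_publish_benchmark(sent.append, n_threads=10, msg_per_thread=100)
print(len(latencies), errors)
```

Each worker collects its timings in a local list and merges them once at the end, so the shared lock does not distort the per-publish measurements.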

test_publish_throughput_stress

Stress test with a large thread count.

@pytest.mark.parametrize("n_threads,msg_per_thread", [
    (300, 50),   # Medium: 15K messages
    (900, 10),   # Stress: 9K messages
])
def test_publish_throughput_stress(...)

Target

At 900 threads, throughput should remain above 1,000 msg/s.
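One way to express this target as a check is sketched below. The helper names `throughput` and `meets_target` are hypothetical; the second call uses the 9.11s duration reported for the 900-thread run in the Results section.

```python
MIN_STRESS_THROUGHPUT = 1000.0  # msg/s, the target stated above


def throughput(total_messages, duration_s):
    """Messages per second over the whole run."""
    return total_messages / duration_s


def meets_target(total_messages, duration_s, minimum=MIN_STRESS_THROUGHPUT):
    """True when the run stays strictly above the minimum throughput."""
    return throughput(total_messages, duration_s) > minimum


# 900 threads x 10 messages = 9,000 messages
print(meets_target(900 * 10, duration_s=8.0))   # True  (1,125 msg/s, illustrative duration)
print(meets_target(900 * 10, duration_s=9.11))  # False (about 988 msg/s, reported duration)
```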

test_channel_acquire_release_throughput

Tests channel pool operation overhead.

@pytest.mark.parametrize("n_threads,n_iterations", [
    (100, 100),   # Base case
    (500, 50),    # Higher contention
])
def test_channel_acquire_release_throughput(...)

Measures:

Speed of acquire() / release() without actual AMQP operations
Thread synchronization overhead
Internal lock efficiency
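The shape of such a measurement can be sketched as follows. `ChannelPool` here is a hypothetical stand-in built on `queue.Queue`, not the real pool, and the placeholder channels are plain integers, so no AMQP operations take place.

```python
import queue
import threading
import time


class ChannelPool:
    """Hypothetical stand-in pool: a thread-safe queue of placeholder channels."""

    def __init__(self, size):
        self._free = queue.Queue()
        for channel_id in range(size):
            self._free.put(channel_id)

    def acquire(self):
        return self._free.get()  # blocks while the pool is empty

    def release(self, channel):
        self._free.put(channel)


def bench_acquire_release(pool, n_threads, n_iterations):
    """Return (operations, ops_per_second) for acquire/release pairs."""
    barrier = threading.Barrier(n_threads + 1)

    def worker():
        barrier.wait()  # start all threads together
        for _ in range(n_iterations):
            channel = pool.acquire()
            pool.release(channel)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    barrier.wait()
    started = time.perf_counter()
    for t in threads:
        t.join()
    duration_s = time.perf_counter() - started
    operations = n_threads * n_iterations
    return operations, operations / duration_s


ops, rate = bench_acquire_release(ChannelPool(size=8), n_threads=20, n_iterations=50)
print(ops, f"{rate:.0f} ops/s")
```

With fewer channels than threads, workers block in `acquire()` until a channel is released, which is where the synchronization overhead listed above shows up.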

Results

[Figure: Throughput by Threads]

Throughput by Configuration

Threads	Messages	Duration	Throughput	P50	P99
10	1,000	0.07s	14,986 msg/s	0.44ms	3.27ms
100	10,000	0.52s	19,157 msg/s	3.35ms	12.88ms
300	15,000	1.41s	10,653 msg/s	3.80ms	739ms
900	9,000	9.11s	988 msg/s	8.22ms	8.5s

[Figure: Latency Percentiles]

Channel Pool Performance

Threads	Operations	Throughput	P99
100	10,000	325,309 ops/s	0.00ms
500	25,000	260,867 ops/s	47.08ms

Interpretation

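Why does throughput drop at 900 threads?

At 900 threads, throughput falls to 988 msg/s, just below the 1,000 msg/s target, and P99 latency reaches 8.5s. Contributing factors:

Lock contention: all threads compete for a single _transport_lock.
Context switching: the OS spends more of its time switching between threads.
TCP buffer saturation: send buffers fill faster than the data can be transmitted.
RabbitMQ backpressure: the broker cannot accept messages fast enough.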

Recommendations

Optimal thread count

For maximum throughput, use 50-200 threads per process. To scale beyond that, increase the number of processes rather than the number of threads per process.
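As a rough illustration of this recommendation, the sketch below splits a desired number of concurrent publishers into processes and threads. `plan_workers` is a hypothetical helper, and the default cap of 200 threads per process is taken from the upper end of the range above.

```python
import math


def plan_workers(total_publishers, max_threads_per_process=200):
    """Split a publisher count into (processes, threads_per_process)."""
    if total_publishers < 1 or max_threads_per_process < 1:
        raise ValueError("counts must be positive")
    # Add processes first, then spread the threads evenly across them.
    processes = math.ceil(total_publishers / max_threads_per_process)
    threads_per_process = math.ceil(total_publishers / processes)
    return processes, threads_per_process


print(plan_workers(900))  # (5, 180)
print(plan_workers(100))  # (1, 100)
```

Under this plan, the 900-publisher stress scenario would run as 5 processes with 180 threads each instead of 900 threads in one process.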

Reproduction

# Throughput tests only
pytest tests/benchmarks/bench_throughput.py -v

# Specific test
pytest tests/benchmarks/bench_throughput.py::TestThroughput::test_publish_throughput -v