Benchmarks — Relay

How much faster
is Relay?

Two things decide it: Hit ratio and server distance. Hits stay local, misses pay the round-trip.

Hit ratio

Compare against

localhost

same host · unix socket

7.7×

same-host

container · bridge network

19×

same-AZ

same rack · local network

49×

cross-AZ

same region · inter-DC link

203×

log scale · 96 cores · arm64 · Valkey 9.0.4 · Relay 0.30.0 · PHP 8.5.6

Dataset: Real-world WordPress corpus · igbinary + zstd (level 3) · 6,416 values · p50 85 B / max ~650 KB
Method: 2 runs × 3s · 96 workers, warmed · each bar = client p50 ÷ Relay p50 (cache on)
Latency: localhost ~0.01 ms · same-host ~0.02 ms · same-AZ ~0.10 ms · cross-AZ ~0.31 ms
Machine: AWS c8g.24xlarge · 96 vCPU / 192 GiB · arm64 Graviton4
Software: Relay 0.30.0 · PhpRedis 6.3.0 · Predis 3.5.1 · PHP 8.5.6
Cache: Self-hosted Valkey 9.0.4 · same-AZ & cross-AZ peers

Reduce costs

Less data, fewer commands

Same reads, same dataset, but Relay quietly shrinks the two line items that size a hosting bill: provisioned throughput and the cross-AZ data transfer.

Redis Commands

Relay

20,684

PhpRedis

200,000

Relay

20,684

PhpRedis

200,000

Network Traffic

Relay

8.4 MB

PhpRedis

582 MB

Relay

8.4 MB

PhpRedis

582 MB

96 cores · arm64 · Valkey 9.0.4 · Relay 0.30.0 · PHP 8.5.6

Dataset: 6416-key WordPress corpus (igbinary+zstd) · ~200k ops/pass
Method: 96 workers · 2 runs · baseline phpredis · Relay at 95% hit target (90% measured)
Machine: AWS Graviton4 (c8g.24xlarge) · 96 cores · arm64
Software: Relay 0.30.0 · Valkey 9.0.4 · PhpRedis 6.3.0 · PHP 8.5.6 · Amazon Linux 2023

Concurrency

Scales linearly, never bottlenecks

Relay serves hits three orders of magnitude faster from local memory and throughput climbs near-linearly with cores. PhpRedis and Predis pay a network round-trip on every read and flatten against the server's throughput ceiling.

√ scale · 96 cores · arm64 · Valkey 9.0.4 · Relay 0.30.0 · PHP 8.5.6

Dataset: 6416-key realistic corpus · p50 85 B / avg 1.4 KB / max 866 KB
Method: one worker pinned per physical core · 2 runs × 5s · throughput rstdev < 1%
Cache: Relay igbinary, no zstd · maxmemory 1 GiB / LRU
Machine: AWS Graviton4 (c8g.24xlarge) · 96 physical cores · arm64 · single socket · SMT off
Software: Relay 0.30.0 · Valkey 9.0.4 · PhpRedis 6.3.0 · PHP 8.5.6 · Amazon Linux 2023

Side note

There's also Relay Table

No server, no network — just shared memory. Same idea as APCu, except throughput keeps climbing with every core you add.

√ scale · 96 workers · arm64 · Relay 0.30.0 · PHP 8.5.6

Suite: cachewerk/relay BenchmarkTable @ 2e8bab4
Method: Core-pinned workers swept 1→96 · 2 runs × 5s + warmup · rstdev ≤0.63% (APCu ±5.2%)
Machine: AWS Graviton4 (c8g.24xlarge) · 96 physical cores · arm64 · single socket · SMT off
Software: Relay 0.30.0 · APCu 5.1.28 · PHP 8.5.6 · Amazon Linux 2023

How much faster is Relay?

Reduce costs

Concurrency

Side note

How much faster
is Relay?