How much faster
is Relay?
Two things decide it: Hit ratio and server distance. Hits stay local, misses pay the round-trip. Never slower, often far faster.
log scale · 96 cores · arm64 · Valkey 9.0.4 · Relay 0.30.0 · PHP 8.5.6
- Dataset: Real-world WordPress corpus · igbinary + zstd (level 3) · 6,416 values · p50 85 B / max ~650 KB
- Method: 2 runs × 3s · 96 workers, warmed · each bar = client p50 ÷ Relay p50 (cache on)
- Latency: localhost ~0.01 ms · same-host ~0.02 ms · same-AZ ~0.10 ms · cross-AZ ~0.31 ms
- Machine: AWS c8g.24xlarge · 96 vCPU / 192 GiB · arm64 Graviton4
- Software: Relay 0.30.0 · PhpRedis 6.3.0 · Predis 3.5.1 · PHP 8.5.6
- Cache: Self-hosted Valkey 9.0.4 · same-AZ & cross-AZ peers
Reduce costs
Less data, fewer commands
Same reads, same dataset, but Relay quietly shrinks the two line items that size a hosting bill: provisioned throughput and the cross-AZ data transfer.
96 cores · arm64 · Valkey 9.0.4 · Relay 0.30.0 · PHP 8.5.6
- Dataset: 6416-key WordPress corpus (igbinary+zstd) · ~200k ops/pass
- Method: 96 workers · 2 runs · baseline phpredis · Relay at 95% hit target (90% measured)
- Machine: AWS Graviton4 (c8g.24xlarge) · 96 cores · arm64
- Software: Relay 0.30.0 · Valkey 9.0.4 · PhpRedis 6.3.0 · PHP 8.5.6 · Amazon Linux 2023
Concurrency
Scales linearly, never bottlenecks
Relay serves hits three orders of magnitude faster from local memory and throughput climbs near-linearly with cores. PhpRedis and Predis pay a network round-trip on every read and flatten against the server's throughput ceiling.
√ scale · 96 cores · arm64 · Valkey 9.0.4 · Relay 0.30.0 · PHP 8.5.6
- Dataset: 6416-key realistic corpus · p50 85 B / avg 1.4 KB / max 866 KB
- Method: one worker pinned per physical core · 2 runs × 5s · throughput rstdev < 1%
- Cache: Relay igbinary, no zstd · maxmemory 1 GiB / LRU
- Machine: AWS Graviton4 (c8g.24xlarge) · 96 physical cores · arm64 · single socket · SMT off
- Software: Relay 0.30.0 · Valkey 9.0.4 · PhpRedis 6.3.0 · PHP 8.5.6 · Amazon Linux 2023
Side note
There's also Relay Table
No server, no network — just shared memory. Same idea as APCu, except throughput keeps climbing with every core you add.
√ scale · in-memory · 96 workers · arm64 · Relay 0.30.0 · PHP 8.5.6
- Suite: cachewerk/relay BenchmarkTable @ 2e8bab4
- Method: Core-pinned workers swept 1→96 · 2 runs × 5s + warmup · rstdev ≤0.63% (APCu ±5.2%)
- Machine: AWS Graviton4 (c8g.24xlarge) · 96 physical cores · arm64 · single socket · SMT off
- Software: Relay 0.30.0 · APCu 5.1.28 · PHP 8.5.6 · Amazon Linux 2023