How much faster is Relay?

Two things decide it: hit ratio and server distance. Hits stay local; misses pay the round trip. Never slower, often far faster.

Speedup by server distance
  • localhost (same host · unix socket): 7.7×
  • same-host (container · bridge network): 19×
  • same-AZ (same rack · local network): 49×
  • cross-AZ (same region · inter-DC link): 203×
log scale · 96 cores · arm64 · Valkey 9.0.4 · Relay 0.30.0 · PHP 8.5.6
  • Dataset: Real-world WordPress corpus · igbinary + zstd (level 3) · 6,416 values · p50 85 B / max ~650 KB
  • Method: 2 runs × 3s · 96 workers, warmed · each bar = client p50 ÷ Relay p50 (cache on)
  • Latency: localhost ~0.01 ms · same-host ~0.02 ms · same-AZ ~0.10 ms · cross-AZ ~0.31 ms
  • Machine: AWS c8g.24xlarge · 96 vCPU / 192 GiB · arm64 Graviton4
  • Software: Relay 0.30.0 · PhpRedis 6.3.0 · Predis 3.5.1 · PHP 8.5.6
  • Cache: Self-hosted Valkey 9.0.4 · same-AZ & cross-AZ peers
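
The "hits stay local, misses pay the round trip" idea can be sketched as a simple weighted average. This is a rough model, not the benchmark's method: it uses the round-trip latencies listed above, while `ASSUMED_HIT_MS` is a hypothetical local-hit cost chosen only for illustration. It models mean latency, whereas the bars report a ratio of p50 values, so it is not expected to reproduce the chart.

```python
# Round-trip latencies (ms) quoted in the Latency line above.
ROUND_TRIP_MS = {
    "localhost": 0.01,
    "same-host": 0.02,
    "same-AZ": 0.10,
    "cross-AZ": 0.31,
}

# Hypothetical cost of a local in-memory hit (ms); not a published figure.
ASSUMED_HIT_MS = 0.001


def expected_latency_ms(hit_ratio, hit_ms, miss_ms):
    """Mean read latency when a fraction `hit_ratio` of reads is served locally."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


def mean_speedup(hit_ratio, hit_ms, miss_ms):
    """Ratio of the always-remote latency to the cached mean latency."""
    return miss_ms / expected_latency_ms(hit_ratio, hit_ms, miss_ms)


if __name__ == "__main__":
    # Sweep the four distances at an example hit ratio of 90%.
    for name, rtt in ROUND_TRIP_MS.items():
        print(f"{name}: {mean_speedup(0.9, ASSUMED_HIT_MS, rtt):.1f}x")
```

Under this model a hit ratio of 0 gives a speedup of exactly 1, and the speedup grows with both hit ratio and distance. The mean-based speedup is capped at 1 / (1 - hit ratio); a p50 ratio has no such cap once most reads are hits, which is one reason measured bars can be much larger.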

Reduce costs

Less data, fewer commands

Same reads, same dataset, but Relay quietly shrinks the two line items that size a hosting bill: provisioned throughput and cross-AZ data transfer.

Redis commands
  • PhpRedis: 200,000
  • Relay: 20,684 (-90% commands sent)
Network traffic
  • PhpRedis: 582 MB
  • Relay: 8.4 MB (-99% bytes sent)
96 cores · arm64 · Valkey 9.0.4 · Relay 0.30.0 · PHP 8.5.6
  • Dataset: 6416-key WordPress corpus (igbinary+zstd) · ~200k ops/pass
  • Method: 96 workers · 2 runs · baseline PhpRedis · Relay at 95% hit target (90% measured)
  • Machine: AWS Graviton4 (c8g.24xlarge) · 96 cores · arm64
  • Software: Relay 0.30.0 · Valkey 9.0.4 · PhpRedis 6.3.0 · PHP 8.5.6 · Amazon Linux 2023
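
The two percentages follow directly from the figures above. A minimal arithmetic check, where `percent_reduction` is a helper name introduced here for illustration:

```python
def percent_reduction(baseline, reduced):
    """Percentage by which `reduced` is smaller than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (1.0 - reduced / baseline) * 100.0


# Figures quoted above: commands sent and network traffic (MB).
commands_saved = percent_reduction(200_000, 20_684)
traffic_saved = percent_reduction(582, 8.4)

if __name__ == "__main__":
    print(f"commands: -{commands_saved:.1f}%")
    print(f"traffic:  -{traffic_saved:.1f}%")
```

The command reduction lands just under 90%, which appears consistent with the 90% measured hit ratio in the method notes if each miss maps to roughly one command sent. The traffic reduction is larger than the command reduction, so the reads that still reach the server presumably carry less data on average than the baseline reads did.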

Concurrency

Scales near-linearly, never bottlenecks

Relay serves hits from local memory about three orders of magnitude faster, and its throughput climbs near-linearly with cores. PhpRedis and Predis pay a network round trip on every read and flatten against the server's throughput ceiling.

Reads/sec by CPU cores (1 to 96): Relay 174.2M/s · PhpRedis 171K/s
√ scale · 96 cores · arm64 · Valkey 9.0.4 · Relay 0.30.0 · PHP 8.5.6
  • Dataset: 6416-key realistic corpus · p50 85 B / avg 1.4 KB / max 866 KB
  • Method: one worker pinned per physical core · 2 runs × 5s · throughput rstdev < 1%
  • Cache: Relay igbinary, no zstd · maxmemory 1 GiB / LRU
  • Machine: AWS Graviton4 (c8g.24xlarge) · 96 physical cores · arm64 · single socket · SMT off
  • Software: Relay 0.30.0 · Valkey 9.0.4 · PhpRedis 6.3.0 · PHP 8.5.6 · Amazon Linux 2023
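
The "three orders of magnitude" figure can be checked against the two throughput labels on the chart. A small sketch, assuming both labels are the peak values of the sweep; the per-core number is a plain average derived here, not a measured data point:

```python
import math

# Throughput labels from the chart above (reads per second).
RELAY_READS_PER_SEC = 174.2e6
PHPREDIS_READS_PER_SEC = 171e3
CORES = 96

# How many times more reads per second Relay serves.
ratio = RELAY_READS_PER_SEC / PHPREDIS_READS_PER_SEC

# Base-10 logarithm of the ratio: 3 means a factor of 1,000.
orders_of_magnitude = math.log10(ratio)

# Average Relay throughput per core across the 96-core machine.
relay_per_core = RELAY_READS_PER_SEC / CORES

if __name__ == "__main__":
    print(f"ratio: {ratio:.0f}x")
    print(f"orders of magnitude: {orders_of_magnitude:.2f}")
    print(f"Relay per core: {relay_per_core / 1e6:.2f}M reads/sec")
```

The ratio comes out slightly above 1,000, which matches the wording in the paragraph above.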

Side note

There's also Relay Table

No server, no network — just shared memory. Same idea as APCu, except throughput keeps climbing with every core you add.

Ops/sec by concurrent workers (1 to 96): Relay\Table 135.1M/s · APCu 6M/s
√ scale · in-memory · 96 workers · arm64 · Relay 0.30.0 · PHP 8.5.6
  • Suite: cachewerk/relay BenchmarkTable @ 2e8bab4
  • Method: Core-pinned workers swept 1→96 · 2 runs × 5s + warmup · rstdev ≤0.63% (APCu ±5.2%)
  • Machine: AWS Graviton4 (c8g.24xlarge) · 96 physical cores · arm64 · single socket · SMT off
  • Software: Relay 0.30.0 · APCu 5.1.28 · PHP 8.5.6 · Amazon Linux 2023