How to Benchmark RPC Providers Correctly

#benchmark #infrastructure #performance #rpc #web3

Most RPC comparisons look convincing on the surface. Numbers are shown, charts are shared, and conclusions are confidently drawn.

Yet in practice, teams repeatedly discover that the “best-performing” provider in a benchmark behaves very differently in production.

This disconnect exists because most RPC benchmarks don’t measure reliability at all. They measure convenience. Or worse, they measure whatever happens to be easiest to collect.

This article explains how to benchmark RPC providers correctly—not to produce impressive charts, but to uncover how systems actually behave under real conditions.


Why Most RPC Benchmarks Are Misleading

The majority of published RPC benchmarks share the same flaws:

  • They focus on a single metric
  • They test unrealistic workloads
  • They ignore variance and degradation
  • They optimize for speed, not correctness

As a result, they answer the wrong question.

The goal of benchmarking is not to prove that one provider is faster on a good day. The goal is to understand how a system behaves when conditions are imperfect—which is when reliability actually matters.


Start by Defining the Workload

Before measuring anything, you must define what you are testing.

An RPC workload is not generic. It depends on:

  • Read-heavy vs write-heavy usage
  • Bursty vs sustained traffic
  • Latency-sensitive vs throughput-oriented clients
  • Single-region vs multi-region access patterns

A provider that performs well for read-only queries may degrade quickly under write pressure. A provider optimized for sustained throughput may struggle with sudden bursts.

If the workload is undefined, the benchmark is meaningless.
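One way to make the workload explicit is to write it down as a small spec before any load is generated. The sketch below is illustrative — the field names and example workloads are assumptions, not a standard — but it captures the dimensions listed above:

```python
from dataclasses import dataclass

# Hypothetical workload spec -- the fields mirror the dimensions above.
@dataclass(frozen=True)
class RpcWorkload:
    read_ratio: float        # fraction of requests that are reads (0.0-1.0)
    pattern: str             # "sustained" or "bursty"
    latency_sensitive: bool  # tail latency matters more than raw throughput
    regions: tuple           # client regions the benchmark runs from

# Example: a latency-sensitive, read-heavy dApp frontend in two regions.
frontend = RpcWorkload(
    read_ratio=0.95,
    pattern="bursty",
    latency_sensitive=True,
    regions=("us-east", "eu-west"),
)

# A write-heavy transaction bot looks very different -- and so should its benchmark.
tx_bot = RpcWorkload(
    read_ratio=0.2,
    pattern="sustained",
    latency_sensitive=False,
    regions=("us-east",),
)
```

Two specs this different should never share a single benchmark result.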


Measure Distributions, Not Averages

Average latency is one of the least useful metrics in RPC benchmarking.

A system where:

  • 90% of requests return in 50 ms
  • 10% return in 3 seconds

…can still look “fast” on average: the mean is 0.9 × 50 + 0.1 × 3000 = 345 ms, even though one request in ten takes three seconds.

In practice, that tail latency defines user experience and system stability. This is where RPC infrastructure begins to degrade long before it fails outright—a pattern explored in detail when examining how RPC nodes degrade rather than fail.

Meaningful benchmarks must include:

  • p50, p95, and p99 latency
  • Variance over time
  • Behavior under increasing load

If tail latency is not visible, degradation is already being missed.
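The distribution from the example above makes the point concrete. This sketch uses the nearest-rank percentile method (one of several conventions; libraries may interpolate differently):

```python
import math
import statistics

def percentile(samples, p):
    """Nearest-rank percentile: the value at or below which p% of samples fall."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# The distribution from the text: 90% of requests at 50 ms, 10% at 3000 ms.
latencies_ms = [50] * 90 + [3000] * 10

mean = statistics.mean(latencies_ms)  # 345.0 -- looks "fast"
p50 = percentile(latencies_ms, 50)    # 50
p95 = percentile(latencies_ms, 95)    # 3000
p99 = percentile(latencies_ms, 99)    # 3000
```

The mean hides a 60× gap between the median and the tail. Reporting p50, p95, and p99 side by side makes that gap impossible to miss.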


Test Burst Behavior Explicitly

Many RPC providers perform well under steady load and fail under bursts.

Burst testing reveals:

  • Queue depth limits
  • Backpressure behavior
  • Cold-path performance
  • Retry amplification effects

A proper benchmark should include:

  • Sudden traffic spikes
  • Ramp-up and ramp-down phases
  • Mixed read/write bursts

Without this, benchmarks only describe ideal conditions—conditions that rarely exist in production.
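A burst test needs a request-rate schedule, not a constant. A minimal sketch of the shape described above — ramp up, hold a spike, ramp down — looks like this (real load generators such as k6 or vegeta offer richer stage definitions, but the idea is the same):

```python
def burst_schedule(base_rps, peak_rps, ramp_s, spike_s):
    """Per-second target request rates: linear ramp up, held spike, ramp down."""
    up = [base_rps + (peak_rps - base_rps) * (i + 1) // ramp_s for i in range(ramp_s)]
    spike = [peak_rps] * spike_s
    down = list(reversed(up))
    return up + spike + down

# Jump from 50 rps to a 500 rps spike over 5 seconds, hold for 10, come back down.
schedule = burst_schedule(base_rps=50, peak_rps=500, ramp_s=5, spike_s=10)
```

Feeding a schedule like this to the load generator — and recording latency per second, not per run — is what surfaces queue limits and retry amplification.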


Watch for Hidden Throttling

Not all rate limits are explicit.

Some providers introduce:

  • Soft throttling
  • Priority queues
  • Client-specific slowdowns
  • Adaptive latency under load

These behaviors are difficult to detect unless benchmarks are designed to surface them.

This is why rate limits are often confused with reliability, even though they primarily conceal capacity constraints rather than solve them.

Benchmarking should look for:

  • Latency increases without errors
  • Throughput plateaus
  • Uneven response times across identical requests

These signals indicate throttling long before failures appear.
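Those signals can be checked mechanically across measurement windows. The heuristic below is a sketch — the 1.5×, 1.2×, and 1.05× thresholds are illustrative, not tuned values — but it shows the shape of the analysis:

```python
def soft_throttle_signals(windows):
    """Flag window-to-window transitions whose shape suggests hidden throttling.

    Each window is a dict with offered_rps, achieved_rps, p95_ms, error_rate.
    Thresholds are illustrative placeholders, not recommended values.
    """
    signals = []
    for prev, cur in zip(windows, windows[1:]):
        # Latency rose sharply while the error rate stayed flat: classic soft throttling.
        latency_creep = (cur["p95_ms"] > prev["p95_ms"] * 1.5
                         and cur["error_rate"] <= prev["error_rate"])
        # Offered load grew, but achieved throughput barely moved: a plateau.
        plateau = (cur["offered_rps"] > prev["offered_rps"] * 1.2
                   and cur["achieved_rps"] < prev["achieved_rps"] * 1.05)
        if latency_creep:
            signals.append("latency rose without errors")
        if plateau:
            signals.append("throughput plateaued under rising load")
    return signals

windows = [
    {"offered_rps": 100, "achieved_rps": 100, "p95_ms": 80,  "error_rate": 0.0},
    {"offered_rps": 200, "achieved_rps": 104, "p95_ms": 400, "error_rate": 0.0},
]
signals = soft_throttle_signals(windows)
```

Note that neither signal involves an error count: that is precisely what makes soft throttling invisible to error-rate dashboards.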


Measure Freshness and Consistency

Performance is not only about speed.

For blockchain RPCs, correctness depends on:

  • State freshness
  • Slot or block lag
  • Consistent responses across nodes

A fast response that reflects stale state can be worse than a slow but accurate one.

Benchmarks should therefore include:

  • Measurements of state lag
  • Consistency checks across regions
  • Comparison of read results over short intervals

Ignoring freshness turns performance testing into a race, not a reliability assessment.
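State lag is cheap to measure: poll each provider's latest block height (e.g. via `eth_blockNumber` on EVM chains) at the same instant and compare against the highest value observed. The provider names and staleness threshold below are hypothetical:

```python
def block_lag(heights):
    """Lag of each provider behind the highest block height seen in this snapshot.

    heights: provider name -> latest block height it returned, polled simultaneously.
    """
    tip = max(heights.values())
    return {name: tip - h for name, h in heights.items()}

# Hypothetical snapshot: three providers polled at the same moment.
snapshot = {"provider_a": 19_000_002, "provider_b": 19_000_002, "provider_c": 18_999_994}
lags = block_lag(snapshot)
stale = [name for name, lag in lags.items() if lag > 2]  # threshold is illustrative
```

Repeating this over short intervals turns a one-off spot check into a freshness distribution, which belongs in the benchmark report alongside latency percentiles.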


Separate Read and Write Paths

Reads and writes stress fundamentally different parts of the system.

Reads test:

  • Caching layers
  • Data propagation
  • Node synchronization

Writes test:

  • Consensus interaction
  • Queueing
  • Backpressure handling

A provider that excels at one may struggle with the other. Benchmarking them together hides this distinction and obscures the real bottlenecks.

Always measure read and write paths independently before combining them.
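Keeping the paths separate is mostly a matter of tagging each sample at collection time and never aggregating across tags. A minimal sketch, assuming samples arrive as `(path, latency_ms)` pairs:

```python
from collections import defaultdict

def summarize_by_path(samples):
    """Summarize latency samples per path so reads never mask writes.

    samples: list of (path, latency_ms) pairs, where path is "read" or "write".
    """
    by_path = defaultdict(list)
    for path, ms in samples:
        by_path[path].append(ms)
    return {path: {"count": len(v), "worst_ms": max(v)}
            for path, v in by_path.items()}

samples = [("read", 40), ("read", 55), ("write", 900), ("write", 120)]
summary = summarize_by_path(samples)
# Reads look healthy; the write path's 900 ms worst case would vanish
# inside a combined summary dominated by cheap reads.
```

The same split applies to every metric in this article — percentiles, burst behavior, throttle signals — not just worst-case latency.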


What a Meaningful RPC Benchmark Looks Like

A useful benchmark is not impressive—it is uncomfortable.

It:

  • Exposes degradation
  • Reveals trade-offs
  • Produces variance, not just numbers
  • Is reproducible and transparent

Most importantly, it answers a practical question:

“How will this system behave when my application is under stress?”

Benchmarks that cannot answer this question are marketing artifacts, not engineering tools.


Benchmarking Is About Understanding, Not Winning

The purpose of benchmarking is not to declare a winner.

It is to understand:

  • Where systems break
  • How they degrade
  • What signals appear first
  • Which metrics actually matter

RPC reliability is not a number you publish. It is a behavior you observe over time.

In the next article, we’ll trace a Web3 request end-to-end to show where latency, inconsistency, and degradation are introduced long before an RPC node ever goes offline.
