The Hidden Tail

A histogram preserves the shape of your traffic.
The average says everyone is fine. But the shape tells a different story.
[Figure: checkout-svc latency. The dashboard reports an average response time of 197 ms and a "healthy" status. The histogram of the actual request distribution over the last 5 minutes marks p50 = 52 ms, p95 = 210 ms, and p99 = 3200 ms.]
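
To make the idea of "shape" concrete, here is a minimal sketch of bucketing latencies into a histogram. The sample data and the bucket bounds are invented for illustration; they are not the real checkout-svc measurements, and `histogram` is a hypothetical helper rather than any monitoring library's API.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds in milliseconds (an assumption for this sketch).
BOUNDS_MS = [100, 250, 1000, 5000]


def histogram(latencies_ms, bounds=BOUNDS_MS):
    """Count how many latencies fall into each bucket.

    Bucket i holds values <= bounds[i] and > bounds[i - 1].
    The final slot counts anything above the last bound.
    """
    counts = [0] * (len(bounds) + 1)
    for value in latencies_ms:
        counts[bisect_left(bounds, value)] += 1
    return counts


# Synthetic traffic: mostly fast requests plus a small, very slow tail.
latencies_ms = [50] * 900 + [200] * 60 + [3500] * 40

counts = histogram(latencies_ms)
labels = [f"<= {b} ms" for b in BOUNDS_MS] + [f"> {BOUNDS_MS[-1]} ms"]
for label, count in zip(labels, counts):
    print(f"{label:>12}: {count}")
```

The slow requests stay visible as their own bucket instead of being folded into a single number.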

The average says 197ms. Is everything healthy?

Which percentile would catch this problem?

Why do teams alert on percentiles instead of averages?

Outliers

An average smooths away the pain of the few. Percentiles expose it.
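
The comparison can be sketched in a few lines. The data below is synthetic and chosen so that the mean lands near the figure on the dashboard; it is not the real request log. `percentile` is a hypothetical helper that uses the nearest-rank method, which is one of several common percentile definitions.

```python
import math
from statistics import mean


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Synthetic traffic (an assumption): 90% fast, 6% slower, 4% very slow.
latencies_ms = [50] * 900 + [200] * 60 + [3500] * 40

avg = mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)

print(f"mean = {avg} ms")
print(f"p50  = {p50} ms")
print(f"p95  = {p95} ms")
print(f"p99  = {p99} ms")
```

In this sample the mean looks unremarkable, while p99 lands directly on the slow requests that the mean blends in with the fast ones.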
