Koan 5 of 19

What is a Trace?

The metric said something was slow.
The logs said something went wrong.
But neither could say where, or in what order.
incident report
Users report that checkout is slow.
You have metrics and logs from four services: gateway, api, orders, and database. Can you find the bottleneck?
metrics dashboard
842 requests / min
1.2s latency p99
Request count looks normal. But latency is high.
The metric tells you something is slow — but not which service.
log viewer — last 5 seconds
14:32:01.204 [gateway] incoming POST /checkout
14:32:01.218 [api] validating cart items
14:32:01.307 [database] query inventory_check: OK
14:32:01.412 [orders] reserve_stock called
14:32:01.884 [database] write order_record: OK
14:32:02.003 [orders] confirmation generated
14:32:02.117 [api] response sent: 200 OK
14:32:02.130 [gateway] POST /checkout completed
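
One way to squeeze more out of these logs is to measure the time between consecutive lines. The sketch below does that for the eight lines above. The names `parse` and `gaps_ms` are hypothetical helpers written for this illustration, not part of any logging tool.

```python
from datetime import datetime, timedelta

# The eight log lines from the viewer above, copied verbatim.
LOG_LINES = """\
14:32:01.204 [gateway] incoming POST /checkout
14:32:01.218 [api] validating cart items
14:32:01.307 [database] query inventory_check: OK
14:32:01.412 [orders] reserve_stock called
14:32:01.884 [database] write order_record: OK
14:32:02.003 [orders] confirmation generated
14:32:02.117 [api] response sent: 200 OK
14:32:02.130 [gateway] POST /checkout completed
""".splitlines()


def parse(line):
    """Split one log line into (timestamp, service name)."""
    stamp, rest = line.split(" ", 1)
    service = rest[1:rest.index("]")]
    return datetime.strptime(stamp, "%H:%M:%S.%f"), service


def gaps_ms(lines):
    """Return (earlier service, later service, gap in ms) for each adjacent pair."""
    parsed = [parse(line) for line in lines]
    gaps = []
    for (t0, s0), (t1, s1) in zip(parsed, parsed[1:]):
        gaps.append((s0, s1, (t1 - t0) // timedelta(milliseconds=1)))
    return gaps


for earlier, later, gap in gaps_ms(LOG_LINES):
    print(f"{earlier:>8} -> {later:<8} {gap:4d} ms")
```

A gap only says that time passed between two log lines. It does not say which service spent that time, or whether the two lines even belong to the same request.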

From the metrics and logs alone, can you tell which service is causing the slowness?

following one request
timing breakdown:
gateway: 0 – 1200ms
api: 14 – 950ms
orders: 100 – 800ms
database: 140 – 360ms
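
The timing breakdown can be turned into a rough answer by computing each service's self time: its own duration minus the time spent in the service it called. The sketch below assumes the four spans form a single nested chain (gateway calls api, api calls orders, orders calls database), which the breakdown suggests but does not state. `Span` and `self_times` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Span:
    service: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self):
        return self.end_ms - self.start_ms


# Values from the timing breakdown, ordered outermost to innermost.
# Assumption: each span is the only child of the span before it.
SPANS = [
    Span("gateway", 0, 1200),
    Span("api", 14, 950),
    Span("orders", 100, 800),
    Span("database", 140, 360),
]


def self_times(chain):
    """Duration of each span minus the duration of its direct child."""
    result = {}
    for i, span in enumerate(chain):
        child_ms = chain[i + 1].duration_ms if i + 1 < len(chain) else 0
        result[span.service] = span.duration_ms - child_ms
    return result


times = self_times(SPANS)
for service, ms in times.items():
    print(f"{service:>8}: {ms} ms of its own time")
print("largest self time:", max(times, key=times.get))
```

Under that nesting assumption, orders has the largest self time (480 ms of the 1200 ms total), so it is the first place to look. If the real call graph has siblings or parallel calls, the subtraction would need each span's actual children instead of the next item in the list.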

What you just did — following one request through multiple services — is called a...

trace

A trace follows a single request from beginning to end.

What metrics and logs could not reveal alone, the trace shows.
