Daily Health Digest
A production summary designed for daily standups and incident review.
Generated at 08:00 UTC
Top 5 issues
#1
Tool call failed silently after retries
high19 events
#2
Agent forgot prior context in follow-up step
high14 events
#3
Response exceeded cost threshold
medium12 events
#4
Latency spike on retrieval tool
medium9 events
#5
Hallucinated unsupported refund policy
low6 events
Configure Slack webhook
Cost anomalies
- customer-support-v3 spent 42% above baseline between 13:00-14:00.
- tool:browser increased average cost/event from $0.0021 to $0.0035.
- Prompt expansion added +18% tokens in escalation workflow.
Error spike alerts
- 09:24 UTC: retrieval timeout errors crossed threshold (27 in 5m).
- 10:40 UTC: malformed JSON output rose 3.1x on release 2026.02.16.2.
- 11:05 UTC: tool retry loops detected in two production agents.
Model performance comparison
| Model | Success rate | P95 latency | Avg cost / 1k tokens |
|---|---|---|---|
| gpt-4.1 | 96.1% | 1.8s | $0.0048 |
| claude-sonnet | 94.4% | 2.1s | $0.0039 |
| gemini-pro | 92.8% | 1.6s | $0.0032 |