Daily Health Digest

A production summary designed for daily standups and incident review.

Generated at 08:00 UTC

Top 5 issues

  • #1 Tool call failed silently after retries (high, 19 events)
  • #2 Agent forgot prior context in follow-up step (high, 14 events)
  • #3 Response exceeded cost threshold (medium, 12 events)
  • #4 Latency spike on retrieval tool (medium, 9 events)
  • #5 Hallucinated unsupported refund policy (low, 6 events)

Cost anomalies

  • customer-support-v3 spent 42% above baseline between 13:00 and 14:00.
  • tool:browser increased average cost/event from $0.0021 to $0.0035.
  • Prompt expansion added 18% more tokens in the escalation workflow.
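
The figures in this list are relative changes against a baseline. A minimal sketch of that arithmetic, assuming the plain (observed - baseline) / baseline definition; the function name `pct_change` is hypothetical, and the digest does not state which rule it uses to flag an anomaly:

```python
def pct_change(baseline: float, observed: float) -> float:
    """Relative change of observed against baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (observed - baseline) / baseline * 100.0


# tool:browser average cost/event, using the two figures from the digest.
increase = pct_change(0.0021, 0.0035)
print(f"tool:browser cost/event change: {increase:+.1f}%")
```

Under that definition, the move from $0.0021 to $0.0035 per event is an increase of roughly 67%, which puts it on the same scale as the 42% and 18% figures in the other two bullets.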

Error spike alerts

  • 09:24 UTC: retrieval timeout errors crossed threshold (27 in 5m).
  • 10:40 UTC: malformed JSON output rose 3.1x on release 2026.02.16.2.
  • 11:05 UTC: tool retry loops detected in two production agents.
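
An alert such as "27 in 5m" implies a count over a trailing time window. The sketch below shows one common way to do that with a sliding window. The class name `SlidingWindowAlert`, the threshold of 25, and the sample timestamps are all assumptions for illustration; the digest does not state the real threshold or how the alerting is implemented.

```python
from collections import deque
from datetime import datetime, timedelta


class SlidingWindowAlert:
    """Counts events in a trailing time window and flags threshold crossings."""

    def __init__(self, window: timedelta, threshold: int) -> None:
        self.window = window
        self.threshold = threshold
        self._timestamps = deque()  # event times, oldest first

    def record(self, ts: datetime) -> bool:
        """Add one event; return True if the window count exceeds the threshold.

        Assumes timestamps arrive in non-decreasing order.
        """
        self._timestamps.append(ts)
        cutoff = ts - self.window
        # Drop events that have aged out of the trailing window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold


# Hypothetical threshold of 25 timeouts per 5 minutes; the date is arbitrary.
alert = SlidingWindowAlert(window=timedelta(minutes=5), threshold=25)
start = datetime(2000, 1, 1, 9, 20)
# 27 timeout events, one every 10 seconds, all inside a single window.
fired = [alert.record(start + timedelta(seconds=10 * i)) for i in range(27)]
first_alert = fired.index(True) + 1  # 1-based position of the first alerting event
print(f"first alert on event {first_alert} of {len(fired)}")
```

With these assumed values the alert first fires on the 26th event, the first one that pushes the window count above 25. A deque keeps eviction cheap because old events are only ever removed from the front.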

Model performance comparison

  Model          Success rate  P95 latency  Avg cost / 1k tokens
  gpt-4.1        96.1%         1.8s         $0.0048
  claude-sonnet  94.4%         2.1s         $0.0039
  gemini-pro     92.8%         1.6s         $0.0032
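
To compare the models column by column, the rows can be loaded into a small structure and ranked per metric. This is only a sketch: the `ModelStats` type and its field names are hypothetical, and the values are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelStats:
    name: str
    success_rate: float   # percent
    p95_latency_s: float  # seconds
    cost_per_1k: float    # USD per 1k tokens


ROWS = [
    ModelStats("gpt-4.1", 96.1, 1.8, 0.0048),
    ModelStats("claude-sonnet", 94.4, 2.1, 0.0039),
    ModelStats("gemini-pro", 92.8, 1.6, 0.0032),
]

# Pick the leader on each metric independently.
best_success = max(ROWS, key=lambda m: m.success_rate)
fastest = min(ROWS, key=lambda m: m.p95_latency_s)
cheapest = min(ROWS, key=lambda m: m.cost_per_1k)

print(best_success.name, fastest.name, cheapest.name)
```

No single model leads on every column: gpt-4.1 has the highest success rate, while gemini-pro has both the lowest P95 latency and the lowest average cost.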