Skip to content

Self-Healing

Your agent recovers from problems on its own. No silent failures, no stale sessions, no unanswered messages.

If a Telegram message goes unanswered for 2+ minutes, an LLM-powered triage nurse activates:

  1. Diagnoses the problem (session crashed, session stalled, session busy)
  2. Treats it (nudge the session, interrupt, or restart)
  3. Verifies recovery
  4. Escalates if treatment fails
Terminal window
curl localhost:4040/triage/status
curl localhost:4040/triage/history

Polls all active sessions every 60 seconds. Detects:

  • Dead sessions — Process no longer running
  • Unresponsive sessions — Running but not producing output
  • Idle sessions — No activity for too long

Coordinates automatic recovery for each case.

When the agent says “working on it” or “give me a minute,” a timer starts. If no follow-up arrives within the expected window, the agent is nudged and the user is notified.

When a fallback activates (e.g., LLM provider unavailable, file write failed), it’s:

  • Logged with full context
  • Reported via Telegram
  • Surfaced in the health dashboard

Never silently swallowed. All catch blocks are audited with zero silent fallbacks allowed.

When context compaction drops a user message mid-session, the agent detects the gap and re-surfaces the unanswered message. No more silent drops during long sessions.