Self-Healing
Your agent recovers from problems on its own. No silent failures, no stale sessions, no unanswered messages.
Stall Detection
Section titled “Stall Detection”If a Telegram message goes unanswered for 2+ minutes, an LLM-powered triage nurse activates:
- Diagnoses the problem (session crashed, session stalled, session busy)
- Treats it (nudge the session, interrupt, or restart)
- Verifies recovery
- Escalates if treatment fails
curl localhost:4040/triage/statuscurl localhost:4040/triage/historySession Monitoring
Section titled “Session Monitoring”Polls all active sessions every 60 seconds. Detects:
- Dead sessions — Process no longer running
- Unresponsive sessions — Running but not producing output
- Idle sessions — No activity for too long
Coordinates automatic recovery for each case.
Promise Tracking
Section titled “Promise Tracking”When the agent says “working on it” or “give me a minute,” a timer starts. If no follow-up arrives within the expected window, the agent is nudged and the user is notified.
Loud Degradation
Section titled “Loud Degradation”When a fallback activates (e.g., LLM provider unavailable, file write failed), it’s:
- Logged with full context
- Reported via Telegram
- Surfaced in the health dashboard
Never silently swallowed. All catch blocks are audited with zero silent fallbacks allowed.
Unanswered Message Detection
Section titled “Unanswered Message Detection”When context compaction drops a user message mid-session, the agent detects the gap and re-surfaces the unanswered message. No more silent drops during long sessions.