Coherence Is Safety
Without coherence, autonomous agents are a security risk:
- An agent that doesn’t remember it already sent an email sends it again
- An agent that doesn’t track its own decisions contradicts itself
- An agent without values makes expedient choices
Instar’s safety features are coherence features.
Decision Journaling
Section titled “Decision Journaling”Every significant decision is recorded with reasoning. The agent can explain why it did what it did, and detect when it’s drifting from purpose.
# Query the decision journalcurl localhost:4040/intent/journal
# Check alignment scorecurl localhost:4040/intent/alignmentOperation Safety Gates
Section titled “Operation Safety Gates”External actions are evaluated by an LLM-supervised gate before execution:
- Risk classification — Every external operation scored on mutability, reversibility, and scope
- Bulk deletes and irreversible sends require explicit approval
- Adaptive trust — Trust levels evolve per service based on track record
- New services start supervised — consistent success earns autonomy
Born from a real incident where an AI agent deleted a user’s emails.
Drift Detection
Section titled “Drift Detection”Compares decision patterns across time windows. Detects when behavior shifts from stated purpose. Measures:
- Conflict frequency
- Confidence trends
- Principle consistency
Autonomy Profiles
Section titled “Autonomy Profiles”Trust elevation rewards consistent, value-aligned behavior with increasing independence. Safety that grows with the agent.
Emergency stop is always available — say “stop everything” and the MessageSentinel halts operations before normal routing.
Behavioral Hooks
Section titled “Behavioral Hooks”Structural guardrails that fire automatically:
| Hook | Type | What it does |
|---|---|---|
| Dangerous command guard | PreToolUse (blocking) | Blocks rm -rf, force push, database drops |
| External operation gate | PreToolUse (blocking) | LLM-supervised safety for external calls |
| Grounding before messaging | PreToolUse (advisory) | Forces identity re-read before communication |
| Deferral detector | PreToolUse (advisory) | Catches the agent deferring work it could do |
The Insight
Section titled “The Insight”Every safety feature exists because coherence is the safety mechanism. An agent that knows who it is, who you are, and what you both stand for — that’s an agent you can trust.