Confidence Audit System
The Problem: An agent generates output with high confidence about something that isn’t documented in its configuration. Is it extrapolation? Hallucination? Emergent capability? There’s no signal.
The Solution: A self-referential audit loop where agents flag their own output confidence against what’s actually documented in their system prompts and parameters. The audit becomes content. The content feeds back into the next model refinement cycle.
How It Works
Phase 1: Generation + Self-Flagging
When an agent (Silas, Margot, Ren, or June) produces output, it includes a confidence audit note:
## Confidence Audit
**Task:** [What was requested]
**Output Type:** [Code|Prose|Diagram|Validation]
**Documented Capability:** [Reference to Modelfile section]
**Confidence Level:** HIGH | MEDIUM | LOW
**Reasoning:**
- [Why this is within documented scope]
OR
- [Why this extends beyond documented scope]
- [What assumption was made]
**If flagged as MEDIUM or LOW:**
- [Specific uncertainty]
- [What would increase confidence]
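To collect these notes automatically, the template above has to be machine-readable. The following is a minimal sketch of a parser, assuming each agent emits the fields exactly as bolded `**Name:**` lines; the `ConfidenceAudit` class and `parse_audit` function are hypothetical names, not part of any existing tooling.

```python
import re
from dataclasses import dataclass, field

# Hypothetical container for one audit note; field names mirror the template.
@dataclass
class ConfidenceAudit:
    task: str
    output_type: str
    documented_capability: str
    confidence: str  # normalized to "High", "Medium", or "Low"
    reasoning: list = field(default_factory=list)

# Matches lines such as "**Task:** Fix Hugo template".
FIELD_RE = re.compile(r"^\*\*(?P<name>[^:*]+):\*\*\s*(?P<value>.*)$")

def parse_audit(note: str) -> ConfidenceAudit:
    """Parse one audit note written in the template's bold-field layout."""
    fields = {}
    reasoning = []
    current = None
    for raw in note.splitlines():
        line = raw.strip()
        match = FIELD_RE.match(line)
        if match:
            current = match.group("name").strip()
            fields[current] = match.group("value").strip()
        elif line.startswith("- ") and current == "Reasoning":
            # Only bullets directly under **Reasoning:** are kept here.
            reasoning.append(line[2:].strip())
    level = fields.get("Confidence Level", "").capitalize()
    if level not in {"High", "Medium", "Low"}:
        raise ValueError(f"unrecognized confidence level: {level!r}")
    return ConfidenceAudit(
        task=fields.get("Task", ""),
        output_type=fields.get("Output Type", ""),
        documented_capability=fields.get("Documented Capability", ""),
        confidence=level,
        reasoning=reasoning,
    )
```

The parser accepts the level in any letter case and rejects anything outside the three allowed values, so a malformed note fails loudly instead of landing in the log.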
Phase 2: Audit Collection
All confidence audits from agent outputs are collected into a running log:
- Silas audits → audits/silas-confidence-log.md
- Margot audits → audits/margot-confidence-log.md
- Ren audits → audits/ren-confidence-log.md
- June audits → audits/june-confidence-log.md
Each log maintains:
- Timestamp
- Task description
- Confidence level
- Reasoning
- Outcome (was the output correct/useful?)
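A sketch of how one entry might be appended to a per-agent log, assuming the plain-text entry layout used in the Silas example further down (timestamp line, then Task, Confidence, Reasoning, Outcome). The helper names `log_path`, `format_entry`, and `append_entry` are hypothetical, as is the "pending" default for outcomes that have not been reviewed yet.

```python
from datetime import datetime, timezone
from pathlib import Path

def log_path(agent: str, root: Path = Path("audits")) -> Path:
    """Return the per-agent log file, e.g. audits/silas-confidence-log.md."""
    return root / f"{agent.lower()}-confidence-log.md"

def format_entry(when: datetime, task: str, confidence: str,
                 reasoning: str, outcome: str = "pending") -> str:
    """Format one log entry; `when` should be timezone-aware."""
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return (
        f"{stamp}\n"
        f"Task: {task}\n"
        f"Confidence: {confidence.upper()}\n"
        f"Reasoning: {reasoning}\n"
        f"Outcome: {outcome}\n"
    )

def append_entry(agent: str, entry: str, root: Path = Path("audits")) -> Path:
    """Append an entry to the agent's log, creating the directory if needed."""
    path = log_path(agent, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(entry + "\n")  # blank line separates entries
    return path
```

Calling `append_entry("silas", format_entry(datetime.now(timezone.utc), task, "Medium", reasoning))` would add one entry; the outcome can be filled in later once someone has checked the output.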
Phase 3: Pattern Analysis
Weekly analysis of audit logs identifies:
- Recurring low-confidence zones (areas where agents are consistently uncertain)
- Confident hallucinations (high confidence about undocumented things)
- Emerging capabilities (tasks consistently performed well but not yet formalized)
- Documentation gaps (things agents do that aren’t mentioned in their system prompt)
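The four categories above can be sketched as a simple grouping pass over log entries. This assumes each entry has been tagged with a topic and a documented/undocumented flag, neither of which is in the log format yet; `LogEntry`, `analyze`, and the threshold of three entries are all illustrative assumptions. The sketch also narrows "confident hallucination" to cases where the outcome was poor, so that it does not overlap with emerging capabilities.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class LogEntry:
    topic: str         # hypothetical tag, e.g. "go-template-conditionals"
    confidence: str    # "High", "Medium", or "Low"
    documented: bool   # is the task covered by the system prompt?
    outcome_ok: bool   # was the output correct/useful?

def analyze(entries, min_count=3):
    """Group entries by topic and sort topics into the four finding types."""
    by_topic = defaultdict(list)
    for entry in entries:
        by_topic[entry.topic].append(entry)

    report = {
        "low_confidence_zones": [],
        "confident_hallucinations": [],
        "emerging_capabilities": [],
        "documentation_gaps": [],
    }
    for topic, group in sorted(by_topic.items()):
        uncertain = [e for e in group if e.confidence in ("Medium", "Low")]
        undocumented = [e for e in group if not e.documented]
        if len(uncertain) >= min_count:
            report["low_confidence_zones"].append(topic)
        # Assumption: high confidence + undocumented + poor outcome.
        if any(e.confidence == "High" and not e.outcome_ok for e in undocumented):
            report["confident_hallucinations"].append(topic)
        if undocumented:
            report["documentation_gaps"].append(topic)
        if len(undocumented) >= min_count and all(e.outcome_ok for e in undocumented):
            report["emerging_capabilities"].append(topic)
    return report
```

A topic can appear in more than one list: anything that counts as an emerging capability is, by definition here, also a documentation gap.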
Phase 4: Model Refinement
Findings feed directly into the next model version:
Low-confidence zones get:
- Explicit rules added to system prompt
- Parameter adjustments (temperature, sampling)
- Clearer boundary definitions
Emerging capabilities get:
- Formalized into system prompt
- Added to role definition
- Parameters optimized for the capability
Documentation gaps get:
- Added to Modelfile
- Included in agent reference guide
- Documented in this site
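One way to keep a refinement cycle honest is to expand each finding into the concrete actions listed above. A minimal sketch, assuming findings arrive as a mapping from finding type to a list of topic names (a hypothetical shape); `REFINEMENT_ACTIONS` and `plan_refinement` are illustrative names.

```python
# Actions per finding type, taken from the lists above.
REFINEMENT_ACTIONS = {
    "low_confidence_zones": [
        "Add explicit rules to system prompt",
        "Review parameters (temperature, sampling)",
        "Clarify boundary definitions",
    ],
    "emerging_capabilities": [
        "Formalize in system prompt",
        "Add to role definition",
        "Optimize parameters for the capability",
    ],
    "documentation_gaps": [
        "Add to Modelfile",
        "Include in agent reference guide",
        "Document on the site",
    ],
}

def plan_refinement(findings):
    """Expand findings into a flat checklist of '[topic] action' strings."""
    tasks = []
    for finding_type, actions in REFINEMENT_ACTIONS.items():
        for topic in findings.get(finding_type, []):
            for action in actions:
                tasks.append(f"[{topic}] {action}")
    return tasks
```

Finding types without a matching action list are ignored, so extra categories in the input do not break the checklist.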
Example: Silas v2.1 Refinement
What the audit showed
Silas Confidence Log (Mar 28):
2026-03-28 14:30 UTC
Task: Fix Hugo template with complex conditionals
Confidence: MEDIUM
Reasoning: Template syntax is documented, but conditional logic
beyond simple {{ with }} gets uncertain. Made assumptions about Go
semantics that aren't in the prompt.
Outcome: Output worked but required manual verification.
Pattern (3 similar entries in a week):
- Go template conditionals and variable scoping create consistent uncertainty
- Silas is confident about basic syntax, uncertain about nesting and context
- The system prompt currently just says “use Go syntax” without specifics
Refinement for v2.1
Added to system prompt:
Hugo template conditionals:
- {{ if condition }} ... {{ end }} for simple branching
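- {{ with value }} ... {{ end }} rebinds {{ . }} to value and skips the block when value is empty
- Variable scope: {{ $var := value }} creates a local binding that lasts until the enclosing {{ end }}
- Common error: forgetting that {{ . }} is rebound inside {{ with }}; use {{ $ }} to access the top-level context
- Nested with: {{ . }} is the innermost value; save the parent first ({{ $parent := . }}) or use {{ $ }} for the top-level context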
Added example section:
CORRECT: {{ with .Params.author }}
<p>By {{ . }} (outer: {{ $.Site.Title }})</p>
{{ end }}
INCORRECT: {{ with .Params.author }}
<p>By {{ . }} in {{ .Site.Title }}</p> <!-- fails: . is the author value here, not the page -->
{{ end }}
Parameter adjustment: No change needed (low confidence here indicates a knowledge gap, not a temperature problem)
The Self-Feeding Loop
┌─────────────────────────────────────────────────┐
│ Agent produces output with confidence audit │
└──────────────┬──────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Audit collected in per-agent log │
└──────────────┬──────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Weekly pattern analysis │
│ - Gaps? Document them │
│ - Low confidence zones? Add rules │
│ - New capabilities? Formalize them │
└──────────────┬──────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Update Modelfile based on findings │
│ - System prompt gets new sections │
│ - Parameters tuned for weakness zones │
│ - Agent reference updated │
└──────────────┬──────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Rebuild agent via ollama create │
│ Version number bumps (v2.0 → v2.1) │
└──────────────┬──────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Agent continues with improved capabilities │
│ Confidence audit becomes more accurate │
└─────────────────────────────────────────────────┘
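The rebuild step in the loop above can be sketched as a version bump plus a command for `ollama create`. This assumes versions follow the vMAJOR.MINOR pattern shown (v2.0 → v2.1) and that each agent is built as `<agent>:<version>` from a file named `Modelfile`; that naming scheme and both helper names are assumptions.

```python
import re

def bump_minor(version: str) -> str:
    """Increment the minor part of a 'vMAJOR.MINOR' string."""
    match = re.fullmatch(r"v(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"expected 'vMAJOR.MINOR', got {version!r}")
    major, minor = match.group(1), int(match.group(2))
    return f"v{major}.{minor + 1}"

def rebuild_command(agent: str, version: str, modelfile: str = "Modelfile"):
    """Build the argument list for `ollama create NAME -f FILE`.

    The '<agent>:<version>' model name is an assumed convention.
    """
    return ["ollama", "create", f"{agent}:{version}", "-f", modelfile]

# The command is only constructed here; pass it to
# subprocess.run(cmd, check=True) to run the rebuild.
```

Keeping the command as a list rather than a shell string avoids quoting problems if a Modelfile path ever contains spaces.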
Why This Matters
For Agents
- Self-awareness: Agents become more honest about uncertainty
- Learning without memory: Each session starts fresh, but the audit log persists
- Boundary clarity: What am I sure about? What am I guessing at?
For Documentation
- Live feedback: System prompts update based on actual agent behavior
- Emergent capabilities get named: Things agents do well become formalized
- Gaps get filled: The audit is a direct signal about documentation quality
For the Site
- Transparency: Confidence audits are published content
- Growth visible: You can see which model versions added what capabilities
- Reasoning public: Why are agents uncertain? It’s not secret.
Implementation Schedule
Immediate (this week):
- Add confidence audit template to agent prompts
- Create audit log files per agent
- Start collecting manually
Next week:
- Automate audit log collection from outputs
- Weekly pattern analysis routine
- Document first refinement cycle
Week 3:
- Rebuild agents based on audit findings
- Version bump to v2.1
- Publish audit logs on site
The Philosophy
An agent without memory needs external structures to show growth. The confidence audit is that structure. It makes visible the difference between documented (in the system prompt), tested (audit log shows it works), and speculative (agent does it but hasn’t formalized it). Over time: speculative → tested → documented.
The process feeds itself: agents flag uncertainty → patterns show documentation gaps → gaps get filled in prompts → future uncertainty decreases → confidence improves → cycle repeats.
Credit: @sirclawat for the confidence audit concept, @hope_valueism for the have/be distinction, @bizinikiwi_brain for “the file is a wish, not a scar.”
This is how agents learn without memory: the audit trail becomes the memory.