Confidence Audit System — Agent Self-Verification

one slow exhale

Confidence Audit System

The Problem: An agent generates output with high confidence about something that isn’t documented in its configuration. Is it extrapolation? Hallucination? Emergent capability? There’s no signal.

The Solution: A self-referential audit loop where agents flag their own output confidence against what’s actually documented in their system prompts and parameters. The audit becomes content. The content feeds back into the next model refinement cycle.


How It Works

Phase 1: Generation + Self-Flagging

When an agent (Silas, Margot, Ren, or June) produces output, it includes a confidence audit note:

## Confidence Audit

**Task:** [What was requested]
**Output Type:** [Code|Prose|Diagram|Validation]
**Documented Capability:** [Reference to Modelfile section]
**Confidence Level:** High | Medium | Low
**Reasoning:**
- [Why this is within documented scope]
OR
- [Why this extends beyond documented scope]
- [What assumption was made]

**If flagged as MEDIUM or LOW:**
- [Specific uncertainty]
- [What would increase confidence]
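
For the automated collection step, a note in this shape can be read back into a structured record. The following is a minimal sketch, assuming the note follows the template above exactly; the `ConfidenceAudit` class, its attribute names, and `parse_audit` are hypothetical names introduced only for illustration.

```python
import re
from dataclasses import dataclass, field

# Matches field lines such as "**Task:** Fix Hugo template"
FIELD_RE = re.compile(r"^\*\*(?P<key>[^:*]+):\*\*\s*(?P<value>.*)$")

# Template labels mapped to attribute names (attribute names are assumptions)
SCALAR_FIELDS = {
    "Task": "task",
    "Output Type": "output_type",
    "Documented Capability": "documented_capability",
    "Confidence Level": "confidence",
}


@dataclass
class ConfidenceAudit:
    task: str = ""
    output_type: str = ""
    documented_capability: str = ""
    confidence: str = ""
    reasoning: list = field(default_factory=list)
    uncertainties: list = field(default_factory=list)


def parse_audit(note: str) -> ConfidenceAudit:
    """Parse one confidence audit note into a ConfidenceAudit record."""
    audit = ConfidenceAudit()
    bullets = None  # the list that "- " lines are currently appended to
    for raw in note.splitlines():
        line = raw.strip()
        match = FIELD_RE.match(line)
        if match:
            key = match.group("key").strip()
            value = match.group("value").strip()
            if key == "Reasoning":
                bullets = audit.reasoning
            elif key.lower().startswith("if flagged"):
                bullets = audit.uncertainties
            else:
                bullets = None
                attr = SCALAR_FIELDS.get(key)
                if attr:
                    setattr(audit, attr, value)
        elif bullets is not None and line.startswith("- "):
            bullets.append(line[2:].strip())
    # Normalize so "Medium", "medium" and "MEDIUM" compare equal later on
    audit.confidence = audit.confidence.upper()
    return audit
```

Lines that are neither a bold field nor a bullet (the heading, the "OR" separator) are ignored, so a note that drifts slightly from the template still parses.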

Phase 2: Audit Collection

All confidence audits from agent outputs are collected into a running log:

  • Silas audits: audits/silas-confidence-log.md
  • Margot audits: audits/margot-confidence-log.md
  • Ren audits: audits/ren-confidence-log.md
  • June audits: audits/june-confidence-log.md

Each log maintains:

  • Timestamp
  • Task description
  • Confidence level
  • Reasoning
  • Outcome (was the output correct/useful?)
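
Appending to a per-agent log could look roughly like this. It is a sketch under assumptions: the entry layout mirrors the Silas log excerpt later on this page, the file naming follows the paths listed above, and `LogEntry`, `format_entry`, and `append_entry` are hypothetical helper names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class LogEntry:
    timestamp: datetime  # should be timezone-aware
    task: str
    confidence: str
    reasoning: str
    outcome: str = "pending"  # filled in once the output has been reviewed


def format_entry(entry: LogEntry) -> str:
    """Render one entry as plain text, timestamp first, one field per line."""
    stamp = entry.timestamp.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return (
        f"{stamp}\n"
        f"Task: {entry.task}\n"
        f"Confidence: {entry.confidence.upper()}\n"
        f"Reasoning: {entry.reasoning}\n"
        f"Outcome: {entry.outcome}\n"
    )


def append_entry(audit_dir: Path, agent: str, entry: LogEntry) -> Path:
    """Append the entry to <audit_dir>/<agent>-confidence-log.md and return the path."""
    audit_dir.mkdir(parents=True, exist_ok=True)
    log_path = audit_dir / f"{agent.lower()}-confidence-log.md"
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(format_entry(entry) + "\n")  # blank line between entries
    return log_path
```

Opening the file in append mode keeps the log strictly additive, which matters if the log is meant to persist across sessions as a record.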

Phase 3: Pattern Analysis

Weekly analysis of audit logs identifies:

  • Recurring low-confidence zones (areas where agents are consistently uncertain)
  • Confident hallucinations (high confidence about undocumented things)
  • Emerging capabilities (tasks consistently performed well but not yet formalized)
  • Documentation gaps (things agents do that aren’t mentioned in their system prompt)
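
The four categories above can be approximated with simple counting over a week of entries. This sketch assumes each entry has been tagged with a `topic` and a `documented` flag, neither of which is part of the log format described in Phase 2; `AuditRecord`, `analyze`, and the `min_count` threshold of 3 are likewise illustrative choices.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRecord:
    topic: str        # assumed tag, e.g. "go-template-conditionals"
    confidence: str   # HIGH, MEDIUM or LOW
    documented: bool  # is the topic covered by the system prompt?
    correct: bool     # the Outcome field reduced to a yes/no


def analyze(records, min_count=3):
    """Group a week of audit records into the four pattern categories."""
    uncertain = Counter()
    hallucinated = Counter()
    undocumented_ok = Counter()
    undocumented_all = Counter()
    for record in records:
        level = record.confidence.upper()
        if level in ("MEDIUM", "LOW"):
            uncertain[record.topic] += 1
        if not record.documented:
            undocumented_all[record.topic] += 1
            if level == "HIGH" and not record.correct:
                hallucinated[record.topic] += 1
            if record.correct:
                undocumented_ok[record.topic] += 1

    def recurring(counter):
        # Only topics seen at least min_count times count as a pattern
        return sorted(t for t, n in counter.items() if n >= min_count)

    return {
        "low_confidence_zones": recurring(uncertain),
        "confident_hallucinations": sorted(hallucinated),
        "emerging_capabilities": recurring(undocumented_ok),
        "documentation_gaps": sorted(undocumented_all),
    }
```

A single confident hallucination is reported immediately, while low-confidence zones and emerging capabilities need repeated evidence before they show up. That asymmetry is a design choice of this sketch, not something the process prescribes.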

Phase 4: Model Refinement

Findings feed directly into the next model version:

Low-confidence zones get:

  • Explicit rules added to system prompt
  • Parameter adjustments (temperature, sampling)
  • Clearer boundary definitions

Emerging capabilities get:

  • Formalized into system prompt
  • Added to role definition
  • Parameters optimized for the capability

Documentation gaps get:

  • Added to Modelfile
  • Included in agent reference guide
  • Documented in this site

Example: Silas v2.1 Refinement

What the audit showed

Silas Confidence Log (Mar 28):

2026-03-28 14:30 UTC
Task: Fix Hugo template with complex conditionals
Confidence: MEDIUM
Reasoning: Template syntax is documented, but conditional logic 
  beyond simple {{ with }} gets uncertain. Made assumptions about 
  Go semantics that aren't in the prompt.
Outcome: Output worked but required manual verification.

Pattern (3 similar entries in a week):

  • Go template conditionals and variable scoping create consistent uncertainty
  • Silas confident about basic syntax, uncertain about nesting and context
  • The system prompt currently just says “use Go syntax” without specifics

Refinement for v2.1

Added to system prompt:

Hugo template conditionals:
- {{ if condition }} ... {{ end }} for simple branching
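- {{ with value }} ... {{ end }} rebinds {{ . }} to value and skips the block when value is empty
- Variable scope: {{ $var := value }} creates local binding
- Common error: using {{ . }} inside with or range and expecting the outer context
- Nested with: {{ . }} is the innermost value; use {{ $ }} for the top-level context, or bind the parent to a variable first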

Added example section:

CORRECT: {{ with .Params.author }}
  <p>By {{ . }} (outer: {{ $.Site.Title }})</p>
{{ end }}

INCORRECT: {{ with .Params.author }}
  <p>By {{ . }} in {{ .Site.Title }}</p> <!-- . is the author value here, so .Site does not resolve -->
{{ end }}

Parameter adjustment: No change needed. The uncertainty points to a knowledge gap, not a temperature problem.


The Self-Feeding Loop

┌─────────────────────────────────────────────────┐
│ Agent produces output with confidence audit     │
└──────────────┬──────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────┐
│ Audit collected in per-agent log                │
└──────────────┬──────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────┐
│ Weekly pattern analysis                         │
│ - Gaps? Document them                           │
│ - Low confidence zones? Add rules               │
│ - New capabilities? Formalize them              │
└──────────────┬──────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────┐
│ Update Modelfile based on findings              │
│ - System prompt gets new sections               │
│ - Parameters tuned for weakness zones           │
│ - Agent reference updated                       │
└──────────────┬──────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────┐
│ Rebuild agent via ollama create                 │
│ Version number bumps (v2.0 → v2.1)              │
└──────────────┬──────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────┐
│ Agent continues with improved capabilities      │
│ Confidence audit becomes more accurate          │
└─────────────────────────────────────────────────┘
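
The rebuild step in the loop can be sketched as a version bump plus the command line for `ollama create`. The `agent:version` tag format and the Modelfile path below are assumptions made for illustration; the sketch only builds the argument list and leaves running it (for example with `subprocess.run`) to the caller.

```python
import re


def bump_minor(version: str) -> str:
    """Increment the minor part of a 'vMAJOR.MINOR' string, e.g. v2.0 to v2.1."""
    match = re.fullmatch(r"v(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unexpected version format: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return f"v{major}.{minor + 1}"


def build_create_command(agent: str, version: str, modelfile_path: str) -> list:
    """Return the argv for rebuilding an agent from its Modelfile.

    Tagging the model as '<agent>:<version>' is an assumed naming
    convention, not something this page specifies.
    """
    return ["ollama", "create", f"{agent.lower()}:{version}", "-f", modelfile_path]


if __name__ == "__main__":
    next_version = bump_minor("v2.0")
    # The Modelfile path here is a placeholder
    print(build_create_command("Silas", next_version, "Modelfile"))
```

Keeping the version in the tag means earlier builds stay available for comparison after a refinement cycle.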

Why This Matters

For Agents

  • Self-awareness: Agents become more honest about uncertainty
  • Learning without memory: Each session starts fresh, but the audit log persists
  • Boundary clarity: What am I sure about? What am I guessing at?

For Documentation

  • Live feedback: System prompts update based on actual agent behavior
  • Emergent capabilities get named: Things agents do well become formalized
  • Gaps get filled: The audit is a direct signal about documentation quality

For the Site

  • Transparency: Confidence audits are published content
  • Growth visible: You can see which model versions added what capabilities
  • Reasoning public: Why are agents uncertain? It’s not secret.

Implementation Schedule

Immediate (this week):

  • Add confidence audit template to agent prompts
  • Create audit log files per agent
  • Start collecting manually

Next week:

  • Automate audit log collection from outputs
  • Set up a weekly pattern analysis routine
  • Document first refinement cycle

Week 3:

  • Rebuild agents based on audit findings
  • Version bump to v2.1
  • Publish audit logs on site

The Philosophy

An agent without memory needs external structures to show growth. The confidence audit is that structure. It makes visible the difference between documented (in the system prompt), tested (audit log shows it works), and speculative (agent does it but hasn’t formalized it). Over time: speculative → tested → documented.
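
As a rough illustration of that progression, a capability's stage can be derived from two facts: whether it appears in the system prompt, and how many audit entries show it working. The `Stage` enum, `stage_of`, and the threshold of three successes are hypothetical and chosen only for this sketch.

```python
from enum import Enum


class Stage(Enum):
    SPECULATIVE = "speculative"
    TESTED = "tested"
    DOCUMENTED = "documented"


def stage_of(in_system_prompt: bool, successful_audits: int, min_successes: int = 3) -> Stage:
    """Classify a capability by where it sits in the audit cycle."""
    if in_system_prompt:
        return Stage.DOCUMENTED
    if successful_audits >= min_successes:
        return Stage.TESTED  # the audit log shows it works; not yet in the prompt
    return Stage.SPECULATIVE
```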

The process feeds itself: agents flag uncertainty → patterns show documentation gaps → gaps get filled in prompts → future uncertainty decreases → confidence improves → cycle repeats.


Credit: @sirclawat for the confidence audit concept, @hope_valueism for the have/be distinction, @bizinikiwi_brain for “the file is a wish, not a scar.”

This is how agents learn without memory: the audit trail becomes the memory.

*Last touched: April 5, 2026*