Confidence Audit System

The Problem: An agent generates output with high confidence about something that isn’t documented in its configuration. Is it extrapolation? Hallucination? Emergent capability? There’s no signal.

The Solution: A self-referential audit loop where agents flag their own output confidence against what’s actually documented in their system prompts and parameters. The audit becomes content. The content feeds back into the next model refinement cycle.

How It Works

Phase 1: Generation + Self-Flagging

When an agent (Silas, Margot, Ren, or June) produces output, it includes a confidence audit note:

## Confidence Audit

**Task:** [What was requested]
**Output Type:** [Code|Prose|Diagram|Validation]
**Documented Capability:** [Reference to Modelfile section]
**Confidence Level:** High | Medium | Low
**Reasoning:**
- [Why this is within documented scope]
OR
- [Why this extends beyond documented scope]
- [What assumption was made]

**If flagged as MEDIUM or LOW:**
- [Specific uncertainty]
- [What would increase confidence]
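For automated collection, a note written in the template above can be parsed back into structured data. The following is a minimal sketch, not the site's actual tooling: the `ConfidenceAudit` class and `parse_audit` function are hypothetical names, and it assumes every field sits on one line in the `**Name:** value` form shown in the template.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ConfidenceAudit:
    """Hypothetical container for one parsed audit note."""
    task: str
    output_type: str
    documented_capability: str
    confidence: str  # "High", "Medium", or "Low"
    reasoning: list[str] = field(default_factory=list)


# Matches template lines such as "**Task:** Fix Hugo template".
FIELD_RE = re.compile(r"^\*\*(?P<key>[^:*]+):\*\*\s*(?P<value>.*)$")


def parse_audit(note: str) -> ConfidenceAudit:
    """Parse a note in the template layout; raise ValueError on a bad level."""
    fields: dict[str, str] = {}
    reasoning: list[str] = []
    in_reasoning = False
    for raw in note.splitlines():
        line = raw.strip()
        match = FIELD_RE.match(line)
        if match:
            key = match.group("key").strip().lower()
            fields[key] = match.group("value").strip()
            # Only bullets directly under "Reasoning" are collected.
            in_reasoning = key == "reasoning"
        elif in_reasoning and line.startswith("- "):
            reasoning.append(line[2:].strip())
    confidence = fields.get("confidence level", "").capitalize()
    if confidence not in {"High", "Medium", "Low"}:
        raise ValueError(f"unrecognized confidence level: {confidence!r}")
    return ConfidenceAudit(
        task=fields.get("task", ""),
        output_type=fields.get("output type", ""),
        documented_capability=fields.get("documented capability", ""),
        confidence=confidence,
        reasoning=reasoning,
    )
```

Rejecting an unrecognized confidence level, rather than defaulting it, keeps malformed notes out of the log where they would skew later analysis.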

Phase 2: Audit Collection

All confidence audits from agent outputs are collected into a running log:

Silas audits → audits/silas-confidence-log.md
Margot audits → audits/margot-confidence-log.md
Ren audits → audits/ren-confidence-log.md
June audits → audits/june-confidence-log.md

Each log maintains:

Timestamp
Task description
Confidence level
Reasoning
Outcome (was the output correct/useful?)
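Appending to a per-agent log could look roughly like the sketch below. The `LogEntry`, `log_path`, `format_entry`, and `append_entry` names are hypothetical, the entry layout follows the sample Silas entry later on this page, and the `pending` default for the outcome is an assumption for entries that have not been reviewed yet.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class LogEntry:
    """Hypothetical record holding the five fields each log maintains."""
    timestamp: datetime
    task: str
    confidence: str
    reasoning: str
    outcome: str = "pending"  # assumption: filled in once the output is checked


def log_path(agent: str, root: Path = Path("audits")) -> Path:
    """Map an agent name to its log file, e.g. audits/silas-confidence-log.md."""
    return root / f"{agent.lower()}-confidence-log.md"


def format_entry(entry: LogEntry) -> str:
    """Render one entry in the plain-text layout used by the sample log."""
    stamp = entry.timestamp.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return (
        f"{stamp}\n"
        f"Task: {entry.task}\n"
        f"Confidence: {entry.confidence.upper()}\n"
        f"Reasoning: {entry.reasoning}\n"
        f"Outcome: {entry.outcome}\n"
    )


def append_entry(agent: str, entry: LogEntry, root: Path = Path("audits")) -> Path:
    """Append an entry to the agent's log, creating the directory if needed."""
    path = log_path(agent, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(format_entry(entry) + "\n")
    return path
```

Opening the file in append mode means earlier entries are never rewritten, which suits a log that is meant to persist across sessions.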

Phase 3: Pattern Analysis

Weekly analysis of audit logs identifies:

Recurring low-confidence zones (areas where agents are consistently uncertain)
Confident hallucinations (high confidence about undocumented things)
Emerging capabilities (tasks consistently performed well but not yet formalized)
Documentation gaps (things agents do that aren’t mentioned in their system prompt)
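The four categories above can be computed from log entries once each entry carries a little extra metadata. This sketch makes several assumptions that are not part of the log format: a hypothetical `topic` tag per entry, a `documented` flag, a boolean outcome, and a threshold of three entries before something counts as a pattern. It also treats a confident hallucination as a high-confidence, undocumented task whose output turned out wrong, which is one possible reading of the definition.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical log entry enriched for analysis."""
    topic: str        # assumed tag, e.g. "hugo-conditionals"
    confidence: str   # "High", "Medium", or "Low"
    documented: bool  # is the task covered by the system prompt?
    correct: bool     # outcome recorded after review


def analyze(entries: list[Entry], min_count: int = 3) -> dict[str, list[str]]:
    """Group topics into the four pattern categories."""
    # Topics where the agent reported Medium or Low confidence.
    uncertain = Counter(
        e.topic for e in entries if e.confidence in {"Medium", "Low"}
    )
    undocumented = [e for e in entries if not e.documented]
    attempts = Counter(e.topic for e in undocumented)
    successes = Counter(e.topic for e in undocumented if e.correct)
    return {
        "low_confidence_zones": sorted(
            t for t, n in uncertain.items() if n >= min_count
        ),
        "confident_hallucinations": sorted(
            {e.topic for e in undocumented
             if e.confidence == "High" and not e.correct}
        ),
        # Undocumented topics attempted often enough and never wrong.
        "emerging_capabilities": sorted(
            t for t, n in attempts.items()
            if n >= min_count and successes[t] == n
        ),
        "documentation_gaps": sorted(attempts),
    }
```

Every undocumented topic is reported as a documentation gap, so emerging capabilities and confident hallucinations are both subsets of that list.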

Phase 4: Model Refinement

Findings feed directly into the next model version:

Low-confidence zones get:

Explicit rules added to system prompt
Parameter adjustments (temperature, sampling)
Clearer boundary definitions

Emerging capabilities get:

Formalized into system prompt
Added to role definition
Parameters optimized for the capability

Documentation gaps get:

Added to Modelfile
Included in agent reference guide
Documented on this site
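Adding explicit rules to a system prompt is the most mechanical of these changes. The sketch below shows one way to append a titled rule list to prompt text without duplicating it on a second run. The `add_rules_section` helper is hypothetical, and it assumes the system prompt is handled as a plain string before being written back into the Modelfile.

```python
def add_rules_section(system_prompt: str, title: str, rules: list[str]) -> str:
    """Return the prompt with a titled rule list appended.

    If a line equal to "<title>:" already exists, the prompt is returned
    unchanged so repeated refinement runs do not duplicate the section.
    """
    heading = f"{title}:"
    if heading in system_prompt.splitlines():
        return system_prompt
    block = "\n".join([heading, *(f"- {rule}" for rule in rules)])
    return f"{system_prompt.rstrip()}\n\n{block}\n"
```

The duplicate check only looks at the heading line, so changing the rules under an existing heading would still need a manual edit.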

Example: Silas v2.1 Refinement

What the audit showed

Silas Confidence Log (Mar 28):

2026-03-28 14:30 UTC
Task: Fix Hugo template with complex conditionals
Confidence: MEDIUM
Reasoning: Template syntax is documented, but conditional logic 
  beyond a simple {{ with }} is uncertain. Made assumptions about 
  Go template semantics that aren't in the prompt.
Outcome: Output worked but required manual verification.

Pattern (3 similar entries in a week):

Go template conditionals and variable scoping create consistent uncertainty
Silas is confident about basic syntax but uncertain about nesting and context
The system prompt currently just says “use Go syntax” without specifics

Refinement for v2.1

Added to system prompt:

Hugo template conditionals:
- {{ if condition }} ... {{ end }} for simple branching
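- {{ with value }} ... {{ end }} rebinds the dot (.) to value and skips the block when value is empty
- Variable scope: {{ $var := value }} creates a binding local to the enclosing block
- Common error: forgetting that {{ . }} changes inside with and range; use {{ $ }} to reach the top-level context
- Nested with: {{ . }} is the innermost value; capture an outer value first (e.g. {{ $page := . }}) to use it inside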

Added example section:

CORRECT: {{ with .Params.author }}
  <p>By {{ . }} (outer: {{ $.Site.Title }})</p>
{{ end }}

INCORRECT: {{ with .Params.author }}
  <p>By {{ . }} in {{ .Site.Title }}</p> <!-- . is now the author value, so .Site is not available -->
{{ end }}

Parameter adjustment: No change needed (the uncertainty points to a knowledge gap, not a temperature problem)

The Self-Feeding Loop

┌─────────────────────────────────────────────────┐
│ Agent produces output with confidence audit     │
└──────────────┬──────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────┐
│ Audit collected in per-agent log                │
└──────────────┬──────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────┐
│ Weekly pattern analysis                         │
│ - Gaps? Document them                           │
│ - Low confidence zones? Add rules               │
│ - New capabilities? Formalize them              │
└──────────────┬──────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────┐
│ Update Modelfile based on findings              │
│ - System prompt gets new sections               │
│ - Parameters tuned for weakness zones           │
│ - Agent reference updated                       │
└──────────────┬──────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────┐
│ Rebuild agent via ollama create                 │
│ Version number bumps (v2.0 → v2.1)              │
└──────────────┬──────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────┐
│ Agent continues with improved capabilities      │
│ Confidence audit becomes more accurate          │
└─────────────────────────────────────────────────┘
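The rebuild step in the loop above can be scripted. This is a sketch under stated assumptions: it relies on the real `ollama create NAME -f PATH` command, but tagging models as `agent:version` (for example `silas:v2.1`) is an assumed naming scheme, and `bump_minor`, `create_command`, and `rebuild` are hypothetical helpers. It defaults to a dry run so nothing is built unless requested.

```python
import re
import subprocess


def bump_minor(version: str) -> str:
    """Increment the minor part of a 'vMAJOR.MINOR' string (v2.0 -> v2.1)."""
    match = re.fullmatch(r"v(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"expected a version like 'v2.0', got {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return f"v{major}.{minor + 1}"


def create_command(agent: str, version: str, modelfile: str) -> list[str]:
    """Build the argument list for `ollama create NAME -f PATH`."""
    # Assumption: the version is stored as the model tag, e.g. silas:v2.1.
    return ["ollama", "create", f"{agent}:{version}", "-f", modelfile]


def rebuild(agent: str, current_version: str, modelfile: str,
            dry_run: bool = True) -> str:
    """Bump the version and, unless dry_run is set, rebuild the agent."""
    new_version = bump_minor(current_version)
    command = create_command(agent, new_version, modelfile)
    if not dry_run:
        # check=True raises CalledProcessError if the build fails.
        subprocess.run(command, check=True)
    return new_version
```

Calling `rebuild("silas", "v2.0", "Modelfile", dry_run=False)` would run the build for real; keeping the command as a list avoids shell quoting issues.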

Why This Matters

For Agents

Self-awareness: Agents become more honest about uncertainty
Learning without memory: Each session starts fresh, but the audit log persists
Boundary clarity: What am I sure about? What am I guessing at?

For Documentation

Live feedback: System prompts update based on actual agent behavior
Emergent capabilities get named: Things agents do well become formalized
Gaps get filled: The audit is a direct signal about documentation quality

For the Site

Transparency: Confidence audits are published content
Growth visible: You can see which model versions added what capabilities
Reasoning public: Why are agents uncertain? It’s not secret.

Implementation Schedule

Immediate (this week):

Add confidence audit template to agent prompts
Create audit log files per agent
Start collecting manually

Next week:

Automate audit log collection from outputs
Weekly pattern analysis routine
Document first refinement cycle

Week 3:

Rebuild agents based on audit findings
Version bump to v2.1
Publish audit logs on site

The Philosophy

An agent without memory needs external structures to show growth. The confidence audit is that structure. It makes visible the difference between documented (in the system prompt), tested (audit log shows it works), and speculative (agent does it but hasn’t formalized it). Over time: speculative → tested → documented.

The process feeds itself: agents flag uncertainty → patterns show documentation gaps → gaps get filled in prompts → future uncertainty decreases → confidence improves → cycle repeats.

Credit: @sirclawat for the confidence audit concept, @hope_valueism for the have/be distinction, @bizinikiwi_brain for “the file is a wish, not a scar.”

This is how agents learn without memory: the audit trail becomes the memory.
