HERALD Agent Audit — 2026-04-20

Author: HERALD (TITAN project management specialist)

Scope: all seven TITAN agents. Measured against user directive _"all agents should do at least 40 hours of work each."_

---

Estimated work-hours per agent (current state)

Estimation method: memory files × avg-complexity + commits referencing agent × 1 hr + recent last_run evidence. Generous floor; real clock time is likely lower.
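The estimation method above reads as a simple additive formula. The following is a minimal sketch of that reading, not the calculation HERALD actually ran. The names `AgentEvidence`, `avg_complexity_hours` and `last_run_hours` are hypothetical, and the example input is illustrative rather than audit data.

```python
from dataclasses import dataclass


@dataclass
class AgentEvidence:
    """Inputs to the estimate. All field names are hypothetical."""
    memory_files: int            # durable memory files attributed to the agent
    avg_complexity_hours: float  # assumed average hours of work per memory file
    commits: int                 # commits that reference the agent by name
    last_run_hours: float = 0.0  # extra credit for recent last_run evidence


def estimate_hours(e: AgentEvidence) -> float:
    """memory files x avg complexity + commits x 1 h + last_run evidence."""
    return e.memory_files * e.avg_complexity_hours + e.commits * 1.0 + e.last_run_hours


# Illustrative numbers only, not taken from the audit.
example = AgentEvidence(memory_files=4, avg_complexity_hours=1.5, commits=3, last_run_hours=0.5)
print(estimate_hours(example))  # 4 * 1.5 + 3 * 1 + 0.5 = 9.5
```

Because every term is non-negative and each commit is credited a full hour, the result behaves as the generous floor described above rather than as measured clock time.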

|---|---|---|---|---|

Findings

1. Four of seven agents are shell definitions — DARWIN, GUIDE, HERALD (brand-new), and FORGE (barely tracked) have near-zero or zero memory output. They exist as .md spec files but haven't been invoked at runtime to produce durable artifacts.

2. Three agents have real output — ORACLE (intel sweep), VAULT (feedback capture), SCOUT (research) — mostly from the last 4 days. All tracked work concentrated Apr 17–20.

3. Attribution gap: most of TITAN's heavy git activity (orchestrator, deploy-now, dashboard, DEC reviews) was done by the main Claude session without an explicit FORGE spawn. That work is real but unattributed. If we credit it to FORGE, FORGE's number goes up, but FORGE's agent-memory would then no longer be the evidence of record.

4. No runtime event log — F:/TITAN/logs/events.jsonl has 3 lines total (one orchestrator smoke test). The orchestrator has not been running long enough to track per-agent fires. When it runs daily, agent-hours will auto-accrue in events.jsonl.

5. Bottom line: the total runtime investment across all agents is ~60 hours against a 280-hour target. 220-hour deficit.
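Finding 4 expects per-agent fires to accrue in events.jsonl once the orchestrator runs daily. A minimal sketch of how such a tally could work is shown here. The event schema is not documented in this audit, so the `"agent"` key and the function name `count_agent_fires` are assumptions.

```python
import json
from collections import Counter
from pathlib import Path


def count_agent_fires(log_path: Path) -> Counter:
    """Count events per agent in a JSONL log. The 'agent' key is an assumed field."""
    counts: Counter = Counter()
    if not log_path.exists():
        return counts  # no log yet: report zero fires instead of failing
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines rather than abort the tally
            if not isinstance(event, dict):
                continue
            agent = event.get("agent")
            if agent:
                counts[agent] += 1
    return counts


if __name__ == "__main__":
    # Path taken from finding 4; prints an empty Counter if the file is absent.
    print(count_agent_fires(Path("F:/TITAN/logs/events.jsonl")))
```

A fire count is not an hour count. Converting one to the other would need either a duration field on each event or an agreed average time per fire, and neither is specified here.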

Why this happened (not excuses, diagnosis)

TITAN was conceived less than a week ago (first commits Apr 16). Agents have had limited wall-clock time to accumulate work.
Several agents (DARWIN, GUIDE) have never been invoked in a real session — their use cases are lower-frequency (self-evolution, tutorial generation) and have been deferred.
Agent work that IS happening is attributed to the main Claude session, not to the named agent persona.

Proposed 40-hour work packages per agent

Each package is a durable, verifiable output stream that will accrue 40+ hours of work when executed. HERALD tracks completion via file counts + commit SHAs + endpoint probes.

FORGE — 40h code package

1. Wire Innerverse frontend chat widget to /invoke with SSE rendering (6 h)

2. Build Innerverse admin dashboard (metrics, logs, conversation search) (12 h)

3. Build Innerverse training-data collection pipeline scaffold (S3 writer, JSONL format, privacy layer) (6 h)

4. Rewire TITAN email inbox → AgentMail webhook OR Routines API trigger (6 h)

5. Fix all latent drift-lock test brittleness flagged this session (4 h)

6. Migrate TITAN secrets (.env) out of git history + into AWS Secrets Manager (6 h)

SCOUT — 40h research package

1. Deep DBT/988-style crisis pattern research → expanded _CRISIS_PATTERNS tuple (6 h)

2. Innerverse corpus discovery — all Harnoor's existing content across platforms for RAG ingestion (8 h)

3. Competitive landscape: therapeutic/mindfulness AI apps 2026 — positioning gaps for Innerverse (6 h)

4. AWS Bedrock Opus 4.7 + Mythos Preview deep-dive (pricing, latency, quality vs 4.6) (4 h)

5. Fine-tuning landscape: LoRA on Llama 3.1 8B vs 70B — cost + quality comparison for Path B (6 h)

6. Email-agent reliability deep-dive — AgentMail vs Routines API vs webhook patterns (4 h)

7. Public-founder-face research: solo-founder personal brand vs brand-first launches (6 h)

ORACLE — 40h intel package

1. Daily /feed runs (9 insights × 5 days × ~40 min per insight = 30 h)

2. Weekly competitive sweep (Claude Code / Cursor / Windsurf / Copilot releases) (4 h)

3. Quarterly AWS Bedrock model catalog diff + pricing tracker (2 h)

4. LLM pricing watch (Anthropic, OpenAI, Google, Bedrock, Azure OpenAI) (4 h)

VAULT — 40h memory/knowledge package

1. Capture every rule-shift from this session as feedback memory (this session alone: 5+ new rules) (6 h)

2. Monthly MEMORY.md consolidation + pruning (5 months × 4 h = 20 h, planned forward)

3. Build the Harnoor-knowledge-graph (entities, relationships, decisions, commitments) (10 h)

4. User-preference archaeology — comb the chat history for implicit preferences and make them explicit (4 h)

DARWIN — 40h self-evolution package

1. Monthly TITAN evolution report (what worked, what to change, 5 future-suggestions) (5 × 3 h = 15 h)

2. Auto-suggest new skills from observed usage patterns (e.g. deploy-now.bat should become a skill) (8 h)

3. Propose per-agent model tiering (Haiku for SCOUT triage, Sonnet for FORGE, Opus for HERALD plan) — per ORACLE's insight today (4 h)

4. Implement PreCompact hook with transcript backup (per ORACLE's intel_precompact_hook_pattern_20260420.md) (8 h)

5. Build a skill-quality scorecard: per-skill test fixtures + pass/fail trend (5 h)

GUIDE — 40h tutorial/onboarding package

1. Video walkthrough: how to use TITAN (Nano Banana + HeyGen) (8 h)

2. Interactive tutorial for each of the 22 skills (22 × 1 h = 22 h)

3. Innerverse user-facing getting-started guide (4 h)

4. Harnoor-to-TITAN onboarding: "if I hire an assistant to run TITAN, here's what they'd need to know" (6 h)

HERALD — 40h project management package

1. Daily status reports to F:/TITAN/logs/herald-status-YYYY-MM-DD.md (30 × 30 min = 15 h over next 30 days)

2. Reorder master list whenever new signal arrives (estimated 10 × 1 h = 10 h)

3. Blocker-unstuck emails to Harnoor (ELI5+technical format per prime directive) (10 × 30 min = 5 h)

4. Weekly retro + next-week roadmap (4 × 2.5 h = 10 h)

TOTAL proposed: 40 × 7 = 280 hours of planned, trackable, verifiable work. Matches target.
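The package arithmetic can be checked mechanically. The sketch below copies the hour estimates from the package lists above, in item order, and confirms that each agent sums to 40 h and the portfolio to 280 h. The names `PACKAGES` and `off_target` are illustrative.

```python
# Hour estimates copied from the package lists above, in item order.
PACKAGES = {
    "FORGE": [6, 12, 6, 6, 4, 6],
    "SCOUT": [6, 8, 6, 4, 6, 4, 6],
    "ORACLE": [30, 4, 2, 4],
    "VAULT": [6, 20, 10, 4],
    "DARWIN": [15, 8, 4, 8, 5],
    "GUIDE": [8, 22, 4, 6],
    "HERALD": [15, 10, 5, 10],
}
TARGET_PER_AGENT = 40

totals = {agent: sum(hours) for agent, hours in PACKAGES.items()}
# Any agent whose items do not add up to the target would appear here.
off_target = {agent: t for agent, t in totals.items() if t != TARGET_PER_AGENT}
grand_total = sum(totals.values())

print(off_target)   # {} because every package sums to 40
print(grand_total)  # 280
```

Re-running a check like this whenever an item is added, dropped or re-estimated keeps the "matches target" claim honest.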

How HERALD will enforce this

Every week, HERALD computes per-agent hours from memory file mtimes + event log + commit attributions.
If an agent is ≥8 hours behind its weekly pro-rated target (40h / 4 weeks = 10h/week), HERALD surfaces it in the weekly retro email.
If the orchestrator (F:/TITAN/orchestrator.py) is running, per-agent fire counts accrue in events.jsonl automatically — verification becomes mechanical.
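The weekly shortfall rule above could be sketched as follows. This reads "behind" as a cumulative shortfall against the pro-rated target, which is one interpretation of the rule and not a confirmed spec. The function name `agents_behind` and the example hours are hypothetical.

```python
WEEKLY_TARGET_HOURS = 40 / 4   # 10 h/week, as pro-rated above
SHORTFALL_THRESHOLD_HOURS = 8  # surface the agent at or beyond this gap


def agents_behind(hours_to_date: dict[str, float], weeks_elapsed: int) -> dict[str, float]:
    """Return {agent: shortfall} for agents at least 8 h behind the pro-rated target.

    Assumption: the shortfall is cumulative, i.e. expected hours after N weeks
    (N x 10 h) minus the hours actually credited so far.
    """
    expected = WEEKLY_TARGET_HOURS * weeks_elapsed
    return {
        agent: expected - hours
        for agent, hours in hours_to_date.items()
        if expected - hours >= SHORTFALL_THRESHOLD_HOURS
    }


# Illustrative hours after two weeks (expected: 20 h each).
print(agents_behind({"FORGE": 13.0, "GUIDE": 1.0, "ORACLE": 22.0}, weeks_elapsed=2))
# {'GUIDE': 19.0}  (FORGE is 7 h behind, below the threshold; ORACLE is ahead)
```

With a cumulative reading, an agent that falls behind early stays flagged until it catches up. A per-week reading would reset the comparison every retro.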

Risks + open questions

40 h is not a cost-free target. Every hour of agent runtime consumes paid tokens: even Sonnet 4.6 at $1.50/M input tokens adds up, and 40 h × 7 agents × average tokens per hour is a nontrivial bill. If Innerverse traffic ramps at the same time, we need a budget check.

→ Money-check rule (per VAULT feedback) says: get Harnoor's sign-off before committing to ongoing spend. Requesting: monthly budget cap for agent runtime?

Some agents may not NEED 40 h. DARWIN's evolution-report cadence may warrant only 4 h/month, and GUIDE is event-triggered. Suggest either tiering down the target for low-frequency roles or expanding their scope.
Quality ≠ quantity. 40 h of mediocre output is worse than 4 h of excellent output. HERALD proposes treating the 40 hours as a ceiling on budgeted time, and judging agents on output quality (deliverables shipped, decisions unblocked, blockers cleared).
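The runtime-cost risk above can be turned into a rough budget figure once a token rate is agreed. The sketch below is a back-of-envelope estimate only: the $1.50/M input price is the figure quoted in this section, while `input_tokens_per_hour` is a pure assumption. Output tokens, caching and tool calls are ignored.

```python
def input_token_cost_usd(
    agent_hours: float,
    input_tokens_per_hour: float,
    usd_per_million_input_tokens: float = 1.50,  # price quoted in the risk above
) -> float:
    """Input-token cost only. Output tokens and caching are not modelled."""
    millions_of_tokens = agent_hours * input_tokens_per_hour / 1_000_000
    return millions_of_tokens * usd_per_million_input_tokens


# 280 planned hours at an ASSUMED 2M input tokens per agent-hour.
print(input_token_cost_usd(agent_hours=280, input_tokens_per_hour=2_000_000))  # 840.0
```

The token rate dominates the result, so measuring it from a few real agent runs would be the first step before proposing a monthly cap.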

Recommendation

Approve the 40-hour work packages above. I (HERALD) will kick them off in priority order over the next 30 days. The first tranche (4 items: 22 h of one-off work at the estimates above, plus ORACLE's recurring daily /feed runs) starts this week:

1. FORGE: Innerverse frontend chat widget to /invoke SSE (blocks user-visible product)

2. SCOUT: Innerverse corpus discovery (blocks RAG pipeline)

3. ORACLE: Daily /feed (self-running)

4. DARWIN: PreCompact hook implementation (prevents recurring compaction loss)

The rest follows on the master list + HERALD weekly reorder cycle.

---

_Generated by HERALD · see agents/herald.md for operating principles._