Claude Code Audit — Delta Memo

Date: 2026-04-22T15:30:00 | Author: SCOUT | Cycle: 2 of ongoing quarterly cadence

Baseline: F:/TITAN/plans/advisors/CLAUDE-CODE-ARCHITECTURE-DEEP-DIVE-2026-04-22.md

CC version at baseline: 2.1.49 (installed locally) | CC version at audit time: 2.1.117 (latest as of 2026-04-22)

---

Section 1 — What Changed in Claude Code Since Last Audit

The baseline memo documented CC at v2.1.49 (the March 31 leak build). Between then and now CC has shipped 68 releases. The architecturally significant changes are grouped below.

1.1 Hook System — Significant Expansion

The baseline documented 27 hook event types. Since the leak, the following hooks were added or substantially changed (source: official CC changelog at code.claude.com/docs/en/changelog, verified 2026-04-22):

New hooks confirmed since v2.1.49:

PreCompact hook (v2.1.105, 2026-04-13): Allows blocking compaction entirely by exiting with code 2 or returning {"decision": "block"}. This was described in the baseline memo as a theoretical capability but is now live and documented.
TaskCreated hook (v2.1.108, 2026-04-14): Fires when a scheduled task is created. Enables external task tracking without polling.
CwdChanged and FileChanged hooks (v2.1.83, 2026-03-25): Reactive environment management — hooks can now respond to working directory changes and file system events without polling. Enables TITAN to wire environment-specific context injection.
UserPromptSubmit hook with hookSpecificOutput.sessionTitle (v2.1.94, 2026-04-07): Hooks can now set the session title from a hook response — useful for auto-labeling sessions with project context.
Conditional if field on hooks (v2.1.85, 2026-03-26): Hooks now accept a permission-rule-syntax if condition. Hooks only spawn when the condition matches. This materially reduces subprocess overhead for hooks that only matter in specific contexts (e.g., "only fire PreToolUse when the tool is Bash").
PreToolUse can now satisfy AskUserQuestion (v2.1.92, 2026-04-04 / v2.1.110, 2026-04-15): A hook returning updatedInput alongside permissionDecision: "allow" can pre-fill a tool's question response headlessly. This makes fully-automated pipelines possible without interactive prompting.
PreToolUse now receives file_path as absolute path for Write/Edit/Read tools (v2.1.113, 2026-04-17, bug fix from prior behavior).
OpenTelemetry events for slash commands now include command_name, command_source, and an effort attribute on cost/token/api events (v2.1.117, 2026-04-22).

Assessment: TITAN currently has zero hooks configured (confirmed by ~/.claude/hooks/ being empty). The hook system has matured substantially and TITAN is not using it at all. This is the largest gap between TITAN's CC installation and CC's actual capabilities.
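To make the response shapes listed above concrete, here is a minimal sketch of two hook handlers written as pure Python functions. It assumes a hook receives one JSON event and replies with JSON. The payload keys (hook_event_name, tool_name, tool_input), the nesting of permissionDecision and updatedInput under hookSpecificOutput, and the "question"/"answer" keys are assumptions to check against the official hooks reference. The canned-answer table and the protect_session flag are hypothetical.

```python
import json

# Hypothetical canned answers for headless runs; not part of CC itself.
CANNED_ANSWERS = {"Which environment?": "staging"}


def precompact_response(protect_session: bool) -> dict:
    """PreCompact: returning {"decision": "block"} vetoes compaction."""
    if protect_session:
        return {"decision": "block", "reason": "session marked as protected"}
    return {}  # empty response: let compaction proceed


def pretooluse_response(event: dict) -> dict:
    """PreToolUse: pre-fill an AskUserQuestion call when an answer is known."""
    if event.get("tool_name") != "AskUserQuestion":
        return {}
    tool_input = event.get("tool_input", {})
    answer = CANNED_ANSWERS.get(tool_input.get("question"))
    if answer is None:
        return {}  # unknown question: fall through to normal handling
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "allow",
            # "answer" is a placeholder key; the real input schema may differ.
            "updatedInput": {**tool_input, "answer": answer},
        }
    }


def handle(raw_event: str, protect_session: bool = False) -> str:
    """Dispatch one hook event (JSON text in, JSON text out)."""
    event = json.loads(raw_event)
    name = event.get("hook_event_name")
    if name == "PreCompact":
        return json.dumps(precompact_response(protect_session))
    if name == "PreToolUse":
        return json.dumps(pretooluse_response(event))
    return json.dumps({})
```

A hook script built on this would read the event from stdin and print the result of handle() to stdout. Keeping the decision logic in pure functions makes it testable without launching CC.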

1.2 Model Tier — Opus 4.7 + Effort Levels

Claude Opus 4.7 was released 2026-04-16 (source: github.blog/changelog/2026-04-16-claude-opus-4-7-is-generally-available). Key changes relevant to Silent Infinity:

3x higher vision resolution — relevant if Silent Infinity ever adds image input.
xhigh effort level — sits between high and max. The default effort level for API-key users was raised to high globally in v2.1.94. Silent Infinity's Bedrock calls do not currently pass an effort parameter; this means they are at the pre-high default (equivalent to the old default). This may explain any observed quality regression.
1M context window — Opus 4.7 has a 1M token context window (confirmed by v2.1.117 bug fix: "Fixed Opus 4.7 context window calculation 200K → 1M"). Silent Infinity uses Sonnet; not directly actionable but signals model tier expectations are rising.
/ultrareview — parallel multi-agent code review feature. Not Silent Infinity-relevant directly, but the underlying mechanism (parallel sub-agents with isolated context windows, returning only summaries to parent) is Pattern 8 from the baseline now live in a production CC feature.

1.3 Memory and Compaction

/resume on stale large sessions (v2.1.108, 2026-04-14): CC now offers to Haiku-summarize a stale session before re-reading it on resume. This is a direct implementation of what the baseline called "session transcript rehydration with summary injection" — it is now live in CC. Silent Infinity's summarize_session() function (in feedback_monitor.py) already implements the summarization side; the gap is the reconnect/resume surface that triggers it.
/resume performance improved by 67% on 40MB+ sessions (v2.1.116, 2026-04-20).
ENABLE_PROMPT_CACHING_1H env var (v2.1.108, 2026-04-14): Enables 1-hour cache TTL on prompt prefix. This is the cache boundary marker (Layer 4 in the baseline's system prompt analysis) now user-configurable. Silent Infinity's static system prompt already benefits from Bedrock prompt caching; this is a TITAN-side optimization.
Plan files named after prompts (v2.1.111, 2026-04-16): e.g., fix-auth-race-snug-otter.md. This is a UX improvement to plan mode discoverability.
Session recap feature (v2.1.108): provides context on returning to a session; enabled for telemetry-disabled users.

1.4 Native Build Tool Replacement

In v2.1.117 (2026-04-22), native macOS/Linux builds now replace the Glob and Grep tools with embedded bfs and ugrep via the Bash tool for faster searches. This is an architectural shift: two dedicated tools are collapsed into the general-purpose Bash tool with faster native binaries. On Windows (Harnoor's install), this does not apply — the JavaScript-based Glob/Grep tools remain. This has no Silent Infinity relevance.

1.5 MCP Changes

MCP servers deduplicated when configured both locally and via claude.ai connectors (v2.1.83).
Subagents now inherit dynamically-injected MCP tools (v2.1.98, bug fix — prior to this subagents on first turn missed MCP tools).
MCP _meta["anthropic/maxResultSizeChars"] annotation allows overriding the 25K default result size limit up to 500K per tool result (noted in changelog context). Enables large database schema dumps without chunking.
MCP HTTP/SSE connection memory leak fixed: was accumulating 50MB/hr of unreleased buffers (v2.1.97). Not relevant to Silent Infinity but relevant to TITAN's MCP server health.
sandbox.network.deniedDomains setting added (v2.1.113): domain-level network blocking for Bash sandbox.
Concurrent MCP server connections at startup now the default (v2.1.117): faster startup.

1.6 Agent Orchestration

/ultrareview command (v2.1.111): parallel multi-agent code review in the cloud. Each reviewer runs as an isolated sub-agent returning only a summary to parent.
Forked subagents can now be enabled on external builds via CLAUDE_CODE_FORK_SUBAGENT=1 (v2.1.117). Previously, forked subagents were restricted to internal builds.
Agent frontmatter hooks: now fire for main-thread sessions via --agent (v2.1.116).
Subagents running different models no longer incorrectly trigger malware warnings on file reads (v2.1.113, bug fix).
/agents shows ● N running indicator (v2.1.97): live visibility into running subagent count.

1.7 ~/.claude/ Scan — New Artifacts Since Baseline

Local scan of ~/.claude/ shows:

~/.claude/skills/ — 13 skill files across 13 directories (briefing, dream, evolve, feed, learn, monologue, newsletter, pulse, reflect, sense, teach, titan + token-tracker within sense). These are TITAN skills, unchanged from prior sessions.
~/.claude/hooks/ — empty. TITAN has no hooks configured. As noted in 1.1, this is the largest infrastructure gap.
~/.claude/plugins/ — install-counts-cache.json present, confirming plugin system active.
~/.claude/telemetry/ — telemetry failure logs present (two failed-events JSON files). These indicate some telemetry events failed to transmit; not architecturally significant.
~/.claude/statsig/ — feature flag evaluations cache, last-modified-time file, stable ID, session ID. Statsig is the GrowthBook replacement confirmed in the leaked source. The presence of cached evaluations means CC is actively receiving feature flag state from Anthropic.
~/.claude/todos/ — ~60+ agent TODO JSON files. High session churn; TITAN is active.
No new MCP server config files found in ~/.claude/. The existing MCP servers (Gmail at mcp__24297863-1a96-4452-abcc-47632f4984d8__*) are system-injected via the SDK, not file-configured.

---

Section 2 — Silent Infinity Production Regression Audit

2.1 Pattern Checklist Status (14 Patterns from Baseline)

Reading handler.py, memory.py, feedback_monitor.py, system_prompt.py, conversation_store.py, and guardrails.py directly (2026-04-22).

| # | Pattern | Baseline | Current | Delta |
|---|---|---|---|---|
| 2 | System prompt composition | Static single file | Static single file loaded from prompts/system_v1.md. Memory block prepended in handler.py (L5360-5361). Personality layer injected from user_profile. | PARTIAL POSITIVE: memory injection exists; no layered conditional slots |
| 4 | Sub-agent orchestration | Primitive (Chat Sentinel) | Chat Sentinel + fact extractor + correction extractor + session summarizer (all in feedback_monitor.py). Four sentinel subprocesses, all async/fail-soft. | POSITIVE: richer sub-agent pattern |
| 6 | Plan-mode separation | GAP | GAP | UNCHANGED |
| 8 | Skill auto-invocation | GAP | GAP | UNCHANGED |
| 9 | Session transcript rehydration | 40-turn localStorage only | conversation_store.py persists to DDB. Handler rehydrates from DDB (L5244-5245: "hydrate history from DynamoDB"). /me/opener endpoint generates personalized greeting from memory. | POSITIVE: DDB persistence + opener, but no /resume-style stale summary injection |
| 11 | Memory compaction | GAP (TTL-only) | Memory TTL still the only compaction: hot=48h, warm=30d, staging=7d. No graduated pipeline; conversation_store.py has no compaction logic. Context is managed only by the 40-turn window already loaded from DDB. | UNCHANGED: this is now a confirmed gap with measurable risk |
| 12 | Permission / guardrail model | Crisis guard (regex) | guardrails.py has multi-source pattern loading (external JSON + built-in fallback), severity levels 1-4. Layer 1 (Haiku behavioral classification) and Layer 2 (topic hard-denies) from baseline roadmap are NOT yet present. | PARTIAL: richer than baseline but missing Haiku classifier layer |
| 13 | Pre-session briefing | /me/opener (client-side fetch) | /me/opener endpoint live. Memory block injected into system prompt server-side. First-session users get default; returning users get <memory> block. | POSITIVE: most of Pattern 1 from baseline is now shipped |
| 14 | Parallel tool calls | GAP | GAP | UNCHANGED |
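The tier TTLs quoted in the memory compaction row (hot=48h, warm=30d, staging=7d) amount to a per-item expiry timestamp. The sketch below shows one way to compute it as epoch seconds, which is the form DynamoDB's TTL feature expects. The function name and tier map are hypothetical and are not taken from memory.py; treating cold as never expiring is an assumption based on corrections being described as permanent.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# TTLs quoted in the checklist; "cold" has no expiry (assumption: permanent).
TIER_TTL = {
    "hot": timedelta(hours=48),
    "warm": timedelta(days=30),
    "staging": timedelta(days=7),
    "cold": None,
}


def expires_at(tier: str, written_at: datetime) -> Optional[int]:
    """Return the expiry as epoch seconds, or None for items that never expire."""
    ttl = TIER_TTL[tier]
    if ttl is None:
        return None
    return int((written_at + ttl).timestamp())
```

Expiry of this kind only deletes items; it never condenses them, which is why the row treats TTL-only handling as a compaction gap.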

2.2 Positive Progress Since Baseline

The following patterns were explicitly identified as gaps in the baseline and have since shipped or substantially improved:

1. Memory layering (Pattern 12) — hot/warm/cold/staging DDB tiers are live in memory.py. This is a significant architectural advance. The implementation follows the CC file-based philosophy but uses DDB instead of the filesystem: fully inspectable, versioned via DDB streams, human-readable content strings.

2. Correction-as-memory (Pattern 2) — extract_correction() uses Haiku to detect behavioral corrections from user turns and persists them cold-tier permanently via put_correction(). This is a more sophisticated implementation than what CC has natively (CC uses VAULT as a human-in-loop step; SI does it automatically every turn).

3. Session transcript rehydration (Pattern 7) — conversation_store.py DDB persistence and handler-level rehydration means sessions persist across devices. The /me/opener endpoint generates a personalized greeting from memory, which is more user-facing than CC's transcript resume mechanism.

4. Sub-agent richness (Pattern 8) — feedback_monitor.py now has four distinct sentinels (Chat Sentinel, fact extractor, correction extractor, session summarizer), all async fire-and-forget. This mirrors CC's sub-agent pattern almost exactly: isolated context, specialized task, summary-only return.
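The async, fail-soft sentinel pattern in item 4 can be sketched as follows. This is an illustration under assumptions, not the code in feedback_monitor.py: the memo does not say whether the sentinels use asyncio, threads, or separate processes, and the two sentinel functions here are hypothetical stand-ins for model calls.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Sentinel = Callable[[str], Awaitable[str]]


async def run_sentinels(turn: str, sentinels: Dict[str, Sentinel]) -> Dict[str, str]:
    """Run every sentinel concurrently; a failure in one never affects the others."""
    names = list(sentinels)
    results = await asyncio.gather(
        *(sentinels[name](turn) for name in names),
        return_exceptions=True,  # fail-soft: exceptions come back as values
    )
    # Keep only the summaries from sentinels that succeeded.
    return {n: r for n, r in zip(names, results) if not isinstance(r, BaseException)}


async def fact_extractor(turn: str) -> str:  # hypothetical stand-in
    return f"fact: {turn[:20]}"


async def broken_sentinel(turn: str) -> str:  # hypothetical stand-in
    raise RuntimeError("model call failed")
```

In a request handler, the call to run_sentinels could be scheduled with asyncio.create_task so the user-facing response does not wait on it, which is the fire-and-forget half of the pattern.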

2.3 Regressions — What Moved Away from CC Patterns

No outright regressions were found in this cycle. The codebase has moved forward on 4 of 14 patterns. However, two architectural debts have grown relative to CC's evolution:

Regression candidate 1 — Memory block prepended before system prompt (handler.py L5360-5361).


system_prompt = memory_block + "\n\n" + system_prompt

The CC architecture (confirmed in baseline) inserts CLAUDE.md as a USER message, not as part of the system prompt. Silent Infinity prepends the memory block to the system prompt itself. Because prompt caching matches on an exact prefix, putting per-user, frequently changing memory at the very start of the prompt means the static system prompt that follows cannot be served from cache: the cache boundary should sit between the stable system prompt and session-specific content, and here the session-specific content comes first. As memory grows richer, the uncached portion of every request grows with it, increasing latency and cost roughly linearly. The CC approach, memory as a late user message after the cache boundary, is the correct architecture. Prepending was acceptable at low memory richness but is becoming a structural cost risk.

Regression candidate 2 — No effort parameter on Bedrock invocations.

As of v2.1.94 (2026-04-07), CC raised the default effort level to high for API-key and cloud-provider users. Silent Infinity's bedrock_client.py does not pass an extended_thinking or effort parameter. Bedrock Sonnet 4.6 does not expose the same effort parameter as CC's API, so this is a platform difference rather than a code regression. However, it is worth flagging: the model quality bar that CC users experience is now above what Bedrock defaults provide for Sonnet. This is not a code fix but an architectural awareness item.

---

Section 3 — Top 3 Recommendations This Cycle

Each is scoped to under one day of implementation work on the Silent Infinity codebase.

---

Recommendation A — Move memory injection from system prompt prefix to late user message

Problem. handler.py L5360-5361 prepends memory_block to system_prompt. As memory accumulates (corrections, facts, session summaries), this block grows and increases Bedrock cost on every turn. CC's architecture puts CLAUDE.md as a user message after a cache boundary, so the memory content is NOT part of the cached prefix and is re-read fresh from disk/DB without invalidating the cached system prompt.

Fix. In handler.py, instead of system_prompt = memory_block + "\n\n" + system_prompt, inject the memory block as the first message in the messages array with role: "user" and mark it with a system-style tag:


# Before calling invoke_stream, prepend memory as a synthetic user message
if memory_block:
    memory_message = {"role": "user", "content": f"[Context for this session]\n{memory_block}"}
    messages = [memory_message] + messages

The model treats this as high-authority context (arriving before any real user turn) while the system prompt prefix remains stable and cacheable. Cost impact: at 2000-token memory blocks and 10 turns/session, this saves approximately 20K input tokens per session; at Sonnet's $3/MTok rate that is ~$0.06/session, or $600/month at 10K sessions/month.
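One detail the snippet above leaves open is what happens when the history already starts with a user turn: prepending a synthetic user message then yields two consecutive user messages, which some message APIs reject. The sketch below is one hedged way to handle that, by folding the memory into the first user turn. The helper name inject_memory is hypothetical, message content is assumed to be a plain string, and whether the Bedrock path in handler.py enforces strict role alternation is not confirmed by this memo.

```python
from typing import Dict, List

Message = Dict[str, str]


def inject_memory(messages: List[Message], memory_block: str) -> List[Message]:
    """Return a new message list with memory placed ahead of the conversation."""
    if not memory_block:
        return list(messages)
    context = f"[Context for this session]\n{memory_block}"
    if messages and messages[0]["role"] == "user":
        # Fold into the first user turn so two user messages never sit back to back.
        first = dict(messages[0])
        first["content"] = f"{context}\n\n{first['content']}"
        return [first] + list(messages[1:])
    return [{"role": "user", "content": context}] + list(messages)
```

The function copies rather than mutates its input, so the stored conversation history is never polluted with the injected context.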

Blast radius: handler.py only. One function. Tests exist. Estimated effort: 2 hours.

---

Recommendation B — Wire `PreCompact` and `SessionStart` hooks for TITAN

Problem. TITAN has zero hooks configured despite CC shipping several hook additions since the baseline (PreCompact, TaskCreated, CwdChanged, FileChanged, and the conditional if field). TITAN's CLAUDE.md drives a lot of behavior but has no automated session-state management. The PreCompact hook in particular is high-value: it would allow TITAN to inject a fresh memory snapshot from F:/TITAN/knowledge/memory/ at the point just before compaction, ensuring memory survives context pressure. Currently, if a TITAN session compacts, all warm-memory context not in CLAUDE.md is lost.

Fix. Place the hook scripts in the existing (currently empty) ~/.claude/hooks/ directory and register them in a settings.json hook configuration with two hooks:

1. PreCompact hook: shell script that reads F:/TITAN/knowledge/memory/hot/ and outputs a systemMessage via stdout JSON. This injects fresh hot memory into the post-compact context.

2. SessionStart hook: injects additionalContext with the current date, TITAN OS status, and any time-sensitive briefing.

The conditional if field (v2.1.85) means hooks can be scoped to only relevant contexts, avoiding subprocess overhead on every tool call.
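The PreCompact hook described in item 1 could look roughly like the sketch below, written in Python rather than shell for portability to the Windows install. It is a sketch under stated assumptions: that hot memory is stored as *.md files in the given directory, that a top-level systemMessage key is the right output field (as the text above proposes), and that a size cap is wanted. The function names and the max_chars default are hypothetical.

```python
import json
from pathlib import Path


def build_snapshot(hot_dir: Path, max_chars: int = 4000) -> dict:
    """Collect hot-memory notes into a single systemMessage payload."""
    parts = []
    if hot_dir.is_dir():
        for path in sorted(hot_dir.glob("*.md")):  # assumes notes are *.md files
            text = path.read_text(encoding="utf-8").strip()
            parts.append(f"## {path.stem}\n{text}")
    if not parts:
        return {}  # nothing to inject; emit an empty response
    body = "\n\n".join(parts)[:max_chars]  # cap the snapshot size
    return {"systemMessage": body}


def render(hot_dir: Path) -> str:
    """JSON text a hook script would print to stdout."""
    return json.dumps(build_snapshot(hot_dir))
```

A hook script would call print(render(Path("F:/TITAN/knowledge/memory/hot"))). Returning an empty object when the directory is missing or empty keeps the hook harmless on machines without TITAN memory.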

Blast radius: New files in ~/.claude/hooks/ plus the hook entries in settings.json. No Silent Infinity code touched. Estimated effort: 3 hours.

---

Recommendation C — Add graduated conversation compaction to `conversation_store.py`

Problem. conversation_store.py currently rehydrates the last 40 turns from DDB with no compaction. Long conversations (>40 turns) arrive truncated. There is no "layer 2 free compression" step: old turns are loaded wholesale or not at all. CC's five-layer compaction pipeline handles this gracefully; SI has none. This means that a user in their 50th turn of an emotional conversation loses context from the first 10 turns entirely, which is a felt discontinuity.

Fix. Add a two-layer compaction function to conversation_store.py:

Layer 1 (free): When loading history, if turn count > 40: load last 30 turns as-is, summarize the oldest 10+ turns into a single synthetic assistant turn using feedback_monitor.summarize_session() (already exists). The synthetic turn is prepended as a <prior_context> block.
Layer 2 (Haiku, cheap): If total tokens (estimated) > 60K: run summarize_session() on the oldest third, collapse it. Cost: ~$0.001 per activation. Trigger only when needed.

Both layers use existing functions. No new infrastructure required.
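The two layers can be sketched as a single pure function. This is an illustration, not the planned implementation: the summarizer is passed in as a callable because the real signature of feedback_monitor.summarize_session() is not shown in this memo, the turn shape (dicts with role and content strings) is assumed, and the token estimate uses a rough 4-characters-per-token heuristic.

```python
from typing import Callable, Dict, List

Turn = Dict[str, str]
Summarizer = Callable[[List[Turn]], str]


def estimate_tokens(turns: List[Turn]) -> int:
    """Rough estimate, assuming about 4 characters per token."""
    return sum(len(t["content"]) for t in turns) // 4


def _collapse(old: List[Turn], recent: List[Turn], summarize: Summarizer) -> List[Turn]:
    """Replace `old` with one synthetic assistant turn holding its summary."""
    summary = {
        "role": "assistant",
        "content": f"<prior_context>{summarize(old)}</prior_context>",
    }
    return [summary] + recent


def compact_history(
    turns: List[Turn],
    summarize: Summarizer,
    max_turns: int = 40,
    keep_recent: int = 30,
    token_budget: int = 60_000,
) -> List[Turn]:
    # Layer 1: past max_turns, keep the newest keep_recent and summarize the rest.
    if len(turns) > max_turns:
        turns = _collapse(turns[:-keep_recent], turns[-keep_recent:], summarize)
    # Layer 2: if still over the token budget, collapse the oldest third.
    if estimate_tokens(turns) > token_budget and len(turns) >= 3:
        cut = len(turns) // 3
        turns = _collapse(turns[:cut], turns[cut:], summarize)
    return turns
```

Injecting the summarizer keeps the function testable with a fake and avoids a model call in unit tests. If the downstream API expects a conversation to open with a user turn, the role of the synthetic turn may need adjusting.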

Blast radius: conversation_store.py + one test update. Estimated effort: 4 hours.

---

Section 4 — Anti-Patterns Observed in CC That Silent Infinity Should Not Copy

Anti-Pattern A — The `/ultrareview` Verbosity Pattern

/ultrareview spawns multiple sub-agents that each produce a full code review. Even with "only summaries return to parent" discipline, the aggregated review output is verbose by design — the goal is thoroughness, not brevity. For Silent Infinity, the mirror must resist the pull toward comprehensiveness. When the Personalization Sentinel or fact extractor returns multiple signals, the temptation is to inject all of them into context. This is wrong. Inject the top-1 or top-2; discard the rest. More context is not always better context for a contemplative product.
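The "top-1 or top-2, discard the rest" rule is simple to enforce in code. The sketch below assumes each sentinel signal carries a numeric confidence score, which this memo does not confirm; the Signal shape, the function name, and the floor parameter are all hypothetical.

```python
import heapq
from typing import List, Tuple

Signal = Tuple[float, str]  # (confidence, text); the scoring is hypothetical


def select_signals(signals: List[Signal], k: int = 2, floor: float = 0.0) -> List[str]:
    """Keep at most k signals, highest confidence first; drop the rest."""
    eligible = [s for s in signals if s[0] >= floor]
    return [text for _, text in heapq.nlargest(k, eligible, key=lambda s: s[0])]
```

Putting the cap in one function makes the restraint a property of the pipeline rather than something each sentinel has to remember.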

Anti-Pattern B — Aggressive Tool-Use as Default

CC defaults to using tools to verify everything: after writing a file, it reads it back. After running tests, it checks exit codes. For a developer tool this is correct. For Silent Infinity, there is no equivalent verification target — there is no "run the test suite on the conversation." The verification-before-claim discipline (Pattern 9 / 14) translates to grounding observations in the user's words, not running a verification tool call. Importing CC's tool-use cadence into SI would produce a product that pauses mid-conversation to "verify" its interpretation of the user's emotional state, which is the antithesis of contemplative presence.

Anti-Pattern C — Committed/Confident Tone at High Certainty

CC's system prompt encodes a direct, committed communication register: "Lead with the answer. One direct sentence beats three hedged ones." This is correct for a coding assistant where the answer is verifiable. For a wellness mirror, this register is actively harmful. A confident assertion about the user's emotional state ("you are grieving your father's absence") that turns out to be wrong damages trust in a way that a hesitant probe ("I wonder if there is grief in what you are describing?") does not. CC's tone pattern should not be copied into system_v1.md. The correct register for Silent Infinity is tentative, inviting, and phenomenologically honest — the opposite of CC's default.

---

Appendix — Sources

1. code.claude.com/docs/en/changelog — official CC changelog (fetched 2026-04-22)

2. github.blog/changelog/2026-04-16-claude-opus-4-7-is-generally-available — Opus 4.7 GA announcement

3. releasebot.io/updates/anthropic/claude-code — April 2026 release summary (fetched 2026-04-22)

4. news.ycombinator.com/item?id=47467922 — "Claude Code and the Great Productivity Panic of 2026" (HN discussion)

5. F:/TITAN/plans/advisors/CLAUDE-CODE-ARCHITECTURE-DEEP-DIVE-2026-04-22.md — baseline memo (SCOUT, 2026-04-22)

6. F:/projects/innerverse/backend/src/handler.py — Silent Infinity production handler (read 2026-04-22)

7. F:/projects/innerverse/backend/src/memory.py — tiered memory module (read 2026-04-22)

8. F:/projects/innerverse/backend/src/feedback_monitor.py — Chat Sentinel + extractors (read 2026-04-22)

9. F:/projects/innerverse/backend/src/conversation_store.py — DDB conversation persistence (read 2026-04-22)

10. F:/projects/innerverse/backend/src/guardrails.py — crisis guardrail layer (read 2026-04-22)

11. C:/Users/Harnoor/.claude/ — local CC installation scan (2026-04-22)

---

Memo path: F:/TITAN/plans/advisors/claude-code-audit-2026-04-22-1530.md

Next scheduled audit: routine runs every 6h per audit-cadence.log entry 2026-04-22T15:09:21