Rule: The prime directive is to minimize token usage, not cloud spend.
Why: Harnoor burnt 53% of his Claude Code credits in 2 days (2026-05-03). Routine pattern-matching, file ops, log scans, syntax checks, deploys, and template rendering don't need LLM reasoning; they need shell commands and Python scripts. Burning Max-plan tokens on grep is waste. Cloud hosting, on the other hand, is cheap and well-managed; no need to penny-pinch S3 or Lambda.
How to apply — before every agent spawn or LLM call, ask in this order:
1. Can Grep / Glob / Read answer it? → use those tools directly, no agent.
2. Can a local Python / Node / PowerShell / bash script do it? → write the script, run it locally.
3. Is this recurring? → schedule it as a local cron / Windows Task / mcp__scheduled-tasks__create_scheduled_task running a local script. The cron itself shouldn't invoke Claude unless the task genuinely needs prose.
4. Can cached / prior work answer it? → check F:/TITAN/knowledge/, recent session outputs, advisor memos.
5. Does this genuinely need LLM reasoning across ambiguous content? → THEN spawn an agent. Prefer Haiku 4.5 → Sonnet → Opus in cost order. Use Bedrock prompt caching and batch APIs where applicable.
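The checklist order can be sketched as a small triage function. This is only an illustration of the ordering, not part of the rule: the `Task` fields and the returned route labels are hypothetical names, and step 3 (recurring) is folded into step 2 as a simplification.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a unit of work; field names are illustrative."""
    answerable_by_tools: bool = False      # step 1: Grep / Glob / Read is enough
    scriptable: bool = False               # step 2: a local script can do it
    recurring: bool = False                # step 3: it runs on a schedule
    answered_by_prior_work: bool = False   # step 4: cached / prior output covers it
    needs_llm_reasoning: bool = False      # step 5: ambiguous content


def cheapest_route(task: Task) -> str:
    """Return the first route, in checklist order, that fits the task."""
    if task.answerable_by_tools:
        return "Grep / Glob / Read directly"
    if task.scriptable:
        # Recurring scriptable work is scheduled instead of run by hand.
        return "scheduled local script" if task.recurring else "local script"
    if task.answered_by_prior_work:
        return "reuse cached / prior work"
    if task.needs_llm_reasoning:
        return "spawn agent (cheapest model that works)"
    # Nothing matched: do not default to an agent.
    return "re-check steps 1-4 before spending tokens"
```

For example, `cheapest_route(Task(scriptable=True, recurring=True))` returns `"scheduled local script"`, and an agent is only suggested once every cheaper route has been ruled out.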
Hosting / deployment is NOT covered by this rule — deploy whatever, wherever, on AWS or whatever cloud. The directive only applies to LLM TOKEN consumption during the work that produces the deploy.
Examples — token-cheap routes:
| Task | Bad (tokens) | Good (local CPU) |
|---|---|---|
| File inventory | scout/explore agent | Glob `**/*.html` directly |
| Single-file edit | forge agent | Edit tool directly |
| Recurring digest | LLM each run | Python + Jinja2 + cron, LLM only for narrative bits |
| Syntax check | agent reads | node --check, python -m py_compile |
| Data transform / format conversion | batch LLM | local pandas / Python / pandoc |
| "Find all X in directory" | scout agent | Grep pattern |
| Watchdog audit | Sonnet agent | Python script reading logs (saves $12-14/mo) |
| Newsletter HTML assembly | LLM per article | Jinja2 template + structured staging data; LLM only for hero prose |
| Deploy a Lambda | agent | Local PowerShell + AWS CLI |
| Linting / formatting / tests | agent reads | ruff, black, eslint, pytest |
| AWS hosting / S3 / CloudFront | n/a | go ahead — cloud spend is fine |
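As a concrete instance of the "Watchdog audit" row, a local log scan might look like the sketch below. It uses only the Python standard library; the log line format (a level keyword such as `ERROR` somewhere in each line), the `*.log` glob, and the function name `scan_logs` are assumptions for illustration, not the actual watchdog script.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed format: each line carries a level keyword; adjust to the real logs.
LEVEL_RE = re.compile(r"\b(ERROR|WARN(?:ING)?|CRITICAL)\b")


def scan_logs(log_dir: str, pattern: str = "*.log") -> Counter:
    """Count ERROR / WARNING / CRITICAL lines across matching log files."""
    counts: Counter = Counter()
    for path in sorted(Path(log_dir).glob(pattern)):
        with path.open(encoding="utf-8", errors="replace") as fh:
            for line in fh:
                match = LEVEL_RE.search(line)
                if match:
                    level = match.group(1)
                    # Treat WARN and WARNING as the same level.
                    if level.startswith("WARN"):
                        level = "WARNING"
                    counts[level] += 1
    return counts
```

Run from cron or a Windows Task, a script like this can write its counts to a file or alert on a threshold, and no tokens are spent unless a human later asks for a narrative summary.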
When agents ARE the right call (LLM is the right tool):
When agents ARE needed, prefer:
- Bedrock prompt caching (cache_point on system prompts)
- Batch APIs (titan-master-batch-nightly)

Avoid:
Cloud is not the enemy. Tokens are.