Version: v1 · 2026-04-20 · HERALD
Authority: binding on every user-facing ship
Review cadence: quarterly + whenever industry standards shift
> Per Harnoor R0068: "What's the test threshold for any feature? Are there any standards to save our ass in case this becomes huge?"
Silent Infinity ships fast. "Fast + honest + accountable" requires that every feature carry a maturity label that tells users (and, if it comes to it, a jury) exactly what state a feature is in and what has been tested.
This is not optional. Every feature that reaches a user is tagged with one of six tiers. Every tier has required evidence. Every user-visible tier other than GA has a specific transparency label that must appear in the UI where the feature lives.
| Tier | Label | What it means | Min evidence required |
|---|---|---|---|
| 0 · SKETCH | (not user-visible) | Internal prototype, design doc stage, not running in prod | Design doc + owner |
| 1 · ALPHA | "Alpha — internal only" | Running in prod behind feature flag for Harnoor + 3 max test users | 10+ test sessions + bug log + rollback plan + red-team pass |
| 2 · BETA | "Beta feature — still being tested" | Publicly visible to opt-in users; clear beta disclosure; no guarantees | 100+ test sessions + 2 red-team rounds by ECHO + weekly metrics review + user-facing "known gaps" doc |
| 3 · GA (general availability) | (no banner; default state) | Stable, safe default, full feature set | 1,000+ sessions + 30-day stable metrics + clinical review for safety-critical + legal review |
| 4 · CLINICALLY-VALIDATED | "Clinically reviewed by [partner]" | Third-party clinical advisor has reviewed + signed off | Tier-3 + named external reviewer + published review methodology + annual re-review |
| 5 · DEPRECATED | "Legacy — scheduled for removal" | Still works, no new development, replacement announced | Migration path doc + user notification + 90-day sunset |
The default tier for any new feature is ALPHA. No feature skips tiers. Promotion happens one tier at a time, and only when the evidence requirements for the next tier are fully met AND HERALD signs off.
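The promotion rule can be sketched as a small check. This is a minimal illustration, not the enforcement code: the `Evidence` fields and the `can_promote` name are hypothetical, the session minimums are read from the tier table, and every non-session requirement (red-team rounds, reviews, docs) is collapsed into a single assumed boolean.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    SKETCH = 0
    ALPHA = 1
    BETA = 2
    GA = 3
    CLINICALLY_VALIDATED = 4
    DEPRECATED = 5


# Session minimums from the tier table. Tier 4 inherits the Tier-3 minimum
# because its evidence column reads "Tier-3 + ...".
MIN_SESSIONS = {
    Tier.ALPHA: 10,
    Tier.BETA: 100,
    Tier.GA: 1000,
    Tier.CLINICALLY_VALIDATED: 1000,
}


@dataclass
class Evidence:
    sessions: int
    other_requirements_met: bool  # hypothetical stand-in for red-team, reviews, docs
    herald_signoff: bool


def can_promote(current: Tier, target: Tier, evidence: Evidence) -> bool:
    """Allow only a one-step promotion backed by full evidence and sign-off."""
    if target == Tier.DEPRECATED:
        return False  # assumption: deprecation is a sunset path, not a promotion
    if target != current + 1:
        return False  # no feature skips tiers
    if evidence.sessions < MIN_SESSIONS[target]:
        return False
    return evidence.other_requirements_met and evidence.herald_signoff
```

Under these assumptions, `can_promote(Tier.ALPHA, Tier.GA, ...)` is rejected regardless of evidence, which mirrors the "no feature skips tiers" rule.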
Any feature that could plausibly affect user safety (crisis detection, emergency dispatch, content moderation, age gating, data retention, adult-tier access) can only reach Tier 3 or higher after:
Features below Tier 3 cannot be marketed as "clinical", "therapeutic", "medical", or imply equivalence to human care. Ever.
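Part of this rule lends itself to an automated first pass. The sketch below is an assumption about how a copy check might look (the `forbidden_terms` helper is hypothetical): it flags the three named words in marketing copy for any feature below Tier 3. It cannot detect copy that merely implies equivalence to human care, so that part still needs a human reviewer.

```python
import re

# The three terms named in the rule; matched as word prefixes so that
# "clinically" is caught along with "clinical".
FORBIDDEN_BELOW_GA = ("clinical", "therapeutic", "medical")


def forbidden_terms(copy: str, tier: int) -> list[str]:
    """Return the forbidden terms found in copy for a feature below Tier 3."""
    if tier >= 3:
        return []
    lowered = copy.lower()
    return [term for term in FORBIDDEN_BELOW_GA if re.search(rf"\b{term}", lowered)]
```

A non-empty result would block the copy from shipping until it is rewritten.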
Every feature at Tier 1/2/4/5 carries a visible badge in the UI where it lives:
[ BETA · tested with ~200 sessions · known gaps: X, Y · learn more ]
/safety/features/<slug> listing:
- What this feature does
- Tier + date tier was awarded
- Evidence behind the tier (session count, red-team count, reviewer names if any)
- Known gaps / false-positive rate / failure modes
- When the next review happens
- Contact for issues
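One way to keep the badge and the detail page in sync is to render both from a single record. The following is a hedged sketch: `FeaturePage` and its field names are hypothetical, chosen to mirror the list above, and the badge string follows the example format shown earlier. All values in the usage note are placeholders.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class FeaturePage:
    slug: str                       # path segment under /safety/features/
    description: str                # what this feature does
    tier_label: str                 # e.g. "ALPHA", "BETA", "GA"
    tier_awarded: date              # date the tier was awarded
    sessions: int                   # evidence: session count
    red_team_rounds: int            # evidence: red-team count
    reviewers: list[str] = field(default_factory=list)
    known_gaps: list[str] = field(default_factory=list)
    next_review: Optional[date] = None
    contact: str = ""

    def badge(self) -> str:
        """Render the UI badge; GA carries no banner, so it returns ''."""
        if self.tier_label == "GA":
            return ""
        gaps = ", ".join(self.known_gaps) or "none listed"
        return (
            f"[ {self.tier_label} · tested with ~{self.sessions} sessions"
            f" · known gaps: {gaps} · learn more ]"
        )
```

For example, a placeholder record with `tier_label="BETA"`, `sessions=200`, and `known_gaps=["X", "Y"]` renders the same badge text as the example above.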
| Feature | Current tier | Needed to reach GA | ETA to GA |
|---|---|---|---|
| Contemplative chat (/invoke) | BETA | ECHO red-team pass + 1,000 sessions + 30-day stable metrics | 2 weeks |
| Memory persistence (DynamoDB) | BETA | same + external data-security review | 30 days |
| Crisis footer (911 / 988 / findtreatment / findahelpline) | GA (labels are static + verified) | N/A | n/a |
| Crisis-protocol system-prompt section | BETA | crisis-patterns-v1.json applied + clinical review (AFSP) | 60 days |
| Regex crisis patterns (guardrails.py) | ALPHA (failed case-study test 2026-04-20) | rewrite with plain-speech patterns + red-team pass | 1 week |
| OpenTimestamps crisis archive | ALPHA (shipped today; hashes flowing but no verification endpoint) | verification endpoint + external audit | 30 days |
| Post-crisis follow-up flag | ALPHA | live data showing gentle-start works as intended | 30 days |
| /safety page | GA | N/A | n/a |
| 4-variant daily rating widget | BETA | A/B data showing variant works | 30 days |
| Clickwrap consent modal | BETA | legal review of language | 14 days |
| /feedback endpoint | BETA | higher-volume data + schema validation | 30 days |
| PM dashboard (/pm) | ALPHA (internal only) | N/A for user-facing (internal tool) | n/a |
| User profile / language detection | BETA | i18n proper + language switching UX | 30 days |
| "Your Person" emergency contact (future) | SKETCH | entire Tier 1 → 2 → 3 path | TBD post-Harnoor-approval |
| RapidSOS 1-click 911 (future) | SKETCH | requires clinical + legal + red-team at every tier | TBD post-approval |
| Communities feature (Phase 2) | SKETCH | entire path | TBD |
| Ulysses Clause / Crisis Preferences | SKETCH | entire path | TBD |
| Standard | Applicable? | What it costs us if we align |
|---|---|---|
| FDA Software as a Medical Device (SaMD) | NO — we explicitly are not medical. But language must preclude this (ToS §3 + /safety page both cover) | $0 + careful ToS language |
| IEC 62304 (medical device software lifecycle) | NO for the same reason. But its lifecycle stages influenced our tier model | $0 |
| ISO/IEC 23894 (AI risk management) | YES — this is the right standard for a wellness AI chat app. Tier model aligns with its risk classifications | Free reference framework; alignment is documentation work |
| EU AI Act Article 50 (transparency for AI systems) | YES — we need explicit AI disclosure (done in chat UI + <your_nature> in system prompt) | Already compliant |
| WCAG 2.2 AA (accessibility) | YES — should adopt | ~2 weeks of UI audit + remediation |
| SOC 2 Type II (security + availability + confidentiality) | YES if we scale to enterprise customers | $15-40k/yr + 6-month prep; defer until needed |
| CA SB 243 (companion chatbot) | YES — mandatory + current | /safety page live; crisis protocol published (fulfilled) |
| NIST AI Risk Management Framework (RMF) | YES — reference framework, not certified | Free alignment through documentation |
| HIPAA | NO — we're explicitly not a covered entity. But handling conversations means HIPAA-inspired hygiene (encryption at rest, access controls) is good practice | Mostly already there |
/safety/features/X/review-methodology)

Before any commit that changes a user-facing feature merges to master:
1. HERALD (or CI hook — future) verifies the feature's current tier is not overstated in UI labels
2. Any new feature that ships to users without a tier badge gets a FEATURE-TIER-MISSING flag in the commit log; HERALD adds the badge in a follow-up commit before the deploy becomes user-visible
3. Quarterly (first of the month): HERALD reviews the feature inventory above. Tiers that haven't been re-evidenced in 90 days get demoted one tier.
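Steps 2 and 3 above could eventually move into the CI hook mentioned in step 1. The sketch below is one possible shape, under stated assumptions: `InventoryEntry`, `missing_badge`, and `quarterly_review` are hypothetical names, "not re-evidenced in 90 days" is read as strictly more than 90 days since the last evidence date, and demotion is applied only to Tiers 1 through 4 (SKETCH has nowhere to go and DEPRECATED is on a sunset path).

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

STALE_AFTER = timedelta(days=90)
BADGED_TIERS = (1, 2, 4, 5)  # tiers that must carry a visible badge


@dataclass
class InventoryEntry:
    name: str
    tier: int                 # 0..5, as in the tier table
    last_evidenced: date      # most recent date evidence was refreshed
    badge: Optional[str] = None


def missing_badge(entry: InventoryEntry) -> bool:
    """True when the entry should be flagged FEATURE-TIER-MISSING."""
    return entry.tier in BADGED_TIERS and not entry.badge


def quarterly_review(entries: list[InventoryEntry], today: date) -> list[str]:
    """Demote stale entries by one tier and return the names that changed."""
    demoted = []
    for entry in entries:
        stale = today - entry.last_evidenced > STALE_AFTER
        if stale and 1 <= entry.tier <= 4:
            entry.tier -= 1
            demoted.append(entry.name)
    return demoted
```

The review mutates entries in place and returns the demoted names so that each demotion can be written to the changelog.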
Every quarter, published at silentinfinity.com/safety/transparency:
If Silent Infinity becomes the subject of a lawsuit, regulatory inquiry, or media story:
1. The tier label IS the defense. Court: "you shipped X." Defense: "X was labeled BETA in the UI, the transparency page disclosed its state, user accepted by using it." That's the product-liability counterargument.
2. External reviewers are cited, not hidden. If AFSP reviewed the crisis-protocol section and signed off, that review is public. That's the "reasonable-care" argument that closes product-liability suits.
3. The changelog is the audit trail. Every tier transition, every pattern added to crisis-patterns-v1.json, every system-prompt edit → public git commit + changelog entry. If a plaintiff argues we "should have known", the record shows we knew, tested, and iterated.
4. Insurance + legal floor are the last line. Tech E&O with explicit AI endorsement (Embroker / Counterpart); ToS arbitration + limitation-of-liability; 501(c)(3) crisis-layer entity (Innovation 6). All Tier-2 innovations per SCOUT research.
1. Add ALPHA badge to regex-crisis-guardrails in UI until SCOUT's plain-speech patterns ship and ECHO red-teams them
2. Add BETA badge to chat UI (in the /safety page feature inventory, not on the chat bubble itself)
3. Post feature inventory to /safety/features URL (future sub-page — this week)
4. Clickwrap modal mentions "beta" in the disclosures ("This is a beta service. Known gaps documented at /safety/features.")
This standard takes effect 2026-04-21 00:00 UTC. HERALD enforces. Harnoor can override by explicit chat directive; otherwise the standard holds.
— HERALD