Date: 2026-04-21
Author: SCOUT (TITAN Research Agent)
Audience: Silent Infinity engineering + product
Companion doc: BUBBLES-AND-SAGE-RESEARCH-2026-04-21.md
---
1. Lock exact-echo in Turn 1.
The sage's opening sentence must contain the exact emotion word the user submitted. Add a prompt constraint: "Your first sentence must use the precise word: {emotion_chip_value}. No synonyms, no paraphrases." Test against all 10 emotion chips. Verify in staging that "overwhelmed" is not being replaced by "stressed" or "a lot on your plate."
2. Enforce attribution on every framework mention.
Audit the current sage system prompt for any framework descriptions that lack author/tradition attribution. Add instruction: "Name the framework, its creator or tradition, and its approximate era before explaining it. Never state a psychological or philosophical idea without its source." Run a batch eval: generate Turn 1 responses for all 10 emotions and confirm attribution appears in each.
3. Enforce minimum response structure: 3 paragraphs.
Add a hard constraint: "Your response must have at least three paragraphs: one reflecting the user's emotion, one teaching a named framework, one inviting forward with an open question." Log paragraph count per response. In staging evals, any response with fewer than three paragraphs should trigger a retry.
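The exact-echo and three-paragraph checks above can be automated in a staging eval. The following is a minimal sketch, not the team's actual harness: the function names are hypothetical, the response and chip value are assumed to arrive as plain strings, and paragraphs are assumed to be separated by blank lines. The first-sentence split is a simple heuristic and will misfire on abbreviations such as "e.g.".

```python
import re


def first_sentence(text: str) -> str:
    """Return the text up to and including the first '.', '!' or '?'."""
    match = re.search(r"[.!?]", text)
    return text[: match.end()] if match else text


def echoes_emotion(response: str, emotion: str) -> bool:
    """True if the first sentence contains the exact emotion word (case-insensitive)."""
    pattern = r"\b" + re.escape(emotion.lower()) + r"\b"
    return re.search(pattern, first_sentence(response).lower()) is not None


def paragraph_count(response: str) -> int:
    """Count non-empty blocks separated by one or more blank lines."""
    return len([p for p in re.split(r"\n\s*\n", response) if p.strip()])


def needs_retry(response: str, emotion: str) -> bool:
    """Flag a response that breaks the exact-echo or three-paragraph rule."""
    return not echoes_emotion(response, emotion) or paragraph_count(response) < 3
```

Running `needs_retry` over generated Turn 1 responses for each of the 10 chips gives a pass/fail list for the batch eval. Attribution needs a separate check, since it depends on recognising author or tradition names.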
---
4. Progressive chip disclosure: 6 chips for users with fewer than 3 sessions, all 10 for returning users.
Read session_count from user state. For session_count < 3, render the 6 highest-frequency emotion chips from the prior 30 days of aggregate logs. For session_count >= 3, render all 10. This reduces first-session decision friction without losing the precision value for returning users. Wire a feature flag so it can be toggled without a deploy.
5. Framework frequency diversification.
Instrument framework mention logging: every time a named framework appears in a sage response (e.g., "dichotomy of control," "cognitive restructuring," "polyvagal"), log it. After 2 weeks of data, identify which frameworks are over-represented per emotion. Add a diversity constraint to the prompt: "Do not use the same framework you used in the user's previous session for this emotion." Goal: expose users to multiple traditions over time.
6. Post-session micro-survey (1 question, 5-star).
At session end (when the user closes the session, or after 3+ turns), surface a single-question rating: "How was this conversation?" Log session_rating against emotion_chip, session_count, avg_response_length, and framework_mentioned. This gives the team the data needed to run impact analysis on the changes above. Keep the survey dismissible; do not block exit.
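The chip selection rule in item 4 can be sketched as a single function. This is an illustration under assumptions: `chips_to_render`, its parameters, and the constant names are hypothetical, the 30-day aggregate log is assumed to be available as a flat iterable of tapped chip values, and the feature flag is modelled as a plain boolean rather than any particular flag service.

```python
from collections import Counter
from typing import Iterable, Sequence

NEW_USER_SESSION_LIMIT = 3  # from the brief: session_count < 3 gets the reduced set
NEW_USER_CHIP_COUNT = 6     # from the brief: 6 highest-frequency chips


def chips_to_render(
    session_count: int,
    all_chips: Sequence[str],
    recent_chip_taps: Iterable[str],
    progressive_disclosure_enabled: bool = True,
) -> list[str]:
    """Return the emotion chips to show for this session.

    recent_chip_taps is assumed to hold one entry per chip tap across all
    users in the prior 30 days; how that log is queried is out of scope.
    """
    # Flag off, or returning user: show the full chip set.
    if not progressive_disclosure_enabled or session_count >= NEW_USER_SESSION_LIMIT:
        return list(all_chips)
    counts = Counter(tap for tap in recent_chip_taps if tap in all_chips)
    # sorted() is stable, so chips with equal counts keep their original order.
    ranked = sorted(all_chips, key=lambda chip: -counts[chip])
    return ranked[:NEW_USER_CHIP_COUNT]
```

Passing the flag as an argument keeps the function pure, so the toggle can be tested without a deploy and without touching the flag provider.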
---
These patterns correlate with low engagement and shallow sessions. Search Turn 1 and Turn 2 response text for these signals.
| Anti-Pattern | Detection Query | Why It's a Problem |
|---|---|---|
| Pure question openers | Turn 1 text contains "?" and no declarative sentence before it | Puts the labour on the user before they've been met. Feels clinical, not wise. |
| Generic affirmations | Text matches "that sounds really hard" / "it's okay to feel" / "you've got this" | Signals chatbot, not sage. Platitude without substance. Violates the "teach" mandate. |
| "What do you notice" | Exact phrase match | Pure MI clinical probe. Feels hollow outside a formal therapy frame. Over-used in emotional AI products as a lazy opener. |
| Responses under 200 words | len(response.split()) < 200 | Cannot deliver reflect + teach + invite at meaningful depth in under 200 words. |
| No attribution in framework | Framework-keyword present (e.g., "stoic", "attachment", "cognitive") but no author name or year pattern | Framework mentioned but un-sourced. Reduces authority and fails the "wise friend" test. |
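
The detection queries in the table can be approximated with simple string and regex checks. The sketch below is a heuristic under assumptions: the function name, the label strings, and the `known_authors` default are illustrative and hypothetical, and the keyword and phrase lists contain only the examples given in the table. A production version would need the full framework vocabulary and a better attribution signal than a name list plus a four-digit year.

```python
import re

GENERIC_AFFIRMATIONS = ("that sounds really hard", "it's okay to feel", "you've got this")
FRAMEWORK_KEYWORDS = ("stoic", "attachment", "cognitive")  # examples from the table
MIN_WORDS = 200
YEAR_PATTERN = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")  # e.g. 1969 or 2011


def detect_anti_patterns(text, known_authors=("Epictetus", "Bowlby", "Beck")):
    """Return the labels of the anti-patterns found in one response."""
    lowered = text.lower().replace("\u2019", "'")  # normalise curly apostrophes
    hits = []

    # Pure question opener: the first sentence terminator is a question mark.
    first_terminator = re.search(r"[.!?]", text)
    if first_terminator and first_terminator.group() == "?":
        hits.append("pure_question_opener")

    if any(phrase in lowered for phrase in GENERIC_AFFIRMATIONS):
        hits.append("generic_affirmation")

    if "what do you notice" in lowered:
        hits.append("what_do_you_notice")

    # Word count, not character count.
    if len(text.split()) < MIN_WORDS:
        hits.append("under_200_words")

    # Framework keyword present but no author name and no year.
    has_framework = any(keyword in lowered for keyword in FRAMEWORK_KEYWORDS)
    has_year = YEAR_PATTERN.search(text) is not None
    has_author = any(
        re.search(r"\b" + re.escape(name) + r"\b", text) for name in known_authors
    )
    if has_framework and not (has_year or has_author):
        hits.append("unattributed_framework")

    return hits
```

Returning labels rather than a single boolean makes it possible to count each anti-pattern separately across Turn 1 and Turn 2 text.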
---
| Metric | Definition | Target | Notes |
|---|---|---|---|
| Bubble tap rate | % of sessions where user taps at least one emotion chip before typing | > 60% of new sessions | Baseline this in the first week. If < 40%, chip copy or placement is the problem, not chip count. |
| Avg Turn 1 response length (words) | Mean word count of sage's first response, per emotion | > 120 words | Below 80 is a hard failure of the 3-paragraph mandate. |
| Session rating uplift | Mean post-session rating, segmented by whether a framework was explicitly named | Named-framework sessions > 4.0 / 5.0 | Primary signal that the "teach" mandate is adding value. |
| Session length (turns) | Mean number of user turns per session | > 3 turns | Single-turn sessions indicate the user was not invited forward effectively. |
| Framework mention rate | % of Turn 1 responses that contain a named psychological/philosophical framework | > 80% | If below 80%, the attribution prompt constraint is not holding. Re-evaluate prompt compliance. |
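
The five metrics can be computed from per-session log records with a few lines of aggregation. This is a sketch only: `SessionRecord` and its field names are a hypothetical record shape, not the actual logging schema, and the caller is assumed to have already filtered the list to the relevant population (for example, new sessions only for bubble tap rate).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionRecord:
    # Hypothetical fields; map these to whatever the session log actually stores.
    tapped_chip: bool              # user tapped a chip before typing
    turn1_word_count: int          # word count of the sage's first response
    user_turns: int                # number of user turns in the session
    framework_named: bool          # Turn 1 named a framework
    rating: Optional[int] = None   # 1-5 stars, None if the survey was dismissed


def summarize(sessions: list[SessionRecord]) -> dict[str, Optional[float]]:
    """Aggregate the brief's five metrics over a non-empty list of sessions."""
    if not sessions:
        raise ValueError("need at least one session")
    n = len(sessions)
    named_ratings = [
        s.rating for s in sessions if s.framework_named and s.rating is not None
    ]
    return {
        "bubble_tap_rate": sum(s.tapped_chip for s in sessions) / n,
        "avg_turn1_words": sum(s.turn1_word_count for s in sessions) / n,
        "avg_user_turns": sum(s.user_turns for s in sessions) / n,
        "framework_mention_rate": sum(s.framework_named for s in sessions) / n,
        # None when no named-framework session has been rated yet.
        "named_framework_rating": (
            sum(named_ratings) / len(named_ratings) if named_ratings else None
        ),
    }
```

Sessions with a dismissed survey are excluded from the rating mean rather than counted as zero, so a dismissible survey does not drag the rating down.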
---
Brief ends. See companion research doc for full citations and UX competitive analysis.
SCOUT / TITAN | 2026-04-21