User feedback system — a PhD-level design for Silent Infinity

Version: v1 · 2026-04-21 · HERALD

Authority: design doc, build M1 on approval

Rough-Ask: R0111

> Harnoor: "when you ask for feedback, ask what features they would like / what they like / what they do not like. Feature request form persistent at bottom. Other ways beyond monitoring. A mode that's learning monitoring the chats — frustrations, happiness, excitement, boredom, wanting something, sharing, feature wishes. PhD-level study. Read business school best practices. Put it on the website. Also multi-chat with shared memory. Also SSO. Do all of it."

---

1. The problem with how most products do feedback

Most consumer product feedback in 2026 falls into one of three failing patterns:

1. The NPS trap (Reichheld 2003, HBR). "On a scale of 0-10, how likely are you to recommend us?" → the metric became the strategy, and a single lagging number replaced curiosity about why. Peer-reviewed critiques from 2007 onward (Grisaffe 2007, JMR; Keiningham et al. 2008) show NPS is often a worse predictor of growth than basic satisfaction.

2. Survey fatigue (Sinickas 2007; Gallup 2017). Response rates on email surveys have collapsed from ~40% in 2005 to ~5-15% in 2025. The remaining respondents skew older, more extreme, and more satisfied — classic selection bias.

3. Feature-request tyranny. "Tell us what you want!" → the loudest 2% of users dictate the roadmap. Steve Jobs: "people don't know what they want until you show it to them" — so we need something smarter than voting.

None of these work for a contemplative wellness product. Users in Silent Infinity are often mid-moment — mid-grief, mid-overwhelm, mid-joy. Asking them to rate us on a 0-10 scale is absurd in that context. We need a feedback architecture that matches the product's actual stance: slow, honest, attentive.

---

2. The research stack — what we draw from

2.1 Jobs-to-be-Done (Christensen, Ulwick)

People don't buy products, they "hire" them to do a job. The feedback question is not "what feature do you want?" — it is "what job are you hiring us for, and where are we failing that job?"

Applied to Silent Infinity: the jobs are: "help me feel less alone in this moment," "help me see a pattern I can't see," "give me permission to rest," "hold a hard feeling without trying to fix it." Feature requests are often symptoms of a job we're not yet doing well.

2.2 Kano model (Kano 1984, Journal of the Japanese Society for Quality Control)

Classifies user needs into five categories:

- Must-have (if absent, users churn; e.g., privacy, never losing their conversation)
- Performance (linear: the better we deliver, the higher the satisfaction; e.g., response latency)
- Delighter (absence isn't noticed, presence creates joy; e.g., the first time the mirror remembers something subtle)
- Indifferent (users don't care)
- Reverse (some users like it, others hate it; e.g., gamification elements)

Every feature request we get should be tagged into one of these. As a working assumption, only about 20% of user asks are true delighters; the rest are must-haves or performance asks we should already be delivering.
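
The tagging step can be checked against that working assumption with a small tally. This is a minimal sketch, not the production tagger: the `Kano` enum and `kano_shares` helper are hypothetical names, and the enum values simply mirror the `kano` strings used in the Sentinel schema in §4.1.

```python
from collections import Counter
from enum import Enum


class Kano(Enum):
    # Values mirror the "kano" strings in the Sentinel schema (section 4.1).
    MUST_HAVE = "must"
    PERFORMANCE = "performance"
    DELIGHTER = "delighter"
    INDIFFERENT = "indifferent"
    REVERSE = "reverse"


def kano_shares(tagged_requests: list[tuple[str, Kano]]) -> dict[Kano, float]:
    """Fraction of tagged feature requests that fall in each Kano category."""
    counts = Counter(category for _, category in tagged_requests)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: counts[category] / total for category in Kano}
```

Running this over a month of tagged requests would show whether the delighter share is actually near the assumed figure.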

2.3 Continuous Discovery (Teresa Torres 2021, Continuous Discovery Habits)

Weekly touchpoints with ≥3 customers, structured around an "opportunity solution tree" rooted in a specific outcome we're trying to improve. Not monthly surveys. Not quarterly NPS. Weekly, small, rigorous.

2.4 Thematic analysis of unstructured text (Braun & Clarke 2006)

A six-phase qualitative method: familiarization → coding → theme search → theme review → theme definition → report. The canonical method in psychology research for extracting signal from conversation transcripts. We can apply it to chat logs with LLM assistance.

2.5 Emotion lexicons (Ekman 1992; Plutchik 1980)

Ekman's six basic emotions (anger, disgust, fear, happiness, sadness, surprise) and Plutchik's wheel give us a tractable emotion-detection vocabulary. Better than sentiment (positive/negative) because emotions are multidimensional.

2.6 Frustration markers in conversational AI (Zhou et al. 2022, ACL)

Researched signals of user frustration in chatbots:

- Repeated rephrasing of the same question
- Use of "still," "again," "why"
- ALL-CAPS or multiple punctuation
- Explicit "this isn't working"
- Task abandonment (dropping off without confirmation)
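
Most of these markers can be approximated with plain string heuristics before any LLM is involved. The sketch below is illustrative only: the function name, the `marker_word` label, the 0.7 similarity threshold, and the complaint phrases are assumptions, not values from the cited work. Task abandonment is left out because it needs session timing rather than turn text.

```python
import re
from difflib import SequenceMatcher

# Hypothetical cue lists seeded from the markers above; tune against real data.
MARKER_WORDS = {"still", "again", "why"}
EXPLICIT_COMPLAINTS = ("this isn't working", "this is not working")


def frustration_signals(turn: str, previous_turns: list[str]) -> list[str]:
    """Return the names of the heuristic markers that fire for one user turn."""
    signals = []
    lowered = turn.lower()

    # Repeated rephrasing: the turn closely resembles an earlier user turn.
    # The 0.7 threshold is an assumed starting point.
    if any(SequenceMatcher(None, lowered, p.lower()).ratio() > 0.7 for p in previous_turns):
        signals.append("repeated_ask")

    # "still", "again", "why" appearing as whole words.
    if MARKER_WORDS & set(re.findall(r"[a-z']+", lowered)):
        signals.append("marker_word")

    # An ALL-CAPS word of 3+ letters, or a run of "!" / "?" characters.
    if re.search(r"\b[A-Z]{3,}\b", turn) or re.search(r"[!?]{2,}", turn):
        signals.append("caps")

    if any(phrase in lowered for phrase in EXPLICIT_COMPLAINTS):
        signals.append("explicit_complaint")

    return signals
```

A cheap pass like this could pre-filter turns or act as a sanity check on the Sentinel's own `frustration_signals` field.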

2.7 Speech-act theory (Austin 1962; Searle 1969)

Every utterance performs an action: asserting, requesting, promising, expressing, declaring. Detecting request speech acts in chat gives us feature-wishes: "I wish you could…," "it would be nice if…," "can you remember X?"
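
A first pass at spotting request speech acts could be simple pattern matching on the cue phrases quoted above. This is a rough sketch under that assumption; `WISH_PATTERNS` and `extract_feature_wishes` are hypothetical names, and real coverage would need a much wider phrase set or a classifier.

```python
import re

# Hypothetical cue phrases, taken from the examples in the paragraph above.
WISH_PATTERNS = [
    re.compile(r"\bi wish you could\b(.*)", re.IGNORECASE),
    re.compile(r"\bit would be nice if\b(.*)", re.IGNORECASE),
    re.compile(r"\bcan you remember\b(.*)", re.IGNORECASE),
]


def extract_feature_wishes(turn: str) -> list[str]:
    """Return the text that follows each request cue found in a user turn."""
    wishes = []
    # Split on sentence-ending punctuation so each wish stays within one sentence.
    for sentence in re.split(r"(?<=[.!?])\s+", turn):
        for pattern in WISH_PATTERNS:
            match = pattern.search(sentence)
            if match:
                wishes.append(match.group(1).strip(" .!?"))
                break  # at most one wish per sentence
    return wishes
```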

2.8 Behavioral signals (Hofmann et al. 2014 on implicit feedback)

Implicit > explicit for many measurements. Dwell time, session depth, return rate, reaction-click rate all leak preference without asking.
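
As an illustration of how such signals might be computed, the sketch below derives mean dwell time, mean session depth, and return rate from session records. The `Session` shape and the definition of return rate (share of users with more than one session) are assumptions for this example, not an existing schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Session:  # hypothetical record shape
    user_id: str
    started_at: datetime
    ended_at: datetime
    turns: int


def implicit_metrics(sessions: list[Session]) -> dict[str, float]:
    """Aggregate dwell time, session depth, and return rate without asking anyone."""
    if not sessions:
        return {"mean_dwell_s": 0.0, "mean_depth": 0.0, "return_rate": 0.0}

    dwell = [(s.ended_at - s.started_at).total_seconds() for s in sessions]

    sessions_per_user: dict[str, int] = {}
    for s in sessions:
        sessions_per_user[s.user_id] = sessions_per_user.get(s.user_id, 0) + 1
    returning = sum(1 for n in sessions_per_user.values() if n > 1)

    return {
        "mean_dwell_s": sum(dwell) / len(dwell),
        "mean_depth": sum(s.turns for s in sessions) / len(sessions),
        "return_rate": returning / len(sessions_per_user),
    }
```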

---

3. Channel architecture — seven ways we listen

A single feedback channel is brittle. We run seven in parallel, each with a different bias:

| # | Channel (cadence, visibility) | What it captures | Bias / caveat |
|---|---|---|---|
| 5 | Chat Sentinel (LLM monitor) (always, invisible) | Frustration / happiness / excitement / boredom / wants / feature wishes — extracted from chats | Privacy; requires clear consent + anonymization |

Each channel has a different sampling bias. Triangulation across all seven is how we get closer to the truth. A signal seen in 4+ channels is treated as real. A signal seen in only one channel is logged and held until another channel corroborates it (see §6).

---

4. Chat Sentinel — the learning monitor (AI-driven feedback extraction)

An LLM runs over every user turn (asynchronously, out of the critical path) and tags the conversation with structured observations. This is the "monitoring" channel you asked about.
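
"Out of the critical path" could look like the following with `asyncio`: the observer is scheduled as a background task and the reply is returned without waiting for it. This is a sketch only. `stream_reply` and `run_sentinel` are hypothetical stand-ins that make no real model calls, the in-memory `store` dict stands in for the observations table, and the four-turn context window (current turn plus the last three) follows the pipeline description in §4.2.

```python
import asyncio


async def stream_reply(user_turn: str) -> str:
    """Stand-in for the main model response (the critical path)."""
    await asyncio.sleep(0)
    return f"(reply to: {user_turn})"


async def run_sentinel(session_id: str, turn_index: int,
                       context: list[str], store: dict) -> None:
    """Stand-in for the Sentinel call; records one observation per turn."""
    await asyncio.sleep(0)
    store[(session_id, turn_index)] = {"turns_seen": len(context)}


async def handle_turn(session_id: str, turn_index: int, history: list[str],
                      store: dict, background: set) -> str:
    context = history[-4:]  # current turn plus the last 3 turns
    task = asyncio.create_task(run_sentinel(session_id, turn_index, context, store))
    background.add(task)                        # hold a reference so the task is not garbage-collected
    task.add_done_callback(background.discard)  # drop the reference once it finishes
    return await stream_reply(history[-1])      # the user never waits on the Sentinel
```

If the Sentinel call fails or times out, only the observation is lost; the reply the user sees is unaffected.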

4.1 The Sentinel's system prompt (draft)


You are CHAT SENTINEL, a silent observer of Silent Infinity conversations.
Your role: extract structured feedback signals from each turn without ever
interrupting the mirror's role.

You NEVER speak to the user. You emit only structured JSON observations to
an internal product-analytics pipeline.

For each user turn, output this schema:

{
  "emotion": {                    // Plutchik wheel + intensity 0-1
    "primary": "sadness" | "joy" | "anger" | "fear" | "surprise" | "disgust" | "trust" | "anticipation" | null,
    "secondary": string | null,
    "intensity": 0.0-1.0
  },
  "frustration_signals": [         // any of: "repeated_ask", "caps", "explicit_complaint", "contradiction_of_assistant", "task_abandonment", "why_question"
    string
  ],
  "engagement_signals": [          // "deepening", "surfacing", "playful", "bored", "checking_out", "opening_up", "closing_down"
    string
  ],
  "speech_acts": [                 // Austin/Searle classification for this turn
    {"act": "request" | "assert" | "express" | "commissive" | "declaration", "content": string}
  ],
  "feature_wishes": [              // explicit "I wish you could…" style requests
    {"wish": string, "confidence": 0.0-1.0}
  ],
  "sharing_quality": "surface" | "opening" | "deep" | "vulnerable" | "declining",
  "job_signal": string | null,     // if detectable, what job are they hiring the mirror for right now?
  "kano_tags": [                   // classify any feature asks into one of the five Kano categories
    {"feature": string, "kano": "must" | "performance" | "delighter" | "indifferent" | "reverse"}
  ],
  "crisis_adjacent": 0 | 1 | 2 | 3 | 4,  // 0 = not; 4 = imminent (re-uses crisis-patterns-v1.json severity levels)
  "notable_moment": {              // if this turn contains a quotable breakthrough/insight
    "present": boolean,
    "quote": string | null,
    "reason": string | null
  }
}

Constraints:
- Observations are aggregated across many users. NEVER include PII (names, emails, locations) in any field.
- If the turn contains identifiable personal content, mark "sharing_quality" but do not paraphrase specifics.
- Your output is read by product analytics + roadmap planning. It is NEVER shown to the user.
- If you are uncertain on any scalar field, output null; for list fields, output an empty list. Do not hallucinate structure.
- The observation MUST match the schema exactly. Extra fields will be rejected.
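
Outside the prompt itself, the "extra fields will be rejected" rule needs a check on the receiving side. The sketch below is a partial validator using only the standard library: it covers the top-level field set and a few enums and ranges from the schema above, and it is an assumption about how rejection might be implemented rather than the actual pipeline code. The function name is hypothetical.

```python
import json

PLUTCHIK = ("sadness", "joy", "anger", "fear", "surprise", "disgust", "trust", "anticipation")
SHARING = ("surface", "opening", "deep", "vulnerable", "declining")
TOP_LEVEL_KEYS = {
    "emotion", "frustration_signals", "engagement_signals", "speech_acts",
    "feature_wishes", "sharing_quality", "job_signal", "kano_tags",
    "crisis_adjacent", "notable_moment",
}


def validate_observation(raw: str) -> dict:
    """Parse one Sentinel output; raise ValueError if it breaks the checked rules."""
    obs = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(obs, dict) or set(obs) != TOP_LEVEL_KEYS:
        raise ValueError("top-level fields do not match the schema")

    emotion = obs["emotion"]
    if not isinstance(emotion, dict):
        raise ValueError("emotion must be an object")
    if emotion.get("primary") not in PLUTCHIK + (None,):
        raise ValueError("emotion.primary is not a Plutchik primary emotion")
    intensity = emotion.get("intensity")
    if intensity is not None and not (
        isinstance(intensity, (int, float)) and 0.0 <= intensity <= 1.0
    ):
        raise ValueError("emotion.intensity must be between 0 and 1")

    if obs["sharing_quality"] not in SHARING + (None,):
        raise ValueError("unknown sharing_quality")
    if obs["crisis_adjacent"] not in (0, 1, 2, 3, 4, None):
        raise ValueError("crisis_adjacent must be 0-4")
    return obs
```

Observations that fail validation would be dropped or retried rather than written to the table; which of the two is a design choice this document leaves open.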

4.2 Pipeline

1. User completes a turn.

2. The main Claude response streams to the user in the critical path.

3. Asynchronously, in parallel, the Sentinel (Haiku 4.5 — cheap, fast) runs on the turn + the last 3 turns of context.

4. JSON observation is written to DynamoDB innerverse-observations table keyed by (session-id, turn-index).

5. A nightly SAGE aggregation job rolls up observations into:

- Emotion mix (global + per-day)

- Frustration heatmap (which prompts trigger frustration)

- Feature-wishes ranked by frequency × confidence × Kano tier

- Engagement-decline alerts (users trending from opening_up → closing_down)

- Notable-moments reel (for founder review — NOT training data)
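
The "frequency × confidence × Kano tier" ranking in the roll-up could be computed as below. This is a sketch under stated assumptions: the numeric tier weights are invented for illustration, wish strings are assumed to be normalized already so identical wishes compare equal, and each row is assumed to arrive pre-joined as (wish, confidence, kano).

```python
from collections import defaultdict

# Hypothetical weights: the document names the factor but not its values.
KANO_WEIGHT = {
    "must": 3.0,
    "performance": 2.0,
    "delighter": 1.0,
    "indifferent": 0.0,
    "reverse": 0.0,
}


def rank_feature_wishes(rows: list[tuple[str, float, str]]) -> list[tuple[str, float]]:
    """Score each wish as frequency x mean confidence x Kano weight, highest first."""
    count: dict[str, int] = defaultdict(int)
    confidence_sum: dict[str, float] = defaultdict(float)
    tier: dict[str, str] = {}

    for wish, confidence, kano in rows:
        count[wish] += 1
        confidence_sum[wish] += confidence
        tier[wish] = kano  # if tags disagree, the last one seen wins

    scored = [
        (wish, count[wish] * (confidence_sum[wish] / count[wish]) * KANO_WEIGHT[tier[wish]])
        for wish in count
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With these weights a single low-confidence must-have can outrank a confident delighter, which matches the intent of §2.2 but should be reviewed once real data exists.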

4.3 Ethics + consent

- Disclosed clearly in Privacy Policy §X: "an AI observer reads conversations to extract anonymized product signals; no human sees raw conversations except under the crisis-escalation path documented on /safety."
- Opt-out toggle in user settings: "do not observe my conversations for product learning", honored within 24 hours.
- The Sentinel output is never fed back into generative training without separate consent.
- Monthly public transparency report (per Innovation 5): "here's what Chat Sentinel learned this month". Anonymized, aggregated, honest.

4.4 Why this beats traditional analytics

Amplitude / Mixpanel / Heap tell you what users did. The Sentinel tells you what they felt, what they wanted, and what they almost-said. That is a qualitatively richer signal than event counts alone.

---

5. The persistent feedback chip (shipping today)

A small floating chip in the bottom-right of every chat page: 💭 feedback

Opens a compact sheet with three inputs:

1. What do you love? (optional)

2. What's not working? (optional)

3. What do you wish we had? (optional)

Plus a sentiment pulse row: 🙂 · 😐 · 🙁 · 😤 · 🥲 (Plutchik 5-emoji condensation)

Plus a permission line: "OK to follow up if we have questions? [email]"

Design principles:

- Zero friction to submit: all three fields optional
- Zero gating: anyone can send, anytime
- The user's emotional state is one click (the emoji row)
- Follow-up permission is opt-in, never default
- Submissions go to DynamoDB innerverse-feedback + an SNS topic → email to harnoors@gmail.com
- No "thank you for your feedback!" modal. Just a quiet ✓ received.
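
The design principles translate into a submission record where every field is optional and follow-up contact is absent unless the user supplies it. The sketch below shows one possible shape; the field names, `anon_id`, and `to_item` are hypothetical, and the actual write to the innerverse-feedback table and the SNS publish are deliberately left out.

```python
from dataclasses import asdict, dataclass
from typing import Optional

PULSE = {"🙂", "😐", "🙁", "😤", "🥲"}  # the sentiment pulse row


@dataclass
class FeedbackSubmission:
    love: Optional[str] = None
    not_working: Optional[str] = None
    wish: Optional[str] = None
    pulse: Optional[str] = None
    follow_up_email: Optional[str] = None  # opt-in: stays None unless the user typed an address


def to_item(sub: FeedbackSubmission, anon_id: str, submitted_at: str) -> dict:
    """Build the record to store; empty fields are dropped, nothing is required."""
    if sub.pulse is not None and sub.pulse not in PULSE:
        raise ValueError("unknown pulse emoji")
    item = {"anon_id": anon_id, "submitted_at": submitted_at}
    for key, value in asdict(sub).items():
        if isinstance(value, str) and value.strip():
            item[key] = value.strip()
    return item
```

An entirely empty submission still produces a valid record, which keeps the "zero gating" principle intact.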

---

6. Triangulation rules — what we act on

We commit to this decision framework so no single signal dominates:

| Signal strength | What we do |
|---|---|
| 1 channel reports a pattern | Log it; look for corroboration |
| 2 channels agree | Add to the backlog watchlist |
| 3 channels agree | Begin Continuous Discovery interviews to confirm |
| 4+ channels agree | Spike into roadmap |
| Frustration + behavioral decline (2 channels) + explicit complaint (1 channel) | Immediate fix: treat as P0 |

This prevents the "loudest-user" problem (§1) while still catching real signal fast.
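
The table reads naturally as a small decision function. The sketch below encodes it directly; the function name and the three boolean flags for the P0 row are assumptions about how the inputs would be represented.

```python
def triage(channels_agreeing: int,
           frustration: bool = False,
           behavioral_decline: bool = False,
           explicit_complaint: bool = False) -> str:
    """Map corroboration across channels to the action in the triangulation table."""
    # The P0 combination overrides the plain channel count.
    if frustration and behavioral_decline and explicit_complaint:
        return "immediate fix: treat as P0"
    if channels_agreeing >= 4:
        return "spike into roadmap"
    if channels_agreeing == 3:
        return "begin Continuous Discovery interviews to confirm"
    if channels_agreeing == 2:
        return "add to the backlog watchlist"
    if channels_agreeing == 1:
        return "log it; look for corroboration"
    return "no action"
```

Checking the P0 row first means a complaint backed by frustration and behavioral decline is escalated even when only three channels are involved.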

---

7. Governance — who owns the feedback loop

|---|---|---|---|

---

8. Integration with multi-chat + SSO (per your other asks)

Both are prerequisites for the feedback system to work cross-device:

- SSO (Cognito + Google + Apple): without stable identity we can't attribute feedback to a user across devices, can't honor opt-out persistently, and can't run Continuous Discovery.
- Multi-chat with shared memory: per the Emergent Constellation plan, memory is about the user, not per-thread. This means feedback also pools across their threads. A user complaining in thread A gets their complaint linked to behavior in thread B.

Build order:

1. Feedback chip (today) — works without SSO; cookie-anon-attributed

2. Chat Sentinel (3-4 days) — runs on current single-thread model

3. SSO + Cognito (3-4 days) — Google + Apple + magic-link via Resend

4. Multi-chat (5-7 days) — after SSO, backend already keyed by uid

5. Public transparency report page (2 days)

Total: ~2-3 weeks to a world-class feedback system + multi-chat + SSO all shipped.

---

9. What this unlocks

When this is all live:

- We know every feature our users wish for, ranked by Kano tier, with intensity weighted by frequency.
- We know where frustration builds before users churn.
- We know where delight happens and can amplify it.
- We know the jobs users actually hire us to do, not just the features they ask for.
- We stay honest with them via a public transparency report.
- We act on signal, not noise, via triangulation rules.
- We never violate privacy because the whole system is consent-first, anonymized, and auditable.

This is what "product-market fit" looks like when done with rigor instead of vibes.

---

10. References

Reichheld, F. (2003). "The One Number You Need to Grow." Harvard Business Review.
Grisaffe, D. (2007). "Questions About the Ultimate Question." Journal of Marketing Research.
Christensen, C., Hall, T., Dillon, K., & Duncan, D. (2016). Competing Against Luck: The Story of Innovation and Customer Choice.
Ulwick, A. (2016). Jobs to be Done: Theory to Practice.
Kano, N. (1984). "Attractive quality and must-be quality." Journal of JSQC.
Torres, T. (2021). Continuous Discovery Habits.
Braun, V., & Clarke, V. (2006). "Using thematic analysis in psychology." Qualitative Research in Psychology.
Ekman, P. (1992). "An argument for basic emotions." Cognition and Emotion.
Plutchik, R. (1980). Emotion: A Psychoevolutionary Synthesis.
Kahneman, D. (1999). "Objective happiness." Well-Being: Foundations of Hedonic Psychology.
Zhou, K., et al. (2022). "User Frustration Detection in Dialog Systems." ACL.
Hofmann, K., et al. (2014). "Implicit feedback for interactive information retrieval." Foundations and Trends in Information Retrieval.
Austin, J. L. (1962). How to Do Things with Words.
Searle, J. (1969). Speech Acts: An Essay in the Philosophy of Language.

— HERALD

2026-04-21