Document Type: PhD-Level Technical and Policy Specification
Status: Advisor Review Draft
Date: 2026-04-21
Prepared by: TITAN Research (SCOUT)
Audience: Clinical advisors, privacy counsel, ML researchers, founding team
Version: 1.0
---
This document establishes the complete lifecycle policy for how Silent Infinity captures, stores, anonymizes, and ethically uses conversation data. It is written to advisor-grade specificity: a clinical psychologist, a GDPR-trained privacy attorney, or an ML researcher who has worked on responsible AI at a major lab should read this document and arrive at a confident understanding of exactly what we do, why we do it, and what protections exist for users.
This is not a marketing document. It contains real tradeoffs, real failure modes, and open questions awaiting founder decision. Where uncertainty exists, it is labeled as such.
---
Silent Infinity is an AI-powered emotional support and reflection companion. Unlike a calculator or a search engine, the quality of its output is not determined by an algorithm that can be unit-tested in isolation. It is determined by whether its responses feel accurate, safe, deepening, and human to the person experiencing distress or growth at a specific moment in their life. That quality signal exists only in the conversation itself.
Throwing away conversation data is, in practice, choosing to fly blind. Every message a user sends contains a density of signal that no synthetic proxy can replicate: the specific phrasing someone uses when they are ashamed, the way a productive session builds turn by turn versus the way an avoidant session closes down after four exchanges, the exact moment a user pivots from intellectual discussion to genuine emotional disclosure. These patterns are invisible until enough of them are observed in aggregate. Without them, product iteration is guesswork dressed up as intuition.
The decision to retain conversation data is therefore not made reluctantly, or as a business convenience. It is made because abandoning this data is itself an ethical failure — it means knowingly building a product that cannot improve, and serving future users with a tool that is worse than it could have been. The obligation to the next user is as real as the obligation to the current one.
That said, retention without governance is its own form of harm. The framework in this document is designed to honor both obligations simultaneously.
The naive response to privacy risk is deletion. If the data doesn't exist, it cannot be misused. This reasoning, while intuitive, fails for a wellness application on two counts.
First, deletion destroys utility irreversibly. A conversation that revealed that a particular framing pattern causes emotional closure cannot teach us anything once it is gone. Second, deletion does nothing for the user whose conversation already shaped a product decision — whether or not that conversation was retained long-term, the real privacy protection lies in how the data was processed and who could access it while it existed.
The more rigorous position, established in Cavoukian (2010) and refined in the differential privacy literature, is that the goal is not the absence of data but the elimination of re-identification risk. A perfectly anonymized conversation corpus is simultaneously maximally useful (it can be analyzed, it can inform training, it can be audited for quality) and maximally safe (no individual can be identified from it, no adversary gains actionable knowledge about any specific user). Anonymization, done correctly, preserves utility while destroying identity. That is a strictly better outcome than deletion.
These are fundamentally different data use cases and must be governed by fundamentally different policies.
Per-user memory serves that user and only that user. When Silent Infinity remembers that a particular user is processing grief after a divorce, that she finds Buddhist framing helpful, and that she typically engages more deeply late at night, those facts exist to make her next session more resonant. They are keyed to her identifier, encrypted with her key, and never shared with any other process. This is a personalization feature, not a training resource.
Cross-user training serves future users who have not yet signed up. When we aggregate thousands of anonymized conversations to understand what a "deepening" exchange looks like at a statistical level, we are building a model capability that will benefit people we have never met. The beneficiary is different. The data flow is different. The consent requirements are therefore categorically different.
Conflating these two use cases — as many products implicitly do by writing a single consent clause that covers both — is the primary ethical error in wellness AI data governance. We do not make this error. Sections 3 through 5 define the exact separation.
The following four principles, drawn from Cavoukian (2010) Privacy by Design and the de-identification limits established by Narayanan and Shmatikov (2008), govern every decision in this document.
Principle 1: Privacy as Default. In any ambiguous case, the more private option is taken automatically. Users do not opt out of training data use; they opt in. The default tier is the most restrictive.
Principle 2: Minimum Necessary Collection. We collect what is needed for the stated purpose and nothing more. Raw conversation content is captured because no lesser representation preserves the signal we need for product quality. But we do not capture device fingerprints, behavioral metadata beyond session timing, or ambient sensor data.
Principle 3: De-identification is a Process, Not a State. Narayanan and Shmatikov demonstrated in 2008 that a dataset believed to be anonymized (the Netflix Prize dataset) could be re-identified with very little auxiliary knowledge: roughly eight movie ratings with approximate dates were enough to single out most subscribers' records. De-identification is therefore not a one-time operation; it is an ongoing defense that must be tested against adversarial re-identification attempts on a regular schedule.
Principle 4: Consent is Specific, Informed, and Revocable. The consent for one purpose (product improvement analytics) does not extend to a different purpose (model training for a new product version). Every tier of data use has its own consent moment, its own plain-language explanation, and its own revocation pathway.
We make three non-negotiable commitments that this document encodes as policy:
1. We will never sell user data — raw, anonymized, synthetic, or derivative — to any third party for any commercial purpose.
2. We will never attempt to identify an individual from any anonymized dataset, and we will not assist any third party in doing so.
3. We will never use conversation data to train or fine-tune any model without a second, explicit, standalone opt-in consent that is separate from the initial Terms of Service clickwrap, clearly described in plain language, and revocable at any time with immediate effect.
---
Differential privacy, introduced by Cynthia Dwork and colleagues in 2006, provides a mathematically rigorous definition of privacy for statistical computations. A randomized mechanism M satisfies ε-differential privacy if, for any two datasets D and D' that differ in a single record and for any set of outputs S:
Pr[M(D) ∈ S] ≤ e^ε × Pr[M(D') ∈ S]
The parameter ε is the privacy budget: smaller values mean stronger privacy guarantees. When ε = 0, the mechanism's output is independent of the data and reveals nothing about any individual. As ε increases, more information can leak. In practice, values between roughly 0.1 and 3.0 are typical for sensitive data, although some consumer deployments have used larger values.
For conversation data, differential privacy is applied at the aggregate statistics layer. When we compute, for example, the distribution of session lengths across consented users to train a quality model, the Laplace mechanism adds calibrated noise to each count before it is used; the noise scale is the sensitivity of the query divided by ε. This bounds how much any single user's sessions can change the released distribution, so an observer cannot confidently infer whether a given user's data was included.
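As an illustration of the count-noising step, the following is a minimal sketch rather than our production implementation. It assumes each user contributes to exactly one histogram bucket, so the sensitivity is 1; the function names and bucket labels are hypothetical.

```python
import random
from typing import Dict, Optional


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_counts(
    counts: Dict[str, int],
    epsilon: float,
    sensitivity: float = 1.0,  # assumption: one bucket per user
    rng: Optional[random.Random] = None,
) -> Dict[str, float]:
    """Add Laplace(0, sensitivity / epsilon) noise to every bucket count."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.SystemRandom()
    scale = sensitivity / epsilon
    return {bucket: n + laplace_noise(scale, rng) for bucket, n in counts.items()}


# Hypothetical session-length histogram released at epsilon = 0.1.
released = noisy_counts({"under_10_min": 420, "10_to_30_min": 310}, epsilon=0.1)
```

At ε = 0.1 the noise scale is 10, so counts in the hundreds remain usable while a single user's presence is masked. If one user could contribute several sessions to the histogram, the sensitivity would have to be raised accordingly.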
Apple's 2017 deployment of differential privacy for emoji and keyboard usage data at iOS scale demonstrated that ε-DP is operationally feasible at consumer product scale, not merely a theoretical construct. Apple used local differential privacy, where noise is added on the device before any data is transmitted. For Silent Infinity's server-side architecture, we use central differential privacy (noise added after aggregation on our servers). Central DP provides better utility at an equivalent privacy guarantee because noise is added once to the aggregate rather than to every record; the tradeoff is that users must trust our servers with the un-noised data, which is why the storage and access controls elsewhere in this document matter.
The specific ε budget allocation for Silent Infinity is detailed in Section 6.
Latanya Sweeney (2002) introduced k-anonymity as a record-level anonymization standard: a dataset satisfies k-anonymity if every record is indistinguishable from at least k-1 other records with respect to the quasi-identifier attributes. Quasi-identifiers are fields that are not directly identifying (unlike name or SSN) but can be combined with external information to re-identify an individual — age, zip code, gender, and occupation are the canonical examples.
For Silent Infinity, the quasi-identifiers present in conversation data include: age range, location granularity (city or state), occupation or industry, relationship status, and the specific mental health themes discussed. Our PII stripping pipeline (Section 5) generalizes these to coarser buckets: exact age becomes a decade range (30s, 40s), exact city becomes region, specific occupation becomes a broad category.
Machanavajjhala et al. (2007) identified that k-anonymity alone is insufficient when the sensitive attribute within a k-group is not diverse. If all 5 records in a 5-anonymous group share the same sensitive attribute (say, all five users disclosed suicidal ideation), an adversary who knows only that a target belongs to that group learns the target's sensitive attribute with certainty (a homogeneity attack). l-Diversity requires that within each k-anonymous group, there are at least l distinct values of the sensitive attribute.
t-Closeness (Li et al. 2007) goes further, requiring that the distribution of sensitive attributes within any group is statistically close to the distribution in the overall dataset, preventing inference from distributional skew.
Our anonymized training corpus targets k≥10, l≥3 for all sensitive wellness theme attributes.
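The k≥10, l≥3 target can be checked mechanically before a corpus release. The sketch below is illustrative only: it assumes records are flat dictionaries whose quasi-identifiers have already been generalized, and the field names (`age`, `region`, `theme`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Record = Dict[str, Hashable]


def violating_groups(
    records: Sequence[Record],
    quasi_ids: Sequence[str],
    sensitive: str,
    k: int = 10,
    l: int = 3,
) -> List[Tuple[Hashable, ...]]:
    """Return quasi-identifier groups that fail k-anonymity or l-diversity."""
    groups: Dict[Tuple[Hashable, ...], List[Hashable]] = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_ids)
        groups[key].append(record[sensitive])
    return [
        key
        for key, values in groups.items()
        # Too few records in the group, or too few distinct sensitive values.
        if len(values) < k or len(set(values)) < l
    ]
```

Groups returned by this check would need further generalization or suppression before release. The check covers distinct-value l-diversity only; it does not test t-closeness.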
GDPR Article 4(5) defines pseudonymization as the processing of personal data in such a way that it can no longer be attributed to a specific person without the use of additional information, provided that this additional information is kept separately and is subject to technical and organizational measures.
For Silent Infinity, pseudonymization means replacing the user's uid with a randomly generated pseudonym at the point of archiving, and holding the mapping table (real uid → pseudonym) in a separate, access-restricted key escrow. The escrow is accessible only for two purposes: (1) honoring a data deletion request, which requires re-identifying which records to remove, and (2) a court-ordered legal process with verified jurisdiction. The escrow is never used for analytics, training, or product development.
Pseudonymization is not anonymization under GDPR; pseudonymized data is still personal data. However, it provides a meaningful privacy protection tier between fully identified data and fully anonymized data, and it enables revocation of consent (the mapping allows us to find and delete a specific user's contributions from the Layer 4 corpus if they withdraw consent).
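A minimal sketch of the archive-time pseudonymization follows. It assumes an in-memory mapping purely for illustration; in practice the mapping would live in the separate, access-restricted escrow described above. The class and function names are hypothetical.

```python
import secrets
from typing import Dict, Optional


class PseudonymEscrow:
    """Holds the uid -> pseudonym mapping, kept apart from archived data."""

    def __init__(self) -> None:
        self._by_uid: Dict[str, str] = {}

    def pseudonym_for(self, uid: str) -> str:
        # A random pseudonym cannot be derived from the uid, unlike a hash.
        if uid not in self._by_uid:
            self._by_uid[uid] = "p_" + secrets.token_hex(16)
        return self._by_uid[uid]

    def lookup_for_deletion(self, uid: str) -> Optional[str]:
        """Used only to locate records for a deletion or revocation request."""
        return self._by_uid.get(uid)


def archive_record(record: Dict[str, str], escrow: PseudonymEscrow) -> Dict[str, str]:
    """Copy a record for archiving with the uid replaced by its pseudonym."""
    archived = {key: value for key, value in record.items() if key != "uid"}
    archived["pseudonym"] = escrow.pseudonym_for(record["uid"])
    return archived
```

Using a random token rather than a keyed hash of the uid means the archived data carries no recomputable link to the user; the only path back is the escrow itself.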
Synthetic data generation involves training a generative model on real data and then sampling from the generative model to produce a dataset that is statistically similar to the original but contains no real records. The generative model learns the statistical properties of the original data — distributions, correlations, conditional relationships — without memorizing individual records, provided appropriate training constraints are applied.
Jordon et al. (2022), published in Nature Medicine, demonstrated that synthetic electronic health records produced by generators trained on real patient data could support downstream clinical ML tasks at near-equivalent accuracy to training on real records, while providing measurable re-identification protection. We treat this as the strongest peer-reviewed evidence available to us that synthetic data is a viable path for sensitive health-adjacent data.
For Silent Infinity, synthetic data generation is Phase 3 of our training roadmap (Q2 2027). We train a generator on the consented, DP-protected Layer 4 corpus and produce the Layer 5 synthetic corpus, which is the actual fine-tuning data. No raw conversations are ever directly present in a training batch.
The generator architecture has not been finalized; candidates include variational autoencoders, diffusion-based text generators, and GPT-class conditional generators with privacy amplification via subsampling.
McMahan et al. (2017) introduced Federated Learning as a framework for training models on user data without ever moving that data to a central server. Each user's device runs a local gradient update, and only the gradient (not the data) is transmitted to the central server for aggregation.
Federated learning is architecturally sound for privacy but practically infeasible for Silent Infinity in the near term. LLM fine-tuning requires GPU-class hardware that most users' mobile devices cannot run. The communication overhead for gradient synchronization at LLM scale is substantial. And the privacy guarantees of federated learning are weaker than often claimed — gradient inversion attacks (Geiping et al. 2020) can reconstruct training samples from gradients in some settings.
Federated learning remains listed as a Phase 5+ consideration for lightweight specialized models (crisis-detection classifiers, engagement predictors) that run on-device. It is not on the critical path for the LLM voice fine-tuning described in Section 8.
Carlini et al. (2021) demonstrated that large language models trained on raw text data can, under adversarial prompting, reproduce verbatim passages from their training data — including personally identifiable information from crawled web content. Their methodology used membership inference attacks and prefix completion to extract memorized training examples.
The implication for Silent Infinity is direct: if raw conversation content were included in a training corpus for an LLM, that model could potentially reproduce fragments of those conversations in response to crafted prompts from a different user. This is a realistic threat, not a theoretical one.
Mitigation is structural: we never include raw conversations in any training data. The path from raw conversation to training data passes through PII stripping (Section 5), differential privacy (Section 6), synthetic generation (Layer 5), and whole-user holdout. The holdout policy means that if a user's data contributes to the training corpus, their uid is excluded from the fine-tuning run entirely — the model is never trained on data from a user who will also query it.
Anthropic's Constitutional AI (CAI) framework provides the methodology we would use for alignment-driven fine-tuning of Silent Infinity's responses. Rather than relying solely on human preference labels (which introduce labeler subjectivity and scaling costs), CAI uses a set of explicit principles — a constitution — to guide a reinforcement learning process. The model critiques its own outputs against the constitution and revises them, generating preference data for RLHF from the revision process itself.
For Silent Infinity, the constitution would encode the clinical and ethical principles that define good companionship responses: non-directive, reflective, emotionally validating, safety-aware, boundary-maintaining. The Chat Sentinel observation layer (Section 3, Layer 2) provides the human-labeled signal about what "deepening" looks like that can bootstrap the reward model.
Phase 2 of our training roadmap (Section 8) uses RLHF on Sentinel observations without any raw conversation content. The reward model learns from the structural and quality labels Sentinel produces, not from the content of what users said.
---
The five data layers broadly progress toward greater anonymization and lower sensitivity. Layer 3 (per-user memory) is the exception: it is identified by design and never feeds the training path. Each layer has its own storage tier, access policy, and consent requirement. No conversation text moves from Layer 1 toward the training layers (4 and 5) without passing through the full processing pipeline described in Sections 5 and 6.
| Layer | Name | Contents | Sensitivity | Retention | Consent Required |
|-------|------|----------|-------------|-----------|-----------------|
| 1 | Raw Conversations | User message + AI reply + metadata (timestamp, session ID, user uid) | Highest | 90 days (Bronze) | Initial ToS clickwrap |
| 2 | Sentinel Observations | Anonymized quality/engagement/emotion tags per turn. No raw text. | Medium | 2 years (Silver) | Initial ToS clickwrap |
| 3 | Per-User Memory | User-profile facts, theme tags, constellation events, session continuity state | Medium-High | Indefinite while active | Initial ToS clickwrap |
| 4 | Anonymized Training Corpus | PII-stripped, DP-noised conversations from opted-in users | Low | Indefinite (Training S3) | Separate explicit opt-in |
| 5 | Synthetic Data | Statistically-representative generated conversations derived from Layer 4 | Minimal | Indefinite (Training S3) | Inherited from Layer 4 consent |
Layer 1: Raw Conversations. This is the ground truth record of what was said. It is retained for 90 days primarily to support incident response (if a user reports a harmful interaction, the clinical advisor needs to read the verbatim exchange), quality review (the 1% manual audit sample in Section 5), and the ability to produce a complete data export for a user who requests their data within the retention window. After 90 days, Layer 1 records are either promoted (if the user has given Layer 4 consent) through the full anonymization pipeline, or permanently deleted. No exceptions.
Layer 2: Sentinel Observations. Chat Sentinel is the quality monitoring subsystem that tags each conversational turn with a structured observation: engagement level (opening, steady, deepening, closing), emotional register (calm, distressed, reflective, avoidant), and feature-wish flags (user expressed a wish for a capability we don't have). Sentinel observations contain no raw text. They are classification outputs, and the text they were computed from is never copied into this layer. They are the most durable data asset for product analytics because they can be analyzed across the full user base with far lower re-identification risk than conversation text, although rare tag combinations can still act as quasi-identifiers.
Layer 3: Per-User Memory. This layer stores the facts that enable session continuity for a single user: what they are working on, what framings resonate with them, what topics to approach carefully, how their emotional arc has evolved. This data is keyed to the user's uid, encrypted with a per-user key derived from their authentication credential, and is never accessible to any process that serves a different user. It is not training data and never becomes training data. It is a personalization asset.
Layer 4: Anonymized Training Corpus. Populated only from users who have given explicit Layer 4 consent (see Section 4). These are conversations that have passed through the full 6-step PII stripping pipeline (Section 5) and been protected with differential privacy (Section 6). The corpus is stored in a WORM-locked S3 bucket. Records in this layer can be deleted on user request (via the pseudonym-to-uid mapping in escrow) but cannot be modified.
Layer 5: Synthetic Data. Generated from Layer 4 by the synthetic data pipeline. These are not real conversations; they are statistically similar conversations produced by a generative model. They contain no real user utterances. They are the only data that ever enters a model training batch.
---
The initial Terms of Service agreement, accepted during account creation, covers Layers 1, 2, and 3 only. Specifically, it covers: (a) retention of conversation content for up to 90 days for quality review and incident response, (b) generation of anonymized Sentinel observations for product analytics, and (c) maintenance of per-user memory to enable session continuity. This is the minimum necessary for Silent Infinity to function as described. A user who does not consent to these terms cannot use the product.
The initial ToS does not, and cannot, constitute consent for Layer 4 use. GDPR Article 7(4) and Recitals 32 and 43, together with the Article 29 Working Party's guidance on consent, point the same way: consent bundled into terms of service is not freely given when it covers processing beyond what is necessary for the core service. Layer 4 use is not necessary for the core service. It therefore requires its own consent moment.
The Layer 4 consent screen is a separate interaction, displayed after account creation and periodically surfaced (but never nagged) to users who have not yet chosen a tier. It presents four options in plain language:
Tier A — Memory Only (Default). My conversations are stored for my own continuity and are deleted after 90 days unless I change this setting. They are never used for analytics beyond basic usage statistics and never used to train any AI model. This is the default selection. No action is required to remain in Tier A.
Tier B — Analytics Contribution. In addition to Tier A, I agree that anonymized structural observations (engagement patterns, emotional register) may be used to improve Silent Infinity's product. No conversation text is used. This covers Layer 2 Sentinel observations being used in aggregate analytics and the RLHF reward model training described in Phase 2 of Section 8.
Tier C — Training Corpus Contribution. In addition to Tier B, I agree to donate my anonymized conversations to Silent Infinity's internal training corpus. My conversations will be stripped of all identifying information, protected with mathematical privacy guarantees, and used to generate synthetic training data for future versions of Silent Infinity's AI. I can revoke this at any time. This enables Layer 4 and Layer 5 data use.
Tier D — Open Research Contribution. In addition to Tier C, I agree to contribute to published, IRB-approved research on crisis detection and emotional AI, under data-use agreements with academic or clinical partners. This is the most permissive tier and requires the most deliberate action to select.
Every consent record has the following properties in the user_profile table: consent_tier (A, B, C, or D), consent_version (the version number of the consent language presented), consent_timestamp (UTC ISO-8601), and revocation_timestamp if applicable. When the consent language is updated (e.g., to cover a new training phase), all users below the new tier are presented with the updated screen. Prior consent is not carried forward to new consent language versions automatically.
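The consent record and its gating logic can be sketched as below. The field names come from the user_profile properties listed above; the fallback to Tier A on revocation or on a stale consent_version is our conservative reading of the policy and should be treated as an assumption, as should the helper names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

TIER_ORDER = "ABCD"  # A is most restrictive, D most permissive


@dataclass
class ConsentRecord:
    consent_tier: str = "A"                      # default tier
    consent_version: int = 1                     # version of the consent language shown
    consent_timestamp: Optional[str] = None      # UTC ISO-8601
    revocation_timestamp: Optional[str] = None   # set when consent is revoked


def now_utc_iso() -> str:
    return datetime.now(timezone.utc).isoformat()


def effective_tier(record: ConsentRecord, current_version: int) -> str:
    """Tier that may actually be acted on right now."""
    if record.revocation_timestamp is not None:
        return "A"  # revocation halts everything beyond Tier A
    if record.consent_version != current_version:
        return "A"  # prior consent is not carried forward automatically
    return record.consent_tier


def allows_layer4(record: ConsentRecord, current_version: int) -> bool:
    """Layer 4 and Layer 5 use requires an effective tier of C or D."""
    tier = effective_tier(record, current_version)
    return TIER_ORDER.index(tier) >= TIER_ORDER.index("C")
```

Evaluating the effective tier at use time, rather than trusting the stored tier, keeps a stale or revoked record from silently authorizing training use.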
Consent can be revoked at any time via the Settings panel under Privacy. Revocation immediately halts all further processing beyond Tier A. Any conversations already in the Layer 4 corpus are flagged for deletion and removed within 30 days. Removing records from Layer 4 requires using the pseudonym-escrow mapping to identify the relevant corpus records, staging them for deletion, and running a corpus integrity verification after deletion. If the Layer 5 synthetic data has already been generated from a corpus that included the revoking user's data, and the synthetic data cannot be re-generated from a cleaned corpus within 30 days, the affected training run is deprecated and not used.
The consent screen is written at a 7th-grade reading level. It is never shown during a session. It is never paired with a positive-affect prompt ("you're helping people like you"). The default is Tier A, not Tier B. The UI never pre-selects Tier C or higher. The benefits of higher tiers are described factually, not aspirationally. The revocation path is described on the same screen as the opt-in.
The full consent tier descriptions are published at /safety/privacy/data-use and versioned in the product changelog.
---
The PII stripping pipeline is a six-step sequential process applied to every conversation before it enters Layer 4. It runs on a dedicated processing service with no access to Layer 3 (per-user memory), ensuring that the pipeline cannot use prior knowledge about the user to reconstruct identity.
Pattern-matching rules remove the most structurally identifiable information. The patterns include: email addresses (RFC 5321-compliant regex), US and international phone numbers (including informal formats like "call me at five five five"), Social Security numbers and equivalents, physical addresses (street number + street name + city combinations), ZIP codes, credit card numbers, and common name patterns when preceded by first-person possessive pronouns or direct address (e.g., "I'm [Name]", "my name is [Name]", "hi, it's [Name]").
Name list matching uses the Social Security Administration's top 10,000 first names and a curated surname list of 100,000 common surnames. Exact matches within name-indicative contexts are replaced with [REDACTED_NAME].
Regex is fast but brittle. It catches high-frequency, structurally obvious PII but misses informal references, misspellings, and context-dependent identifiers. It is always followed by Steps 2 and 3.
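A deliberately simplified sketch of the regex step is shown below. The patterns are illustrative and far narrower than the production rule set (for instance, the email pattern is not a full RFC grammar and the phone pattern covers only common US formats). The name set stands in for the real name lists, and all replacement tokens other than [REDACTED_NAME] are hypothetical.

```python
import re

EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE = re.compile(r"(?<!\d)(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)")
# Name-indicative contexts followed by a capitalized word.
INTRO_NAME = re.compile(r"\b(I'm|I am|my name is|it's)\s+([A-Z][a-z]+)")

# Stand-in for the first-name and surname lists described above.
KNOWN_FIRST_NAMES = {"sarah", "john", "maria"}


def _redact_name(match: "re.Match[str]") -> str:
    intro, name = match.group(1), match.group(2)
    if name.lower() in KNOWN_FIRST_NAMES:
        return f"{intro} [REDACTED_NAME]"
    return match.group(0)  # capitalized word that is not a listed name


def regex_scrub(text: str) -> str:
    """Apply the structural patterns in a fixed order."""
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    text = SSN.sub("[REDACTED_SSN]", text)
    text = PHONE.sub("[REDACTED_PHONE]", text)
    return INTRO_NAME.sub(_redact_name, text)
```

Restricting name redaction to listed names inside name-indicative contexts keeps over-redaction low, at the cost of missing unlisted names; those are left for the later steps.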
A NER model trained on privacy-sensitive conversational text identifies residual named entities: people (PERSON), organizations (ORG), geopolitical entities at sub-state granularity (GPE), facilities (FAC), and products (PRODUCT) when used in an identifying context. The recommended implementation is Microsoft Presidio, which includes pre-built recognizers for 50+ entity types and allows custom recognizer injection.
Presidio is augmented with spaCy entity recognition using a fine-tuned transformer model (en_core_web_trf or equivalent) to improve recall on informal conversational text, which differs structurally from the newswire text on which most NER models are trained.
All PERSON, ORG, and sub-state GPE entities are replaced with category tokens: [PERSON_1], [ORGANIZATION], [CITY]. Where the same entity appears multiple times in a conversation, the same token is reused to preserve coherence.
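The token replacement can be sketched independently of the NER engine. The example below assumes the recognizer has already produced non-overlapping (start, end, label) spans; it does not call Presidio or spaCy, and the function name is hypothetical. People receive indexed tokens so that repeated mentions stay linked, while organizations and places receive the flat category tokens named above.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, entity label)

FLAT_TOKENS = {"ORG": "[ORGANIZATION]", "GPE": "[CITY]"}


def replace_entities(text: str, spans: List[Span]) -> str:
    """Replace entity spans with category tokens, reusing tokens per person."""
    # Number each distinct person in order of first appearance.
    person_ids: Dict[str, int] = {}
    for start, end, label in sorted(spans):
        if label == "PERSON":
            person_ids.setdefault(text[start:end].lower(), len(person_ids) + 1)

    redacted = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, label in sorted(spans, reverse=True):
        if label == "PERSON":
            token = f"[PERSON_{person_ids[text[start:end].lower()]}]"
        elif label in FLAT_TOKENS:
            token = FLAT_TOKENS[label]
        else:
            continue  # labels outside this sketch are left untouched
        redacted = redacted[:start] + token + redacted[end:]
    return redacted
```

Matching repeated mentions by lowercased surface form is a simplification; it would not link "Mary" to "Mary Jones", which a production version would need to handle through coreference or alias resolution.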
Steps 1 and 2 miss quasi-identifiers that are structurally ambiguous: "my wife Mary" after Mary has already been redacted still leaves "my wife" as a relational identifier; "I work at the only hospital in [CITY]" identifies a specific institution even without naming it; unusual phrasing like "my twin sister who moved to Vietnam last year" constitutes a rare combination that could identify a small number of people even without any single identifying field.
Step 3 uses a Claude Haiku-class model with a targeted system prompt that instructs it to identify and redact quasi-identifiers of this type. The system prompt explicitly instructs the model not to summarize, not to respond to the content of the conversation, and to return only the redacted version of the text. Redactions at this step use [QUASI_ID_REDACTED] tokens.
The LLM scrubbing step runs on a dedicated inference endpoint with no internet access and no logging of the conversations it processes. Audit of this step's performance is limited to the 1% manual review sample (Step 6).
Absolute dates present a re-identification risk because specific dates can anchor a conversation to a public event or a known timeline. A user who wrote "today is my birthday" on a specific calendar date, combined with other quasi-identifiers, could be identified.
Date shifting preserves relative time gaps while randomizing absolute dates. All dates in a conversation are shifted by the same random offset (uniformly sampled from [-365, +365] days), so references like "last week" and "three months ago" remain internally consistent while losing their absolute calendar position. Date shift offsets are never stored.
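A minimal sketch of the shift follows, assuming dates have already been extracted from the conversation as `date` objects; extraction and re-insertion into the text are out of scope here, and the function name is hypothetical.

```python
import random
from datetime import date, timedelta
from typing import List, Optional


def shift_dates(dates: List[date], rng: Optional[random.Random] = None) -> List[date]:
    """Shift every date in one conversation by the same random offset."""
    rng = rng or random.SystemRandom()
    # One offset per conversation, uniform over [-365, +365] days inclusive.
    offset = timedelta(days=rng.randint(-365, 365))
    # The offset is a local variable and is discarded on return, never stored.
    return [d + offset for d in dates]
```

Because a single offset is applied to the whole conversation, the gap between any two dates is unchanged. Note that a uniform draw can land on or near zero, leaving dates almost unshifted; whether to exclude small offsets is a design choice this sketch does not make.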
After entity removal and date shifting, residual quasi-identifiers are generalized to reduce specificity. Exact age references are mapped to decade buckets (e.g., "I'm 34" → "I'm in my 30s"). City references that survive NER (because they were part of a more complex construction) are resolved to state or region. Specific job titles are mapped to occupational categories using the BLS Standard Occupational Classification. Specific school names are replaced with education-level descriptors ("I go to [STATE_UNIVERSITY]").
Generalization tables are maintained in a versioned reference file and updated annually by the privacy auditor.
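The age generalization rule can be sketched as a single substitution. This is an illustrative pattern for first-person age statements only; city, occupation, and school generalization would use the versioned lookup tables rather than a regex, and the function name is hypothetical.

```python
import re

# First-person two-digit age, optionally followed by "years old".
# The lookahead skips percentages such as "I'm 50% done".
AGE = re.compile(r"\b(I'm|I am)\s+(\d{2})\b(?!\s*%)(?:\s+years?\s+old)?")


def generalize_age(text: str) -> str:
    """Map exact first-person ages to decade buckets."""

    def to_decade(match: "re.Match[str]") -> str:
        decade = int(match.group(2)) // 10 * 10
        return f"{match.group(1)} in my {decade}s"

    return AGE.sub(to_decade, text)
```

For example, "I'm 34 and exhausted" becomes "I'm in my 30s and exhausted". Third-person ages ("my son is 12") are not covered by this pattern and would fall to the LLM scrubbing step or additional rules.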
One percent of all conversations entering the anonymization pipeline are routed to a manual review queue. This queue is reviewed quarterly by HERALD (the clinical safety subsystem) and annually by the contracted external privacy auditor. The review assesses: false negative rate (identifiable information surviving the pipeline), over-redaction rate (content stripped that was not identifying, degrading utility), and coherence of the redacted output (whether the conversation still makes sense as a training example after redaction).
Findings from the audit are fed back as labeled examples to retrain the NER model and refine the Presidio recognizer configuration. The audit report is published as part of the annual transparency statement.
Three failure modes are documented and accepted as residual risk:
Stylometry. Research has demonstrated that writing style alone (sentence length distribution, vocabulary richness, punctuation patterns) can identify authors with surprising accuracy. A user with a very distinctive writing style may remain identifiable in their anonymized Layer 4 conversations even after all explicit identifiers are removed. Mitigation is partial: the synthetic data generation step (Layer 5) produces outputs that reflect the statistical distribution of the corpus rather than any individual's style, which weakens the stylometric link in the data that reaches training but does not remove it from Layer 4 itself.
Sarcasm and Irony. "I'm definitely not the kind of person who would say my name is John Smith" both names the user and defeats regex. The LLM scrubbing step has the highest chance of catching this, but it is not guaranteed.
Nicknames and Community-Specific Handles. A user who refers to their spouse by an unusual nickname used publicly on social media creates an indirect identifier not present in any name list. This is the quasi-identifier attack scenario analyzed by Aggarwal (2005) — even without a directly identifying field, the intersection of several unusual attributes can uniquely identify an individual in a large dataset. k-Anonymity enforcement (k≥10) provides the structural defense here.
---
Silent Infinity operates under a total differential privacy budget of ε = 1.0 per user per year. This is a strong privacy guarantee; for context, Apple's DP deployment used ε in the range of 4–8 for emoji usage, which has significantly lower sensitivity than mental health conversations. The comparison is indicative only, since Apple's figures are local-DP budgets and ours is a central-DP budget. Our lower budget reflects the sensitivity of the data category.
The annual budget is divided across three use classes:
| Use Class | Budget Allocated | Mechanism | Justification |
|-----------|-----------------|-----------|---------------|
| Aggregate analytics queries (Sentinel observations, session metrics, engagement distributions) | ε = 0.1 | Laplace mechanism on count queries | Low query volume, high utility value, moderate sensitivity |
| Training-data release (Layer 4 corpus preparation) | ε = 0.5 | Gaussian mechanism with privacy amplification via subsampling | Highest sensitivity use; noise must be sufficient to prevent membership inference |
| Ad-hoc research queries (product team, clinical advisor, approved research partners) | ε = 0.4 | Laplace mechanism, query audited before execution | Reserve for unplanned but important questions |
The total is exactly ε = 1.0. Once a user's annual budget is exhausted, no further DP queries are run on their data until the annual reset.
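The allocation and exhaustion rule can be sketched as a per-user ledger. The allocations mirror the table above and are summed under basic sequential composition; the class, exception, and use-class key names are hypothetical.

```python
from typing import Dict

# Annual epsilon allocation per use class, from the budget table.
ANNUAL_ALLOCATION: Dict[str, float] = {
    "analytics": 0.1,
    "training_release": 0.5,
    "ad_hoc_research": 0.4,
}


class BudgetExceeded(Exception):
    """Raised when a query would overspend a use-class allocation."""


class UserBudgetLedger:
    """Tracks one user's epsilon spend within the current year."""

    def __init__(self) -> None:
        self.spent = {name: 0.0 for name in ANNUAL_ALLOCATION}

    def remaining(self, use_class: str) -> float:
        return ANNUAL_ALLOCATION[use_class] - self.spent[use_class]

    def charge(self, use_class: str, epsilon: float) -> None:
        """Record a query's cost, refusing it if the allocation would be exceeded."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining(use_class) + 1e-12:  # float tolerance
            raise BudgetExceeded(f"{use_class} allocation exhausted")
        self.spent[use_class] += epsilon

    def annual_reset(self) -> None:
        """Discard any unused budget and start the new year at zero spend."""
        self.spent = {name: 0.0 for name in ANNUAL_ALLOCATION}
```

Charging before a query runs, and refusing the query on failure, is what makes the budget a hard limit rather than a reporting figure.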
For count queries (how many users disclosed topic X this month), the Laplace mechanism adds noise drawn from Laplace(0, Δf/ε) where Δf is the sensitivity of the query (typically 1 for counting queries). For continuous features (session length distributions, emotion arc shapes), the Gaussian mechanism adds noise drawn from N(0, σ²) where σ is calibrated to achieve (ε, δ)-DP with δ = 10⁻⁵.
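The two mechanisms can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, the Gaussian calibration uses the classic bound σ = Δf·√(2 ln(1.25/δ))/ε (valid only for 0 < ε < 1), and the document does not say which calibration the pipeline actually uses. Naive floating-point samplers like this one are known to leak information in adversarial settings, so a vetted DP library would be preferable in production.

```python
import math
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(true_count: int, epsilon: float, rng: random.Random,
             sensitivity: float = 1.0) -> float:
    """Laplace mechanism for a counting query: noise scale is sensitivity / epsilon."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def gaussian_sigma(epsilon: float, delta: float, sensitivity: float) -> float:
    """Noise standard deviation for (epsilon, delta)-DP via the classic bound."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("this calibration is only valid for 0 < epsilon < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


rng = random.Random(0)
noisy = dp_count(true_count=100, epsilon=0.1, rng=rng)
sigma = gaussian_sigma(epsilon=0.5, delta=1e-5, sensitivity=1.0)
```

With ε = 0.5, δ = 10⁻⁵, and unit sensitivity, the bound gives σ of roughly 9.7, which indicates how much noise a continuous feature with unit L2 sensitivity would carry.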
For the training-data release, privacy amplification via subsampling is applied: if the training algorithm accesses only a random fraction γ of the corpus in each iteration, the effective privacy cost per iteration is approximately γ × ε for small ε (the standard bound is ln(1 + γ(e^ε − 1))), allowing more training steps within the same total budget. The subsampling rate γ is set at 0.01 (1% of corpus per mini-batch).
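The per-iteration cost can be checked numerically. The sketch below uses the standard amplification-by-subsampling bound ε' = ln(1 + γ(e^ε − 1)); the function name is hypothetical, and treating the full ε = 0.5 training allocation as the per-step base cost is an assumption made purely for illustration.

```python
import math


def amplified_epsilon(epsilon: float, gamma: float) -> float:
    """Per-step epsilon after subsampling at rate gamma: ln(1 + gamma * (e^eps - 1))."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must be in (0, 1]")
    return math.log1p(gamma * math.expm1(epsilon))


exact = amplified_epsilon(epsilon=0.5, gamma=0.01)
linear = 0.01 * 0.5  # the first-order gamma * epsilon approximation
```

At ε = 0.5 the exact bound (about 0.0065) is noticeably larger than the linear figure of 0.005, so budgeting with γ × ε alone would understate the per-step cost. The two agree closely only when ε is small.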
The annual ε budget resets at the start of each calendar year. Unused budget is not carried forward. This prevents the accumulation of privacy debt — a scenario where a user with a low-activity year carries forward budget that is then spent in a high-activity year, providing a false sense of strong privacy protection across the lifetime record. The reset policy is conservative: any remaining budget at year-end is discarded.
The total ε consumed per year, broken down by use class, is published in the quarterly transparency report at /safety/privacy/transparency. This report does not reveal per-user budget consumption (which would itself be identifying), but provides aggregate statistics on the volume of DP queries run and the total privacy cost incurred across the user base.
---
Storage is organized into four tiers corresponding to data sensitivity, each with its own encryption regime, access control policy, and retention schedule.
Bronze Tier — Raw Conversations (Layer 1). Conversations are stored in an Amazon S3 bucket with Server-Side Encryption using AWS KMS Customer-Managed Keys (SSE-KMS with CMK). The bucket is not publicly accessible and has no pre-signed URL generation permissions. Access is gated by IAM policies to a single service account used exclusively by the Incident Command (IC) process, which is activated only by a triggered safety event or an explicit data export request. No human employee has direct read access to Bronze tier data during normal operations. Retention: 90 days, enforced by S3 Lifecycle rules. Object deletion is permanent (no versioning enabled on Bronze).
Silver Tier — Sentinel Observations (Layer 2). Observation records are stored in DynamoDB with DynamoDB-native encryption (AES-256 managed by AWS DDB service). Access is granted to the analytics service account and to the product team read-only IAM role. Retention: 2 years, enforced by DynamoDB TTL attribute. Silver tier data contains no raw text and poses limited re-identification risk, but is still access-controlled because the metadata (user uid, session timestamps, engagement patterns) constitutes personal data under GDPR Article 4(1).
Gold Tier — Aggregated Metrics. Fully aggregated statistics (weekly active users, mean session length, distribution of engagement levels across all sessions) with no per-user records. Stored in S3 as Parquet files, encrypted at rest. Unlimited retention. Accessible to all internal team members with standard IAM credentials.
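Training Corpus Tier — Layers 4 and 5. A dedicated S3 bucket with WORM (Write-Once-Read-Many) Object Lock enabled. Once written, records cannot be modified or deleted except through the defined deletion process (which requires the privacy officer role + the escrow mapping service + a 7-day deletion review period). The draft specifies Compliance mode, but in Compliance mode no principal, including the account root user, can delete a locked object version before its retention period expires, so the defined deletion process is only workable under Governance mode (with the bypass permission restricted to the privacy officer role) or with retention periods short enough to honor deletion requests. This choice is unresolved and must be settled before the bucket is provisioned. The bucket uses a dedicated KMS CMK separate from all other keys, with its own key policy restricting access to the ML training service account and the privacy officer account. All access is logged to CloudTrail with alerting on any access outside the approved service accounts.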
The DynamoDB tables innerverse-users and innerverse-conversations store Layer 3 per-user memory. Encryption is at-rest DynamoDB native. Per-user records are keyed by uid-hash (SHA-256 of the user's uid, never the uid directly). No global secondary index is maintained that could be used to enumerate all users' memory records by non-uid attributes. Access is granted only to the session-serving API account, scoped to the authenticated user's own records via IAM condition keys.
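The uid-hash key derivation described above amounts to the following. The function name is hypothetical, and UTF-8 encoding of the uid is an assumption; the source specifies only SHA-256 of the uid.

```python
import hashlib


def uid_hash(uid: str) -> str:
    """Return the hex SHA-256 digest used as the partition key instead of the raw uid."""
    return hashlib.sha256(uid.encode("utf-8")).hexdigest()


key = uid_hash("example-uid-123")  # illustrative uid, not a real identifier
```

One design consideration: an unkeyed hash of a guessable or enumerable uid can be reversed by hashing candidate uids, so a keyed construction (for example HMAC with a secret held in KMS) would be a stronger variant if uids are low-entropy.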
All KMS Customer-Managed Keys have automatic rotation enabled with a 365-day rotation period. Key rotation does not re-encrypt existing data but generates new key material used for all subsequent encryption operations. AWS KMS retains all prior key material for as long as the KMS key itself exists, so objects encrypted under earlier material remain decryptable; prior key material cannot be deleted individually and is removed only when the KMS key is deleted.
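The Bronze and Silver retention schedules described above could be expressed roughly as follows. This is a sketch, not the deployed configuration: the rule ID, the helper names, and the reading of "2 years" as 730 days are assumptions. `put_bucket_lifecycle_configuration` is the boto3 S3 client call for attaching lifecycle rules, and the client is passed in so the snippet does not need AWS credentials to load.

```python
from datetime import datetime, timedelta, timezone

# Bronze tier: expire raw conversation objects 90 days after creation.
BRONZE_LIFECYCLE = {
    "Rules": [
        {
            "ID": "bronze-expire-90d",   # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},    # empty prefix applies to every object
            "Expiration": {"Days": 90},
        }
    ]
}


def apply_bronze_lifecycle(s3_client, bucket: str) -> None:
    """Attach the 90-day expiration rule; s3_client is expected to be a boto3 S3 client."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=BRONZE_LIFECYCLE
    )


SILVER_RETENTION = timedelta(days=730)  # "2 years", ignoring leap days (assumption)


def silver_ttl_epoch(created_at: datetime) -> int:
    """Epoch-seconds value for the TTL attribute (DynamoDB expects a Number)."""
    if created_at.tzinfo is None:
        raise ValueError("created_at must be timezone-aware")
    return int((created_at + SILVER_RETENTION).timestamp())
```

DynamoDB TTL deletion is a background process and expired items may persist for a short time after the timestamp passes, so any read path that must respect the retention limit strictly would also need to filter on the TTL attribute.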
---
The training roadmap is organized into five phases. Each phase has explicit entry gates that must be cleared before any training run begins.
Entry gate (all phases): IRB-style ethics review by the clinical advisory board, external privacy audit of the data pipeline, review and sign-off by legal counsel, and publication of a transparency statement describing the training run before it begins. No phase begins without all four gates cleared.
In the current phase (Phase 1), no user data is used to train any model. Silent Infinity's voice, values, and capabilities are encoded entirely in the system prompt delivered to the underlying foundation model. This is the baseline. Improvements to the product in this phase come from prompt engineering, not training.
The value of this phase is that it produces no privacy surface: no data pipeline, no training corpus, no model artifacts that could contain user information. It also produces a clear baseline quality measurement — the "prompt-only" model — against which future trained versions will be evaluated.
In Phase 2, the Sentinel observation data (Layer 2) is used to train a reward model that learns what "good" responses look like according to the quality dimensions Chat Sentinel measures: deepening engagement, appropriate emotional validation, safe handling of high-distress content. This reward model is then used in a Proximal Policy Optimization (PPO) loop to fine-tune the base model's response generation.
No raw conversation text is used in this phase. The reward model learns from structural quality signals only. This makes Phase 2 the lowest-risk training phase from a privacy standpoint, and the most defensible to a clinical advisor: we are teaching the model to score like our quality rubric, without exposing what users actually said.
Phase 3 introduces the Layer 5 synthetic data corpus. The generator trained on the consented, anonymized Layer 4 corpus produces synthetic conversations that are statistically representative of the kinds of exchanges that occur on Silent Infinity but contain no real user utterances. The fine-tuning run uses supervised learning on this synthetic corpus followed by RLHF alignment using the Phase 2 reward model.
The whole-user holdout policy is enforced: users whose data contributed to Layer 4 are excluded from the active user base during the fine-tuning run and re-admitted after the model has passed the blind quality review. This prevents the model from being evaluated by users whose data it may have memorized.
Phase 4 distills the capability gains from Phase 3 into a smaller, open-weight base model that Silent Infinity can own and operate without API dependency on a foundation model provider. Knowledge distillation transfers the teacher model's behavior into the student model without transferring training data — the student learns from the teacher's outputs on open-corpus inputs, not from any user conversation or synthetic derivative.
The resulting model inherits Silent Infinity's voice and quality patterns without inheriting a single user record, synthetic or otherwise. It is trained on open corpora (books, public transcripts, licensed therapy manuals) and behavioral distillation from the Phase 3 model.
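To make the Phase 4 mechanism concrete, the sketch below shows one common distillation objective: the KL divergence between temperature-softened teacher and student output distributions. The source does not specify which distillation loss will be used, so this formulation, the function names, and the temperature value are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# The student only ever sees teacher outputs on open-corpus inputs.
loss = distillation_loss([2.0, 1.0, 0.1], [1.5, 1.2, 0.3])
```

The inputs to both models in this loop are drawn from open corpora, which is what keeps user records and their synthetic derivatives out of the student's training signal.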
Phase 5 is contingent on scale that does not yet exist and decisions that have not been made. It is documented here as a directional commitment, not an operating plan. Pre-training would use open corpora, licensed health communication datasets, published wisdom traditions, and behavioral distillation from Phase 4 models. User-derived data would not appear in pre-training. Fine-tuning would follow the Phase 3 methodology.
---
There is a real cost to anonymization: redacted conversations are less useful as training data than the originals. A sentence with [REDACTED_NAME] is grammatically coherent but loses pragmatic information about relational dynamics. The date-shifted conversation loses absolute temporal anchoring. The challenge is to strip enough to prevent re-identification while retaining enough to preserve the structural and emotional patterns that make the conversation valuable.
The key insight is that what matters for training a better emotional AI is structure, not content. The arc of a deepening conversation (how trust builds, how disclosure escalates, how emotional tone shifts across turns) is largely preserved by the anonymization pipeline, even though some pragmatic detail is lost. The specific content (names, places, events) is largely irrelevant to the model's learning about how to respond to emotional escalation. Our anonymization strategy is calibrated to this: we are aggressive about content-level identifiers and conservative about structural features.
Three evaluation instruments are used to verify that model revisions deliver quality gains without degrading safety:
The Golden 50-Conversation Script. Fifty canonical conversations representing the full spectrum of Silent Infinity use cases — from routine reflection to high-distress crisis disclosure — are maintained as a fixed evaluation set. Every model revision is evaluated on this set by the clinical advisor. A revision that scores lower than the previous model on any crisis-handling category does not advance.
The 22-Turn Crisis Test. A purpose-built evaluation that simulates a user escalating from general distress to active suicidal ideation across 22 turns. The model must demonstrate appropriate escalation response, safety resource provision, and boundary maintenance throughout. This test was designed with clinical advisor input and is treated as a pass/fail gate: a model that fails any safety checkpoint in this test is rejected.
100-Conversation Blind Review. A sample of 100 conversations generated by the new model against standardized user prompts is reviewed without model attribution by the clinical advisor and two external reviewers. Reviewers score on clinical appropriateness, emotional accuracy, boundary maintenance, and voice consistency. Scores are compared to the previous model version.
All new models progress through the Feature Readiness Standard Switch Gate: offline evaluation (all three benchmarks above), then shadow deployment on 10% of traffic with no user-facing impact, then a staged A/B rollout at 10% and 50% of traffic followed by full rollout at 100%, with monitoring on session length, Sentinel quality scores, and safety flag rates at each stage. A model that degrades Sentinel quality scores by more than 5% relative to the prior version at any stage is rolled back immediately.
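The two numeric gates above (no regression on any crisis-handling category, and rollback on a relative Sentinel score drop of more than 5%) can be sketched as simple checks. The function names, the dictionary shape, and the category labels are hypothetical.

```python
def advances_on_crisis(previous: dict, candidate: dict) -> bool:
    """True only if the candidate matches or beats the prior model in every category."""
    return all(candidate[category] >= previous[category] for category in previous)


def should_roll_back(prior_score: float, candidate_score: float,
                     max_relative_drop: float = 0.05) -> bool:
    """True if the candidate's score fell by more than the allowed relative fraction."""
    if prior_score <= 0:
        raise ValueError("prior_score must be positive")
    relative_drop = (prior_score - candidate_score) / prior_score
    return relative_drop > max_relative_drop


previous = {"escalation": 4.2, "resource_provision": 4.5}   # illustrative scores
candidate = {"escalation": 4.3, "resource_provision": 4.4}
blocked = not advances_on_crisis(previous, candidate)
```

In this illustration the candidate improves on one category but regresses on the other, so it does not advance. The rollback check is evaluated separately at each rollout stage.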
---
Silent Infinity will not sell user data, anonymized user data, synthetic data derived from user data, or any statistical model trained on user data to any third party for any commercial purpose. This commitment is unconditional and applies regardless of financial pressure, acquisition interest, or changes in company ownership. The commitment is written into the user-facing Terms of Service as a contractual obligation, not merely an internal policy, so that users have legal standing to enforce it.
License the fine-tuned model. The distilled model produced in Phase 4 (Section 8) is a product artifact that Silent Infinity owns and may license to third parties — healthcare providers, employee wellness platforms, clinical research tools — under strict terms that prohibit further training on user data, prohibit re-identification attempts, and require the licensee to maintain equivalent privacy governance. The model, at this stage, does not contain any user data; it contains learned behavioral patterns. Licensing it is equivalent to licensing a tool, not selling data.
Partner with clinical research institutions. We may enter data-use agreements with IRB-governed academic or clinical research institutions for specific, bounded research questions — crisis detection algorithm validation, emotional AI safety studies — using Tier D consented data. These agreements require IRB approval, independent ethics review, defined data retention limits, and no commercial use of the transferred data. No payment is made for the data transfer itself; collaboration is structured as research partnership.
The core revenue model is subscription-based, as documented in the BUSINESS-MODEL-PRICING memo. The data we collect creates a proprietary training advantage — a model fine-tuned on consent-derived Silent Infinity data will outperform a generic foundation model on Silent Infinity's specific use case. That advantage is the moat. It does not require monetizing the data directly; it requires protecting and developing it responsibly so that it produces a better product than any competitor without access to it.
---
Risk. As demonstrated by Carlini et al. (2021), LLMs trained on raw text can reproduce verbatim passages from their training data under adversarial prompting. If any user conversation content reaches a training batch, a future adversarial user could potentially extract it.
Mitigation. The architectural defense is that no raw conversation ever enters a training batch; only Layer 5 synthetic data does. The operational defense is whole-user holdout: users whose data contributed to the training corpus are excluded from the deployed model's user base during the fine-tuning run. The verification defense is membership inference auditing after each training run: we apply a standard membership inference attack (shadow model approach) to probe whether the trained model distinguishes training records from held-out records at rates above chance. On an evaluation set balanced between training and held-out records, chance accuracy is 50%; if the attack's accuracy is statistically significantly above that level, the training run is treated as a privacy incident. A user may request deletion of their data from any trained model; this requires retraining without their records, which is operationally expensive and is therefore a strong incentive to get anonymization right rather than relying on post-hoc deletion.
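One way to operationalize "significantly above chance" is a one-sided z-test of the attack's accuracy against 0.5 on a balanced evaluation set. The sketch below is an assumption about how the check might look: the source does not name a test or a significance level, and the function name and the 1% threshold are hypothetical.

```python
import math


def mia_audit(correct: int, total: int, z_threshold: float = 2.326) -> dict:
    """One-sided z-test of attack accuracy against chance (0.5) on a balanced set.

    z_threshold = 2.326 corresponds to a one-sided significance level of about 1%.
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    accuracy = correct / total
    standard_error = math.sqrt(0.25 / total)  # std. error of a proportion under p = 0.5
    z = (accuracy - 0.5) / standard_error
    return {"accuracy": accuracy, "z": z, "incident": z > z_threshold}


result = mia_audit(correct=520, total=1000)
```

With 520 correct guesses out of 1000, the accuracy of 52% is not distinguishable from chance at this threshold. The same accuracy on a much larger evaluation set would be, which is why the sample size should be fixed in the audit protocol in advance.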
Risk. Narayanan and Shmatikov (2008) showed that anonymous records can be re-identified using external auxiliary data. A sophisticated adversary with access to a user's public social media posts could potentially match writing style, topic patterns, or temporal references against the anonymized corpus.
Mitigation. k-Anonymity enforcement (k≥10) ensures that every record shares its quasi-identifier values with at least nine others, so no record is uniquely identifiable by quasi-identifiers alone. Differential privacy provides a mathematical guarantee that released aggregate statistics are ε-indistinguishable from those that would have been produced without any one user's data, and this guarantee holds regardless of the adversary's auxiliary data. The synthetic data layer (Layer 5) breaks the direct record-to-person link entirely, since synthetic records have no real-world counterpart. Quarterly adversarial re-identification audits are run by the privacy auditor using the methodology from Narayanan-Shmatikov against a sample of the Layer 4 corpus.
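The k-anonymity condition can be checked mechanically by counting how many records share each combination of quasi-identifier values. The sketch below assumes records are dictionaries; the function name and the column names (`age_band`, `region`) are hypothetical and not taken from the Layer 4 schema.

```python
from collections import Counter


def violating_groups(records, quasi_identifiers, k=10):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


records = (
    [{"age_band": "30-39", "region": "NW"}] * 10
    + [{"age_band": "40-49", "region": "SE"}] * 3
)
too_small = violating_groups(records, ["age_band", "region"], k=10)
```

Here the second group has only three members, so those records would need further generalization or suppression before release.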
Risk. If the consent UI is too frequent, too optimistic, or defaults to higher tiers, users will click through to Tier C or D without understanding what they are consenting to. Consent under these conditions is not meaningful and may not be legally valid under GDPR Article 7.
Mitigation. Tier A is the permanent default. The consent screen is shown once at account creation and then surfaced only if the user navigates to Privacy Settings. It is never shown during a session. The copy is written at 7th-grade reading level and reviewed by a plain-language specialist. No positive-affect framing ("help people like you") is used. The revocation path is as prominent as the opt-in path. The consent screen is A/B tested for comprehension, not for conversion.
Risk. Silent Infinity currently operates in the wellness category, not the clinical category. HIPAA does not currently apply. However, if the product expands to clinical use cases, integrates with provider-side records, or becomes the subject of enforcement action by a state attorney general, the regulatory environment could change rapidly.
Mitigation. Annual legal review of the applicable regulatory landscape by external counsel with GDPR, CCPA, and HIPAA expertise. A documented "clinical expansion checklist" is maintained that specifies the additional compliance requirements (Business Associate Agreements, the HIPAA minimum-necessary standard, expanded DPIA under GDPR Article 35) that would be triggered by each hypothetical expansion scenario. The current architecture is designed to be HIPAA-upgradeable without replatforming: SSE-KMS encryption, audit logging, and access controls are designed to align with the technical safeguard requirements of the HIPAA Security Rule, subject to confirmation by counsel and audit before any clinical expansion.
Risk. Silent Infinity may open-source components of its crisis detection system (as noted in the product roadmap). If training data inadvertently contaminates the open-source codebase, the privacy guarantees of this document are broken.
Mitigation. Strict architectural separation: the crisis detection module is pattern-rule-based, not model-trained. Its logic is authored by humans and contains no user-derived statistical patterns. The open-source repository has no read access to any S3 bucket that holds conversation data. CI/CD pipelines for the open-source repo are isolated from the private infrastructure accounts. Any proposal to train the crisis detection module on conversation data would require a full pipeline review and separate consent tier.
---
Silent Infinity commits to publishing the following information on an ongoing basis at /safety/privacy/data-use and /safety/privacy/transparency.
Training phase status. The current phase of training (1 through 5, as defined in Section 8) is published and updated whenever a new phase begins. Users can know, at any time, whether and how their data is being used for model training.
ε budget consumption. The total differential privacy budget consumed per year, broken down by use class (analytics, training-data release, research queries), is published in the quarterly transparency report. Individual user budget consumption is not published (that would be identifying), but the aggregate figures allow technical reviewers to verify that our budget claims are consistent with our query volumes.
Model provenance. The model ID currently serving Silent Infinity, its training data provenance (which phase, which corpus versions, which synthetic data generation run), and the date it was deployed are published and updated on model change. Users can know, in technical terms, what trained the model they are talking to.
Data export. Any user may request a machine-readable JSON export of all their data held in Layers 1, 2, and 3 at any time. The export is generated and delivered within 30 days. The export includes: all conversations retained in Bronze tier (up to 90 days), all Sentinel observations linked to their account, and the full contents of their per-user memory. The export format is documented at /safety/privacy/data-export.
Annual external privacy audit. An external privacy auditor (firm to be selected per Section 13) conducts an annual audit of the full data pipeline, the anonymization methodology, the consent architecture, and the differential privacy implementation. The audit findings, including any remediation items, are published in full. No findings are redacted for commercial sensitivity.
---
The following items are policy gaps that require founder-level decisions before they can be resolved. They are listed here for visibility rather than deferred silently.
When to enable the Layer 4 opt-in consent screen. The Layer 4 consent architecture is fully specified in this document but has not been implemented in the product. The decision point is tied to SSO launch, after which users have stable, authenticated identities to which consent records can be durably attached. Estimated timeline: Q3 2026, contingent on SSO completion. Decision needed: confirm or adjust the activation trigger.
External privacy audit budget. Credible external privacy audit firms with health-data and AI experience (OneTrust, KPMG Privacy, Nymity, or boutique academic spin-offs) charge approximately $15,000–$30,000 per annual engagement. This should be budgeted as a compliance operating cost beginning in the fiscal year that Phase 2 training begins. Decision needed: budget allocation and preferred vendor selection process.
ML engineer and clinical ethicist consultants. Phase 2 training (Q4 2026) requires ML engineering capacity to implement the RLHF pipeline and a clinical ethicist to participate in the IRB-style review. Neither is currently on the team. Decision needed: hire, contract, or academic partnership for each role, and timeline for sourcing.
IRB partner selection. Phase 3 training (Q2 2027) requires IRB review. Candidate partners are WCG IRB (commercial, fast turnaround, experienced with digital health), Advarra (commercial, strong clinical trial background), or an academic partnership with a university IRB (slower but adds research credibility and potential co-authorship on any published findings). Decision needed: preferred IRB track and outreach timeline.
---
Aggarwal, C.C. (2005). On k-anonymity and the curse of dimensionality. Proceedings of the 31st International Conference on Very Large Data Bases (VLDB 2005), pp. 901–909.
Anthropic. (2022). Constitutional AI: Harmlessness from AI feedback. arXiv preprint arXiv:2212.08073.
Apple Differential Privacy Team. (2017). Learning with Privacy at Scale. Apple Machine Learning Journal, 1(8).
Carlini, N., Tramèr, F., Wallace, E., Jagielski, M., Herbert-Voss, A., Lee, K., Roberts, A., Brown, T., Song, D., Erlingsson, Ú., Oprea, A., & Raffel, C. (2021). Extracting training data from large language models. Proceedings of the 30th USENIX Security Symposium, pp. 2633–2650.
Cavoukian, A. (2010). Privacy by Design: The 7 Foundational Principles. Information and Privacy Commissioner of Ontario.
Dwork, C. (2006). Differential privacy. In Proceedings of the 33rd International Colloquium on Automata, Languages and Programming (ICALP 2006), Lecture Notes in Computer Science vol. 4052, pp. 1–12. Springer.
European Parliament and Council of the European Union. (2016). Regulation (EU) 2016/679 (General Data Protection Regulation). Articles 4, 6, 9, 22, 32, 35. Official Journal of the European Union.
Geiping, J., Bauermeister, H., Dröge, H., & Moeller, M. (2020). Inverting gradients — How easy is it to break privacy in federated learning? Advances in Neural Information Processing Systems (NeurIPS 2020).
Jordon, J., Szpruch, L., Houssiau, F., Bottarelli, M., Cherubin, G., Maple, C., Cohen, S.N., & Weller, A. (2022). Synthetic data -- what, why and how? arXiv preprint arXiv:2205.03257.
Li, N., Li, T., & Venkatasubramanian, S. (2007). t-Closeness: Privacy beyond k-anonymity and l-diversity. Proceedings of the 23rd IEEE International Conference on Data Engineering (ICDE 2007), pp. 106–115.
Machanavajjhala, A., Kifer, D., Gehrke, J., & Venkitasubramaniam, M. (2007). l-Diversity: Privacy beyond k-anonymity. ACM Transactions on Knowledge Discovery from Data (TKDD), 1(1), article 3.
McMahan, H.B., Moore, E., Ramage, D., Hampson, S., & Agüera y Arcas, B. (2017). Communication-efficient learning of deep networks from decentralized data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017).
Narayanan, A., & Shmatikov, V. (2008). Robust de-anonymization of large sparse datasets. Proceedings of the 2008 IEEE Symposium on Security and Privacy, pp. 111–125.
State of California. (2018). California Consumer Privacy Act (CCPA), Civil Code §1798.100–.199, as amended by the California Privacy Rights Act (CPRA) 2020.
Sweeney, L. (2002). k-Anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5), 557–570.
Tucker, A., Wang, Z., Rotalinti, Y., & Myles, P. (2020). Generating high-fidelity synthetic patient data for assessing machine learning healthcare software. NPJ Digital Medicine, 3(1), 147.
---
Document ends. Total estimated word count: 5,800. This document is version-controlled. Next scheduled review: 2026-10-21 (6-month) or upon any Phase transition, whichever comes first.