Date: 2026-04-26
Author: SCOUT
Audience: Harnoor — personal portfolio / learning signal
Target: 8–12 ideas, each shippable in 1 week of evening work
---
These ideas are biased toward Harnoor's actual life: data engineering work, TITAN OS building, shadow work / journaling practice, socializing at Starbucks, and general curiosity about where agentic AI is heading. Each idea teaches a distinct architectural pattern. Each produces a shareable artifact. None requires quitting your day job to finish.
---
Idea 01: Second Brain Search
What it does:
You dump your Obsidian vault, journal entries, or exported Notion pages into a pipeline that chunks, embeds, and stores them in Qdrant. A simple Streamlit or Gradio UI lets you ask questions like "what did I think about setting boundaries last November?" or "summarize every note I wrote about focus." The trick is hybrid retrieval: BM25 keyword search combined with dense vector search, followed by Cohere Rerank (or a local BGE cross-encoder) as a second stage that narrows the top 20 chunks to the top 5 before they are passed to GPT-4.1 or Claude 3.5 Sonnet. As a result, vague, emotionally toned queries that would normally return noise are more likely to surface the right journal entry.
Tech Stack:
- text-embedding-3-small or nomic-embed-text for embeddings
- Cohere Rerank (rerank-english-v3.0) or local BAAI/bge-reranker-base via sentence-transformers
1-Week Scope:
Demo Angle:
Side-by-side video: same 5 queries, naive RAG vs. reranked RAG, citations shown. The accuracy gap is visually obvious. Tweet with the eval spreadsheet screenshot.
Why it teaches an agentic pattern:
RAG + Rerank is the foundational "retrieve broadly, rank precisely" pattern. Understanding why reranking exists, and how a cross-encoder scores the query and document jointly while a bi-encoder embeds them separately and compares them with a dot product, is the foundation most production RAG systems are built on. The eval step teaches LLM-as-judge methodology.
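A minimal sketch of "retrieve broadly, rank precisely," under stated assumptions: the memo does not say how the BM25 and dense result lists are merged, so reciprocal rank fusion is used here as one common option, and `word_overlap` is a toy stand-in for a real cross-encoder such as Cohere Rerank or a BGE reranker. All function names are hypothetical.

```python
from typing import Callable


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk ids into one ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # A chunk ranked highly by either retriever earns a larger share.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def retrieve_then_rerank(
    query: str,
    bm25_ranking: list[str],
    dense_ranking: list[str],
    chunks: dict[str, str],
    score_pair: Callable[[str, str], float],
    broad_k: int = 20,
    final_k: int = 5,
) -> list[str]:
    """Stage 1: fuse and keep broad_k candidates. Stage 2: rescore, keep final_k."""
    candidates = reciprocal_rank_fusion([bm25_ranking, dense_ranking])[:broad_k]
    reranked = sorted(
        candidates,
        key=lambda cid: score_pair(query, chunks[cid]),
        reverse=True,
    )
    return reranked[:final_k]


def word_overlap(query: str, text: str) -> float:
    """Toy stand-in for a cross-encoder: counts shared lowercase words."""
    return float(len(set(query.lower().split()) & set(text.lower().split())))
```

The second stage sees the query and the chunk text together, which is the property that separates a reranker from the first-stage retrievers; swapping `word_overlap` for a real model should not require changing the surrounding code.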
Monetization angle:
Wrap it as a $10/month personal knowledge SaaS for Obsidian/Notion power users. The core pain (search that understands context, not just keywords) is real and underserved.
---
Idea 02: Date Planner
What it does:
A tool-calling agent that acts as a light date / hangout planner. You tell it a city, a vibe, and a budget. It calls a Yelp Fusion or Foursquare Places API to get real venue options, a weather API for the weekend forecast, and optionally Google Calendar to check your free windows. It then writes out a 3-option plan with specific venues, time slots, estimated costs, and a short narrative rationale for each. No hallucinated place names — every venue comes from a live API response. The agent uses ReAct-style reasoning: Thought → Action (tool call) → Observation → next Thought loop until a plan is assembled.
Tech Stack:
1-Week Scope:
Demo Angle:
Live screen recording: type "coffee and evening walk for two, Edmonton, Saturday, ~$30" → agent reasons aloud via streaming output → returns three specific plans with real venues. Post the video. The narrated reasoning trace is the hook.
Why it teaches an agentic pattern:
This is pure tool-use / ReAct. You build intuition for how the model decides when to call a tool vs. answer from memory, how tool outputs get injected back into context, and what happens when a tool errors mid-chain. These are among the first things that break in a production agent.
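The Thought → Action → Observation loop can be sketched without any LLM SDK. In this hedged example, `policy` stands in for the model call and `tools` for the venue and weather APIs; both, along with the step dictionary shape, are assumptions made for illustration. A tool exception is fed back as an observation so the policy can react to it rather than crashing the run.

```python
from typing import Callable

Tool = Callable[[dict], dict]
Policy = Callable[[str, list[dict]], dict]


def run_react(policy: Policy, tools: dict[str, Tool], goal: str, max_steps: int = 6):
    """Run a ReAct-style loop until the policy returns a final answer.

    The policy returns either
      {"thought": str, "action": str, "args": dict}  to request a tool call, or
      {"thought": str, "final": object}              to stop.
    """
    trace: list[dict] = []
    for _ in range(max_steps):
        step = policy(goal, trace)
        if "final" in step:
            trace.append(step)
            return step["final"], trace
        try:
            observation = tools[step["action"]](step["args"])
        except Exception as exc:
            # Tool failures (including unknown tool names) become observations.
            observation = {"error": str(exc)}
        trace.append({**step, "observation": observation})
    raise RuntimeError("no final answer within max_steps")
```

Because every venue in the final plan must come from an observation in `trace`, the trace doubles as an audit log for the "no hallucinated place names" requirement.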
Monetization angle:
Productize as a "date night planner" consumer app. Charge $3/plan or $5/month subscription. The novelty is specificity (real venues + weather) vs. generic ChatGPT advice.
---
Idea 03: Nightly Reflection
What it does:
An n8n workflow fires every night at 10 PM. It pulls your calendar events from Google Calendar for that day (what meetings happened, what commitments you made), sends them plus a rotating Jungian shadow work prompt to Claude via the Anthropic node, and Claude writes a personalized evening reflection question set tailored to that specific day. The questions and any notes you write back (via a simple Telegram bot reply or email reply) get stored in Notion. Once a week, a second n8n workflow runs a "weekly pattern synthesis" — it reads the last 7 days of Notion entries, asks Claude to identify emotional patterns, avoidance behaviors, or recurring tensions, and emails you a Sunday digest. This closes the loop between daily behavior and long-range self-knowledge.
Tech Stack:
1-Week Scope:
Demo Angle:
Screenshot series of a real week's Notion database + Sunday digest email. Blur anything sensitive. The X post angle: "I automated my shadow work journaling practice — here's what the AI noticed about my week."
Why it teaches an agentic pattern:
This is the event-trigger → context-gather → LLM-generate → action-write pattern, the backbone of most personal AI automations. It also introduces scheduled multi-step n8n orchestration and teaches the difference between stateless generation (nightly prompt) and stateful synthesis (weekly digest reads prior outputs as context).
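The stateless vs. stateful distinction shows up in how each prompt is assembled. The sketch below is written in plain Python rather than as n8n nodes, and the entry shape (`date`, `note`) and prompt wording are assumptions, not the memo's actual Notion schema.

```python
from datetime import date, timedelta


def nightly_prompt(events: list[str], shadow_prompt: str) -> str:
    """Stateless: built only from today's calendar events and tonight's theme."""
    agenda = "\n".join(f"- {event}" for event in events) or "- (no events)"
    return (
        f"Today's events:\n{agenda}\n\n"
        f"Reflection theme: {shadow_prompt}\n"
        "Write 3 reflection questions."
    )


def weekly_prompt(entries: list[dict], today: date, window_days: int = 7) -> str:
    """Stateful: built from entries that earlier nightly runs stored."""
    cutoff = today - timedelta(days=window_days)
    recent = sorted(
        (e for e in entries if cutoff < e["date"] <= today),
        key=lambda e: e["date"],
    )
    body = "\n".join(f"{e['date'].isoformat()}: {e['note']}" for e in recent)
    return (
        f"Entries from the last {window_days} days:\n{body}\n\n"
        "Identify recurring patterns."
    )
```

The nightly function can be rerun at any time with the same inputs and give the same prompt; the weekly one depends on what has been written to storage, which is why its failures tend to be data problems rather than prompt problems.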
Monetization angle:
Package as a $9/month "AI journaling coach" SaaS. The differentiation is personal (your actual calendar) vs. generic journaling apps. Shadow work + Jungian angle has a passionate niche audience.
---
Idea 04: Job Intel Agent
What it does:
You paste in a job description for any data engineering or AI role. A Planner agent (GPT-4.1) reads the JD and produces a structured research plan: what company facts to gather, what technologies appear that you should verify, what competitive info matters, what questions you might be asked. It then spawns three Executor sub-agents in parallel — one scrapes the company's engineering blog via LlamaIndex WebPageReader and builds a mini-RAG over it, one searches LinkedIn for recent team growth signals, one checks the tech stack against Stack Overflow trends. A Synthesizer agent combines all outputs and produces a two-page "interview prep brief." The entire pipeline is driven by LangGraph with explicit state nodes for planning, parallel execution, and synthesis.
Tech Stack:
1-Week Scope:
Demo Angle:
Video: paste a data engineering JD at a recognizable company → 90 seconds later → 2-page interview brief with real facts. Post on LinkedIn "Built a multi-agent job intel tool over a weekend." Tag it with the LangGraph state graph screenshot — those get engagement.
Why it teaches an agentic pattern:
This is the Planner + Parallel Executor pattern — the highest-leverage agentic architecture for research tasks. You learn how to decompose a goal into parallel branches, manage state across agent hand-offs, and handle partial failures (one executor fails, others proceed).
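The fan-out and partial-failure handling can be sketched with plain `asyncio` before committing to LangGraph; this is a swapped-in simplification, not LangGraph's API. The executor names, the state dictionary layout, and `synthesize` are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable

Executor = Callable[[dict], Awaitable[str]]


async def fan_out(plan: dict, executors: dict[str, Executor]) -> dict:
    """Run every executor concurrently and record successes and failures."""
    names = list(executors)
    results = await asyncio.gather(
        *(executors[name](plan) for name in names),
        return_exceptions=True,  # one failing branch must not cancel the others
    )
    state: dict = {"plan": plan, "findings": {}, "failures": {}}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            state["failures"][name] = repr(result)
        else:
            state["findings"][name] = result
    return state


def synthesize(state: dict) -> str:
    """Build the brief from whatever succeeded and flag what is missing."""
    lines = [f"{name}: {text}" for name, text in sorted(state["findings"].items())]
    if state["failures"]:
        lines.append("Missing sections: " + ", ".join(sorted(state["failures"])))
    return "\n".join(lines)
```

Recording failures in state, rather than raising, lets the synthesizer produce a brief that is explicit about which research branch is absent.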
Monetization angle:
Sell as a $19 one-time "instant interview brief" tool. Target job-seekers on Reddit (r/cscareerquestions). High willingness to pay in that context.
---
Idea 05: Social Radar MCP
What it does:
A custom MCP server that exposes three tools to any Claude or Cursor session: find_third_place, recall_social_context, and suggest_opener. It stores your own notes about places you frequent (vibe, crowd type, best times, people you've met) in a pgvector collection. When you invoke find_third_place("quiet + social Starbucks Edmonton midday") it does semantic search over your personal notes + enriches with live Foursquare data. When you invoke recall_social_context("name: Jordan, met at Whyte Ave") it surfaces anything you've saved about that person. The suggest_opener tool uses your recalled context to generate a contextually appropriate opening line for reconnecting. This is less about automation and more about building and exposing your own personal knowledge graph as MCP tools.
Tech Stack:
- text-embedding-3-small for embeddings
1-Week Scope:
- find_third_place tool: embed query → pgvector search → Foursquare enrichment
- recall_social_context tool: ingest and query personal people/place notes
- suggest_opener tool: context assembly → LLM generation
Demo Angle:
GIF in Claude Desktop: type find_third_place("calm, good for focused conversations, Saturday afternoon") → server returns ranked places with your personal notes + live reviews. The MCP server is the artifact — post it to GitHub and the MCP directory.
Why it teaches an agentic pattern:
Building an MCP server teaches you how AI tools are exposed and consumed, which is the plumbing underneath Claude and Cursor integrations. This is the most forward-looking skill in the list: MCP is becoming a common interface layer for agentic tool ecosystems in 2026.
Monetization angle:
Not directly — but MCP server authorship is a strong portfolio signal. Open-source it with a "Pro" version that syncs across devices via a hosted API.
---
Idea 06: Data Pipeline Copilot
What it does:
You build a RAG assistant over your company's or your own data engineering documentation: dbt model docs, Spark job READMEs, Airflow DAG comments, internal runbooks. Nothing is trained; the documents are indexed and retrieved at query time. It answers questions like "what does this DAG do on failure?" or "which dbt model is the source of truth for revenue?" The key differentiator is that you also build a proper LLM-as-judge eval harness: a golden test set of 30 (question, expected-answer) pairs, and an automated eval script that scores every retrieval change against it. When you add a new doc or tweak chunking, you immediately see whether accuracy went up or down. This makes it a real engineering project, not just a RAG demo.
Tech Stack:
- text-embedding-3-small or nomic-embed-text
1-Week Scope:
Demo Angle:
Post the eval results chart showing how accuracy changed across 3 iterations (for example, from 60% to 85%). The narrative: "RAG without evals is vibes. Here's what systematic eval looks like." This is the most credible data engineering portfolio signal in the list.
Why it teaches an agentic pattern:
Evals are the pattern most people skip and most hiring managers care about. This teaches systematic LLM-as-judge scoring, the meaning of faithfulness vs. correctness, and how to treat RAG as an engineering discipline rather than a magic trick.
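A minimal shape for the eval harness, assuming a golden set of (question, expected answer) pairs. The judge here is a token-overlap placeholder so the sketch runs offline; in the real project it would be replaced by an LLM-as-judge call. `evaluate`, `token_overlap_judge`, and the 0.5 pass threshold are all hypothetical choices.

```python
from typing import Callable

GoldenSet = list[tuple[str, str]]
Judge = Callable[[str, str], float]


def token_overlap_judge(expected: str, actual: str) -> float:
    """Placeholder judge: fraction of expected tokens present in the answer."""
    expected_tokens = set(expected.lower().split())
    actual_tokens = set(actual.lower().split())
    if not expected_tokens:
        return 0.0
    return len(expected_tokens & actual_tokens) / len(expected_tokens)


def evaluate(
    answer_fn: Callable[[str], str],
    golden: GoldenSet,
    judge: Judge = token_overlap_judge,
    pass_threshold: float = 0.5,
) -> float:
    """Return the share of golden questions whose answer passes the judge."""
    passed = sum(
        1
        for question, expected in golden
        if judge(expected, answer_fn(question)) >= pass_threshold
    )
    return passed / len(golden)
```

Running `evaluate` on the pipeline before and after a chunking or retrieval change gives the two numbers to compare; keeping the golden set fixed is what makes the comparison meaningful.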
Monetization angle:
Direct path: offer internal documentation RAG setup as a $500–2,000 freelance service for small data teams. The eval harness is what separates you from every other "I can build RAG" person.
---
Idea 07: Debate My Decision
What it does:
You describe a decision you're weighing — a career move, a relationship choice, a business bet. A Proposer agent writes the strongest possible case FOR the decision. A Critic agent writes the strongest possible case AGAINST it. A Mediator agent reads both and identifies where the Proposer was wishful-thinking and where the Critic was unfairly catastrophizing. Then a final Synthesis agent produces a "decision brief" with: the 3 strongest real arguments on each side, the key unknowns that would change the analysis, and a recommended decision process (not a decision — it won't decide for you). Built on LangGraph with a cyclic debate loop that runs 2 rounds before synthesizing.
Tech Stack:
1-Week Scope:
Demo Angle:
Record yourself using it live on a real decision (can be opaque — "should I take Project X"). Show the full multi-agent debate transcript. The tweet: "I built a multi-agent system that argues both sides of every decision I can't make alone." That headline alone has viral potential.
Why it teaches an agentic pattern:
Multi-agent debate / reflection is the Reflexion + adversarial critic pattern. You learn how agents can improve output quality through structured disagreement, how to manage cyclic state in LangGraph, and how to design prompt personas that produce genuinely different perspectives rather than polite variations.
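The cyclic debate loop reduces to a small state machine. This sketch uses plain callables in place of LangGraph nodes and LLM calls; the `Agent` signature and the transcript layout are assumptions for illustration only.

```python
from typing import Callable

Agent = Callable[[dict], str]


def run_debate(
    decision: str,
    proposer: Agent,
    critic: Agent,
    mediator: Agent,
    synthesizer: Agent,
    rounds: int = 2,
) -> dict:
    """Run proposer → critic → mediator for `rounds` cycles, then synthesize."""
    state: dict = {"decision": decision, "transcript": []}
    for round_no in range(1, rounds + 1):
        for role, agent in (
            ("proposer", proposer),
            ("critic", critic),
            ("mediator", mediator),
        ):
            # Each agent sees the full transcript so far, including earlier rounds.
            text = agent(state)
            state["transcript"].append({"round": round_no, "role": role, "text": text})
    state["brief"] = synthesizer(state)
    return state
```

Because round 2 agents read the mediator's round 1 notes from shared state, the second pass can respond to specific criticisms instead of restating the first pass.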
Monetization angle:
Genuinely strong. People already pay human coaches $50–200/hr for decision coaching. A $15/month SaaS with decision history has a real audience among high-agency professionals.
---
Idea 08: Voice-First TITAN Pocket
What it does:
A standalone voice interface for a subset of TITAN capabilities — specifically: asking what you need to focus on today, logging a quick thought or memory, and asking "what did I decide about X?" The pipeline is: local wake word detection (openWakeWord) → Whisper transcription → Claude Haiku with injected TITAN memory context → gpt-4o-mini-tts streaming audio response → plays back through speakers. The entire round trip from "hey TITAN" to spoken response targets under 2 seconds. Runs on a Raspberry Pi 4 or just a laptop as a persistent background process.
Tech Stack:
- gpt-4o-transcribe or local whisper-base for under 1s latency
- gpt-4o-mini-tts for TTS
1-Week Scope:
Demo Angle:
Video: walking around the apartment, say "hey TITAN, what did I decide about the data pipeline refactor?" → 1.8 seconds → spoken answer referencing an actual TITAN memory note. This is the most visceral demo in the list. Post on X — voice + AI + memory = high engagement.
Why it teaches an agentic pattern:
Voice agents teach streaming architecture (chunk-by-chunk audio generation), low-latency pipeline design, and multimodal orchestration (audio in → text → audio out). The wake word integration teaches always-on ambient agent patterns that are increasingly relevant as AI shifts to background presence.
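One piece of the streaming design that can be sketched offline is sentence chunking: forwarding each complete sentence to TTS as soon as it arrives, instead of waiting for the whole LLM response. The splitter below is a simple regex-based assumption (it will mis-split abbreviations such as "e.g. "), and `sentences_from_stream` is a hypothetical helper name.

```python
import re
from typing import Iterable, Iterator

# Split after sentence-ending punctuation that is followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as the token stream finishes them."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    # Flush whatever remains when the stream ends.
    if buffer.strip():
        yield buffer.strip()
```

With this in place, time to first audio depends on the length of the first sentence rather than the full answer, which is what makes a sub-2-second target plausible.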
Monetization angle:
Build it as a TITAN feature first. Later: package the wake-word-to-voice-agent skeleton as a $49 open-source starter kit on Gumroad.
---
Idea 09: Smart Standup Bot
What it does:
An n8n workflow wires into your GitHub (or GitLab) via webhooks. Every night it pulls your commits, PR comments, and issue updates from the last 24 hours, passes them through a LlamaIndex summarizer that chunks and vectorizes the diffs, and asks Claude to generate a crisp standup update in your voice: what you shipped, what's blocked, what's next. At 9 AM it posts the standup to Slack (or Discord) automatically. A separate n8n trigger lets teammates ask "what did Harnoor work on this week?" via Slack slash command, which hits the same vector index and returns a cited summary.
Tech Stack:
1-Week Scope:
Demo Angle:
Screenshot of a week of auto-generated standups in Slack. Compare to your actual standup style — show the match. LinkedIn post: "I haven't written a standup in 5 days." Strong engagement from engineering audiences.
Why it teaches an agentic pattern:
This is the event-driven n8n trigger → RAG summarization → scheduled output pattern — the most practical n8n + AI integration for working engineers. It also teaches webhook-driven agent activation vs. user-initiated activation.
Monetization angle:
Sell as a $15/month "smart standup" Slack app to small dev teams. The market is everyone who resents writing standups.
---
Idea 10: Semantic Inbox Zero
What it does:
An n8n workflow polls your Gmail inbox every 30 minutes. Each new email is embedded and stored in Qdrant. A priority-classification agent assigns every email to one of four categories (urgent-personal, urgent-work, newsletters, or noise) using a ReAct loop that can call a "search past emails" tool to check whether a sender has been important before. A daily digest n8n workflow fires at 7 AM and runs RAG over today's emails to produce a "here's what matters today and why" briefing, ordered by importance, with a one-sentence summary of each key email. A second pass generates suggested draft replies for the top 3. You review and send manually; the agent never sends.
Tech Stack:
- text-embedding-3-small
1-Week Scope:
Demo Angle:
Screenshot of the 7 AM digest for a real-looking inbox with 40 emails reduced to 5 paragraphs + 3 draft replies. Tweet: "My inbox became an agent's problem." The draft reply feature is the money shot.
Why it teaches an agentic pattern:
This is a production-grade combination of continuous event ingestion, semantic memory (Qdrant as persistent email knowledge base), and ReAct tool use. The "search past emails" tool is a genuine long-term memory implementation — the hardest part of any personal AI assistant.
Monetization angle:
Email AI is an extremely crowded market, but personal self-hosted versions with real RAG history have a devoted niche. The open-source repo alone is likely to attract GitHub stars.
---
Idea 11: RAG Paper Buddy
What it does:
You describe your research interests (e.g., "agentic RAG, multi-agent debate, LLM evals, vector database scaling"). An n8n workflow hits the ArXiv API every morning, pulls papers matching those interests from the last 48 hours, embeds the abstracts and introductions into Qdrant, and runs a semantic dedup to filter papers you've already seen. A LlamaIndex query engine lets you ask cross-paper questions: "which of this week's papers address the latency problem in RAG?" A daily email digest ranks the top 5 papers with a 3-sentence plain-English summary of each. The system accumulates a rolling 60-day paper index, so you can ask "what have people been doing about retrieval quality this month?"
Tech Stack:
- text-embedding-3-small
1-Week Scope:
Demo Angle:
Show the daily email for a real day in your interests. Then screen-record asking the cross-paper question "what are papers this week saying about RAG latency?" The accumulating research memory angle is distinctive.
Why it teaches an agentic pattern:
This is scheduled ingestion → persistent vector memory → query-over-time, which is the research-assistant pattern underlying systems like Elicit and Consensus. It teaches time-scoped RAG (filtering by ingestion date) and semantic dedup — both production-critical skills.
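Both techniques named above fit in a few lines once the vectors exist. This sketch uses in-memory lists instead of Qdrant, and the paper shape (`vector`, `ingested`), the 0.95 similarity threshold, and the function names are assumptions chosen for illustration.

```python
import math
from datetime import date, timedelta


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_dedup(
    new_papers: list[dict],
    seen_vectors: list[list[float]],
    threshold: float = 0.95,
) -> list[dict]:
    """Keep papers that are not near-duplicates of seen or already kept ones."""
    kept: list[dict] = []
    for paper in new_papers:
        pool = seen_vectors + [p["vector"] for p in kept]
        if all(cosine(paper["vector"], vec) < threshold for vec in pool):
            kept.append(paper)
    return kept


def in_window(papers: list[dict], today: date, days: int) -> list[dict]:
    """Time-scoped filter: keep papers ingested within the last `days` days."""
    cutoff = today - timedelta(days=days)
    return [p for p in papers if p["ingested"] >= cutoff]
```

In a vector database the time filter would normally be a payload filter applied alongside the similarity search rather than a post-filter in Python; the logic is the same.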
Monetization angle:
$12/month "AI research digest" newsletter tool. Target PhD students and AI practitioners who can't keep up with ArXiv volume. Low customer acquisition cost via Reddit + Twitter.
---
Idea 12: Cold Outreach Personalizer
What it does:
You paste in a LinkedIn URL or company name. An agent pipeline scrapes their public LinkedIn (via RapidAPI LinkedIn scraper), fetches their last 3 blog posts via LlamaIndex WebPageReader, and pulls their last 5 tweets if available. All sources go into an ephemeral Chroma index. A "personalization agent" reads the index and extracts 3–5 specific, non-obvious facts about the person (a recent project, a stated belief, an accomplishment they seem proud of). A "message composer" agent writes a cold DM or email that references those facts authentically — not "I saw you work at X" but "I noticed in your post about Y you mentioned Z — that matched something I've been thinking about." An n8n workflow automates the pipeline: paste URL into a webhook form → draft appears in Notion 90 seconds later.
Tech Stack:
1-Week Scope:
Demo Angle:
Show 3 side-by-side examples: generic outreach vs. agent-generated personalized outreach. The quality gap is the tweet. "I stopped writing cold messages by hand." Strong engagement in sales/BD/recruiting circles.
Why it teaches an agentic pattern:
This is multi-source RAG with ephemeral context (index lives only for that run), sequential agent chaining (extract facts → compose message), and webhook-triggered n8n orchestration. It also introduces the constraint problem: making an LLM produce genuinely personalized output without sounding sycophantic is harder than it looks and requires careful prompt engineering.
Monetization angle:
Clearest monetization in the list. Cold outreach tools charge $50–200/month (Apollo, Lemlist, etc.). A personalized-by-RAG differentiator with a privacy-first self-hosted story has real positioning.
---
| # | Idea | Primary Agentic Pattern |
|---|------|------------------------|
| 01 | Second Brain Search | RAG + Rerank (retrieve broadly, rank precisely) |
| 02 | Date Planner | Tool Use / ReAct loop |
| 03 | Nightly Reflection | n8n event trigger → LLM → persistent write |
| 04 | Job Intel Agent | Planner + Parallel Executor (LangGraph fan-out) |
| 05 | Social Radar MCP | MCP server authorship / tool exposure |
| 06 | Data Pipeline Copilot | RAG + LLM-as-Judge Evals |
| 07 | Debate My Decision | Adversarial multi-agent reflection / Reflexion |
| 08 | Voice-First TITAN Pocket | Streaming voice agent / ambient AI |
| 09 | Smart Standup Bot | Webhook-triggered RAG summarization |
| 10 | Semantic Inbox Zero | Continuous ingestion + long-term semantic memory |
| 11 | RAG Paper Buddy | Scheduled ingestion + time-scoped persistent RAG |
| 12 | Cold Outreach Personalizer | Multi-source ephemeral RAG + sequential agent chaining |
---
1. Idea 03 (Nightly Reflection) — lowest build cost, highest daily personal value, most relatable tweet
2. Idea 01 (Second Brain Search) — directly feeds TITAN, teaches the most durable RAG skill
3. Idea 07 (Debate My Decision) — strongest viral demo potential, directly applicable to the high-stakes decisions you're navigating right now
---
Subject: [SCOUT] Personal AI project ideas — RAG + n8n + agentic (1-week each)
Body:
Harnoor,
SCOUT completed a full research pass on current agentic AI patterns, n8n automation capabilities, vector DB options, and MCP server tooling. Below is the briefing.
12 project ideas landed, each scoped to one week of evenings, each teaching a different agentic pattern. Full memo saved at F:/TITAN/plans/advisors/PERSONAL-AI-IDEAS-RAG-N8N-2026-04-26.md.
The 5 things worth knowing:
1. Every idea maps to a distinct architectural pattern (reranking, tool-use, n8n triggers, planner+executor, MCP server, evals, adversarial reflection, voice streaming, webhook RAG, persistent semantic memory, time-scoped ingestion, ephemeral multi-source RAG). No overlap. Each one teaches something different.
2. Top 3 to start: Nightly Reflection Loop (n8n + shadow work, lowest friction to ship), Second Brain Search (Obsidian RAG with reranking, feeds TITAN directly), and Debate My Decision (multi-agent adversarial loop, highest X engagement potential).
3. MCP server idea (05 — Social Radar) is the most forward-looking. Building your own MCP server and publishing it to the MCP directory is the highest-signal portfolio move right now given how fast Claude + Cursor integrations are growing.
4. Eval harness (idea 06) is the one that makes you look like a senior engineer rather than a RAG hobbyist. Hiring managers notice when you treat retrieval as an engineering discipline with measurable benchmarks.
5. Three clear monetization angles: Second Brain Search (~$10/month Obsidian SaaS), Debate My Decision (~$15/month decision coach), Cold Outreach Personalizer (~$50/month B2B tool).
Full detail — stack, scope, demo angle, why it matters — is in the memo.
— SCOUT
---
Sources consulted: