Work POC Ideas — Kafka + Iceberg + Lakehouse Stack

Prepared by SCOUT | 2026-04-26 | ~2-week scopes each

---

TL;DR (5 bullets)

1. Stream-to-Iceberg in a day — the Flink Dynamic Iceberg Sink (GA in Iceberg 1.11 / Flink 1.20) handles schema evolution automatically; a 2-week POC on your most volatile Kafka topic produces a concrete latency and freshness benchmark.

2. Compaction is your fastest demo win — CDC workloads generate delete files exponentially; showing a 5–10x query speedup after scheduled bin-pack compaction is a slide that writes itself.

3. WAP (Write-Audit-Publish) is the "safe deploys for data" story — Iceberg branches let you gate data quality like a PR gate; this maps directly to compliance asks and resonates with infra engineers.

4. DuckDB + Iceberg as a local dev loop — engineers querying production Iceberg tables on a laptop in <30 s replaces a class of ad-hoc Spark jobs; the demo is visceral and the win is instant.

5. dbt + Iceberg incremental models cut Spark costs — replacing full-scan batch jobs with merge incremental models is measurable in compute hours and shows up on a cost dashboard within the 2-week window.

---

POC 1 — "Zero-Restart Schema Evolution Pipeline"

Problem Statement

Production Kafka topics change schemas constantly — new fields, renamed columns, type widening — and today every schema change requires a job restart, a schema migration meeting, and at least one incident. The Flink Dynamic Iceberg Sink (shipped in November 2025 on Apache Flink, promoted to @PublicEvolving in Iceberg 1.11) can consume Avro/JSON events from a Confluent Schema Registry and automatically evolve the target Iceberg table schema in flight, with zero downtime. Your team likely has at least one "fragile topic" that blocks onboarding new fields. This POC makes that topic self-healing.

Tech Stack

Apache Kafka (any 3.x broker) + Confluent Schema Registry 7.x
Apache Flink 1.20 + flink-iceberg-runtime-1.20 (Iceberg 1.7+)
Iceberg 1.7 or 1.8 (REST catalog — Polaris, Lakekeeper, or Nessie)
Target storage: S3-compatible (MinIO locally, or your cloud bucket)

2-Week Scope

Week 1: Stand up a local Flink cluster (Docker Compose), connect it to your team's dev Kafka cluster, and write a first-version Flink job using IcebergDynamicSink targeting a REST catalog. Confirm basic ingestion works and check Iceberg metadata after first commit.

Week 2: Deliberately evolve the Avro schema (add a nullable field, widen an int to long), observe the sink handle it without restart. Measure commit latency, file count per checkpoint interval, and metadata overhead vs. a static sink. Document the before/after story.

What does NOT fit: multi-tenant fan-out (more than 2–3 tables), production traffic — dev topic only.

Demo Angle

Live standup demo: two terminal panes side by side. Left: a Python producer publishing events with schema v1, then v2, then v3 over 90 seconds. Right: a Trino query showing the Iceberg table schema update in real time. No restarts. The room goes quiet.

Stretch Goals

Add a dead-letter queue branch (Iceberg branch dlq) for records that fail schema validation
Benchmark against a static Kafka Connect Iceberg sink for commit latency comparison
Write a blog post for the Apache Flink community site

Why It Teaches You Something

Schema evolution is the #1 operational pain in streaming pipelines. Demonstrating you can solve it at the sink layer — not by freezing schemas in place — signals senior data engineer judgment. This also hands-on covers Flink SinkV2 internals, which is increasingly tested in staff-level DE interviews.

---

POC 2 — "CDC Compaction Watchdog"

Problem Statement

CDC pipelines (Postgres, MySQL, or Oracle to Kafka via Debezium) produce equality-delete files at a rate of roughly 1 per update event. At 1,000 updates/second that is 86 million delete files per day. Without compaction, a query that takes 2 seconds on day 1 takes 45 seconds by week 3. Your team either has this problem now or will have it within a quarter of scaling CDC. This POC instruments the problem, demonstrates the degradation, and then automates the fix: scheduled bin-pack compaction with Spark's RewriteDataFiles action, measuring query latency before and after on a real table.

Tech Stack

Debezium 2.x → Kafka 3.x → Kafka Connect Iceberg Sink Connector (or Flink CDC 3.4)
Apache Iceberg 1.7 REST catalog
Apache Spark 3.5 for RewriteDataFiles compaction action
Trino 475+ for query benchmarking
(Optional) Apache Amoro / Project Nessie for automated compaction scheduling

2-Week Scope

Week 1: Set up a Postgres → Debezium → Kafka → Iceberg CDC pipeline on a dev table. Deliberately run 24–48 hours of CDC traffic without compaction. Export Trino query latency metrics and Iceberg metadata (manifest file count, delete file count) at regular intervals to show the degradation curve.

Week 2: Implement scheduled compaction (Spark job, cron, or Amoro catalog-level). Run the same query benchmarks post-compaction. Chart the before/after: manifest count, delete file count, and p50/p95 query latency. Add expire_snapshots to keep metadata lean.

What does NOT fit: automated adaptive compaction (Amoro ML-based scheduling) — that is a month-long project.

Demo Angle

A single slide with two charts: (1) query latency over time without compaction (slopes upward), and (2) the same metric after compaction fires (drops back to baseline). Title it "We fixed the ticking clock." Optional: show the Iceberg metadata diff before/after in Trino's $manifests and $files system tables.

Stretch Goals

Add a Grafana dashboard tracking delete-file accumulation rate as a SLO metric
Contribute a Helm chart or Airflow DAG for the compaction job to your team's DE toolkit
Benchmark copy-on-write vs. merge-on-read write modes for your specific CDC workload

Why It Teaches You Something

Iceberg table maintenance (compaction, snapshot expiry, orphan file cleanup) is an operational gap at most companies. Owning the "compaction story" makes you the person who keeps the lakehouse healthy. This skill is directly transferable to any team running Iceberg at scale.

---

POC 3 — "Write-Audit-Publish: Git-Like Data Quality Gates"

Problem Statement

When a pipeline pushes bad data to a production table, rollback is painful, downstream dashboards are poisoned, and the RCA takes days. Iceberg's branching model (introduced in Iceberg 1.2, matured in 1.7) enables a Write-Audit-Publish (WAP) workflow: new data lands on an audit branch, dbt data quality tests (or Great Expectations assertions) run against it, and only after a clean test run does the branch fast-forward to main. This is the data equivalent of a PR gate, and it can be demonstrated end-to-end in two weeks using tooling your team already runs.

Tech Stack

Apache Iceberg 1.7 (branch/tag API)
dbt-spark or dbt-trino adapter (latest stable, April 2026)
Great Expectations 1.x (optional, for richer assertions)
Apache Spark 3.5 or Trino 475+ for branch operations
Airflow 2.9+ (or GitHub Actions) to orchestrate the WAP loop

2-Week Scope

Week 1: Pick one existing dbt model that touches a high-stakes table. Modify the write path to target wap-branch instead of main. Add 3–5 dbt tests (not-null, accepted-values, row-count delta check). Wire a simple orchestrator that runs tests after each load and only calls fast_forward('main', 'wap-branch') on a green run.

Week 2: Deliberately inject bad data (null primary keys, out-of-range values). Show the branch catches it before main is affected. Query main during the bad run — it returns the last clean snapshot. Document the rollback story: time-to-recovery drops from "hours of debugging" to "branch rollback in seconds."

What does NOT fit: multi-branch fan-out per environment, row-level lineage — keep scope to one table and one branch.

Demo Angle

Two Trino queries running in parallel during the demo: one reads main (always clean), one reads wap-branch (shows the bad batch). After the orchestrator rejects the bad batch, the branch resets. Show the Iceberg snapshot log — the failed run leaves zero trace on main.

Stretch Goals

Extend to a "staging → audit → main → archive" multi-branch pipeline
Add row-count anomaly detection (compare to trailing 7-day average) as an automated gate
Write an ADR (Architecture Decision Record) for adopting WAP as a team standard

Why It Teaches You Something

WAP is mentioned in virtually every modern data quality talk (Data + AI Summit, Current, dbt Coalesce). Knowing the implementation details — Iceberg branch API, fast_forward, snapshot isolation — puts you well ahead of candidates who only know the concept from a slide deck.

---

POC 4 — "DuckDB Local Lakehouse: Query Production Iceberg in 30 Seconds"

Problem Statement

Your team's analysts and engineers currently run ad-hoc Spark jobs or stand up a Trino cluster to query production Iceberg tables — a process that takes 5–15 minutes just to get a cluster warm. DuckDB 1.4 (released late 2025) added native Iceberg write support and can read Iceberg REST catalogs directly from a laptop, running queries 10–100x faster than Spark on datasets under ~500 GB. The POC shows engineers how to swap their entire "spin up a Spark shell" workflow for a DuckDB Python script that returns results in under 30 seconds, directly from your production catalog.

Tech Stack

DuckDB 1.4+ with iceberg extension
Apache Iceberg REST catalog (Polaris / Nessie / your existing catalog)
Polars 1.x (as a post-processing and visualization layer)
Python 3.12 (Jupyter or plain script)
(Read-only connection to production — no writes in the POC)

2-Week Scope

Week 1: Install DuckDB locally, configure the iceberg extension to point at your catalog's REST endpoint. Write 10 representative ad-hoc queries that engineers currently run on Spark. Benchmark wall-clock time: Spark (cluster cold) vs DuckDB (local). Capture both p50 and worst-case times.

Week 2: Add Polars as a downstream layer for anything that needs Python-native transformation (joins against local CSVs, output to Excel for stakeholders). Write a one-page "DuckDB Quick-Start" cheatsheet. Present the benchmark table at sharedown.

What does NOT fit: writing back to production Iceberg (read-only POC), datasets >500 GB (Spark's turf), cluster-scale aggregations.

Demo Angle

Open a Jupyter notebook live. Run the same query against the Spark cluster (start the clock while it warms), then run it in DuckDB. DuckDB finishes before Spark's cluster is even ready. The benchmark table shows 10 queries: Spark average 4.5 min, DuckDB average 22 s. That slide goes into every "why we're modernizing" deck for the next year.

Stretch Goals

Package as a pip install titan-lakehouse-client internal library with catalog config baked in
Add pg_duckdb to let engineers query Iceberg via standard Postgres clients
Benchmark DuckDB vs Trino for the same query set (different sweet spots, interesting findings)

Why It Teaches You Something

DuckDB is rapidly displacing Spark for interactive workloads. Understanding where each tool fits — DuckDB for interactive/exploratory, Spark for cluster-scale ETL, Flink for streaming — is exactly the mental model senior engineers are expected to have in 2026 interviews.

---

POC 5 — "Iceberg Incremental dbt Models: Kill the Full-Scan Batch Job"

Problem Statement

Your team almost certainly has at least one dbt model that runs a full table scan every hour or every night, even though only 0.1% of the rows changed. On a large Iceberg table this is both slow and expensive. The dbt-spark and dbt-trino adapters support incremental_strategy='merge' for Iceberg tables using MERGE INTO, which only processes new or changed rows identified by a unique_key. This POC takes one high-cost full-scan model, converts it to an incremental Iceberg model, measures the compute reduction, and documents the pattern for the team.

Tech Stack

dbt-spark 1.8+ or dbt-trino 1.8+ with Iceberg adapter
Apache Iceberg 1.7 (MERGE INTO support)
Apache Spark 3.5 or Trino 475+
(Optional) Elementary or re_data for dbt model observability

2-Week Scope

Week 1: Identify the top 3 most expensive full-scan dbt models (by Spark job duration or Trino query cost). Pick one to convert. Write the incremental version with unique_key, incremental_strategy='merge', and on_schema_change='append_new_columns'. Run both versions on the same dataset and capture runtime, bytes scanned, and cluster-hours consumed.

Week 2: Validate data correctness (compare row counts, checksums). Handle the edge case of late-arriving data (use a lookback window: where updated_at >= current_timestamp - interval '2' hour). Document the pattern and present a cost projection for converting the other high-cost models.

What does NOT fit: converting all models at once, type-2 SCD logic (more complex merge semantics required).

Demo Angle

Side-by-side runtime comparison: full-scan model = 38 minutes, 2.1 TB scanned. Incremental model = 4 minutes, 180 GB scanned. If your team tracks cloud compute costs, show the dollar figure. The cost delta is the demo.

Stretch Goals

Add Elementary dbt package to track model run times over time automatically
Convert the top 5 expensive models and project annual compute savings
Open-source the incremental strategy pattern as a dbt package macro

Why It Teaches You Something

Cost-aware data engineering is a first-class interview topic in 2026. Being able to say "I reduced our Spark compute cost by 8x by migrating full-scan models to Iceberg incremental merge" with a concrete number is worth three rounds of LeetCode prep.

---

POC 6 — "Kafka Offset-to-Snapshot Lineage Dashboard"

Problem Statement

When a Flink or Spark Structured Streaming job reads from Kafka and writes to Iceberg, the link between a Kafka offset and an Iceberg snapshot is invisible. If a consumer reports missing data, you cannot answer: "which Iceberg snapshot corresponds to Kafka offset X on partition Y?" Iceberg 1.7's snapshot properties allow storing arbitrary metadata per commit, including the Kafka offset ranges consumed. This POC instruments your Flink-to-Iceberg pipeline to write per-snapshot Kafka offset metadata, then builds a simple Trino query or Grafana panel that shows the offset-to-snapshot map in real time.

Tech Stack

Apache Flink 1.20 + Iceberg 1.7 (custom SnapshotSummary properties)
Trino 475+ (query $snapshots system table)
Grafana 11.x + Trino data source plugin (or a simple Streamlit dashboard)
Kafka 3.x (consumer group offset API)

2-Week Scope

Week 1: Extend your existing Flink-to-Iceberg job (or build a toy pipeline) to write Kafka topic, partition, min_offset, and max_offset as snapshot summary properties on every Iceberg commit. Query the $snapshots table in Trino to confirm properties are landing correctly.

Week 2: Build a Grafana panel (or Streamlit script) that joins Kafka consumer group lag (from Kafka JMX/REST) with Iceberg snapshot timestamps to answer: "as of 14:32 UTC, we had processed offsets 0–41,823 on partition 0." Add a simple alert threshold for lag > N minutes.

What does NOT fit: full data lineage (column-level) — that is a month-long OpenLineage project.

Demo Angle

Show the audit query: "find all Iceberg snapshots that consumed Kafka messages between 14:00 and 14:05 UTC." Return snapshot IDs, offsets, commit times, row counts. This answers a question that currently requires manual log-digging. The person on-call in the room will immediately want this.

Stretch Goals

Emit lineage events to OpenLineage/Marquez for a full data lineage graph
Integrate with PagerDuty: alert when lag exceeds SLA and snapshot timestamp diverges from wall clock
Publish as a Flink operator pattern on the Apache Flink blog

Why It Teaches You Something

Observability is the gap between "it works" and "it works at 3 AM." Building offset-to-snapshot lineage demonstrates that you think about operational reliability, not just correctness. It signals readiness for a staff or principal engineer role.

---

POC 7 — "Multi-Catalog Federation: Query Iceberg + Hive + Postgres in One Trino Query"

Problem Statement

Most data teams have data in at least three places: a legacy Hive metastore, a modern Iceberg REST catalog, and operational Postgres databases. Analysts currently write three separate queries and join in Python — which is slow, error-prone, and opaque to governance tools. Trino's multi-catalog connector model can join across all three in a single SQL query, with Iceberg partition pruning on the lakehouse side. This POC configures a Trino cluster (or the free Starburst Galaxy trial) to federate across your catalogs and demonstrates a cross-catalog join that replaces a multi-step pipeline.

Tech Stack

Trino 475+ (or Starburst Galaxy free tier)
Iceberg REST catalog (Nessie or Polaris) as iceberg catalog in Trino
Hive Metastore connector (existing tables)
PostgreSQL connector (operational DB)
(Optional) dbt-trino for modeling the federated result into a clean Iceberg table

2-Week Scope

Week 1: Stand up Trino locally (Docker) or use Starburst Galaxy. Configure 3 catalogs: iceberg, hive, postgres. Write a 3-way join query that pulls from all three. Measure query execution plan — verify Iceberg partition pruning is working (check $partitions in the explain plan).

Week 2: Wrap the federated query as a dbt-trino model that materializes the result as a new Iceberg table. Schedule it in Airflow. Benchmark: federated Trino query vs. the current "extract to S3, join in Spark" approach. Document the cases where federation wins and where it does not (network I/O to Postgres at scale).

What does NOT fit: replacing your Postgres OLTP with Trino writes, row-level security (multi-week project).

Demo Angle

A single SQL query in the Trino UI that joins iceberg.prod.orders, hive.legacy.customers, and postgres.ops.support_tickets and returns a unified view. No Python. No intermediate S3 dump. The stakeholders who maintain those three systems will immediately understand the value.

Stretch Goals

Add Iceberg column statistics (ANALYZE TABLE) to improve the join cardinality estimates
Explore Trino's MERGE into Iceberg from the federated result (upsert pattern)
Write a Starburst/Trino community blog post on the multi-catalog pattern

Why It Teaches You Something

Federation is an increasingly common architecture pattern as organizations move from monolithic warehouses to multi-system lakehouses. Knowing the practical limits (Postgres connector is not partition-aware; Hive connector cannot prune below partition level) makes your architecture recommendations credible.

---

POC 8 — "Hot/Cold Tiering: Automatic Iceberg Partition Migration to Cheaper Storage"

Problem Statement

Object storage is cheap but not free, and access patterns are highly skewed — last 30 days of data gets 95% of queries, everything older is cold. Right now your team either keeps everything in the same hot-tier bucket (expensive) or manually archives to Glacier (operationally painful, invisible to query engines). Iceberg's partition spec combined with a scheduled Spark job can automatically migrate partitions older than N days from S3 Standard to S3 Glacier Instant Retrieval (or MinIO tiered storage locally), while keeping the table queryable without any schema change. This POC demonstrates automated hot/cold tiering with measurable storage cost reduction.

Tech Stack

Apache Iceberg 1.7 (partition evolution, rewrite_position_delete_files)
Apache Spark 3.5 (RewriteDataFiles with target storage config)
AWS S3 Storage Lens or MinIO Console for cost metrics
Airflow 2.9+ for scheduling the tiering job
Trino 475+ to verify queries still work post-migration

2-Week Scope

Week 1: Instrument a test Iceberg table with day-level partitioning. Write a Spark job that identifies partitions older than 30 days and rewrites them with an S3 storage class tag (GLACIER_IR). Verify Trino can still query both tiers seamlessly (it can — the metadata layer is unchanged).

Week 2: Schedule the tiering job in Airflow with a configurable cold_after_days parameter. Measure storage cost before and after for the test table. Build a simple cost projection for the full dataset. Document the runbook.

What does NOT fit: automated query routing by tier (Trino does not natively reroute based on storage class), deep archive (restore latency makes it unsuitable for ad-hoc queries).

Demo Angle

Show an S3 console (or MinIO dashboard) with two storage classes side by side. Run a Trino query that spans both hot and cold partitions — same result, same latency for recent data. Show the projected monthly storage cost reduction (even 20–30% on a large table is a meaningful number).

Stretch Goals

Implement a "warm-on-access" pattern: detect queries hitting cold partitions and trigger a background restore
Add a Grafana cost dashboard that tracks storage tier distribution over time
Contribute the tiering Airflow DAG to your team's internal DE library

Why It Teaches You Something

FinOps for data platforms is a growing discipline. Being able to design and implement storage tiering — rather than just talking about it — is a differentiator, especially at companies with large historical datasets. This also deepens your understanding of Iceberg's physical layout vs. logical metadata.

---

POC 9 — "Real-Time Aggregate Materialized View with Flink + Iceberg + Trino"

Problem Statement

Business dashboards today either query a slow Iceberg table directly (stale by hours) or maintain a separate Redis/ClickHouse store for real-time metrics (duplicated schema, dual write bugs). Flink's streaming aggregation can maintain a pre-aggregated Iceberg table (e.g., hourly sales by region) that Trino queries in milliseconds, with freshness measured in seconds rather than hours. This POC builds a mini pipeline: Kafka events → Flink aggregation → Iceberg aggregate table → Trino query, and benchmarks freshness (time from event to queryable result) vs. the current batch approach.

Tech Stack

Kafka 3.x (event source)
Apache Flink 1.20 (windowed aggregation, tumbling or sliding windows)
Apache Iceberg 1.7 (REST catalog)
Trino 475+ (query layer)
Grafana 11.x (live dashboard, refreshes every 5 s)

2-Week Scope

Week 1: Choose a business metric your team cares about (order counts, event rates, error rates — anything with a natural aggregation). Write a Flink job with a 1-minute tumbling window that outputs aggregates to an Iceberg table. Confirm Trino can query the table and see updates within 2–3 minutes of event production.

Week 2: Tune the Flink checkpoint interval and Iceberg commit interval to reduce freshness to <60 seconds. Set up a Grafana dashboard reading from Trino. Document the latency profile: p50 and p95 time-to-queryable from Kafka produce timestamp. Compare to the current batch job's freshness.

What does NOT fit: sub-second freshness (use ClickHouse or Pinot for that), complex CEP patterns — keep to simple aggregations.

Demo Angle

A live Grafana dashboard that ticks every 60 seconds with real event counts, fed by Trino queries on an Iceberg table that Flink is writing to. Point at the Kafka producer and say "I just increased the event rate" — the dashboard updates within 90 seconds. The current batch job would take 2 hours to show the same change.

Stretch Goals

Compare Flink tumbling windows vs. session windows for the same metric
Add a late-data handling path (Kafka offset watermark + side output to a corrections Iceberg table)
Explore AWS Glue Iceberg materialized views (now GA) as a managed alternative

Why It Teaches You Something

Real-time aggregation on open table formats is the core use case that justifies the Kafka + Flink + Iceberg stack. Understanding the trade-offs (freshness vs. cost, tumbling vs. sliding windows, checkpoint interval vs. file count) is what separates an architect from an implementer.

---

POC 10 — "Kafka-Native Data Contracts: Schema Registry + Great Expectations + Iceberg Quality Tags"

Problem Statement

Your team probably has informal agreements about what each Kafka topic should contain, but no automated enforcement. When a producer team changes a field without notice, consumers break silently and data quality degrades invisibly. Data contracts — formalized schemas with machine-checked assertions — are the solution, but implementing them end-to-end (Confluent Schema Registry for Avro enforcement on the producer side, Great Expectations for semantic checks at the Iceberg sink, and Iceberg tags to mark "certified" snapshots) has not been done as a unified proof of concept at most companies. This POC builds that pipeline.

Tech Stack

Confluent Schema Registry 7.x (FULL_TRANSITIVE compatibility mode)
Apache Kafka 3.x + Flink 1.20 (ingestion)
Apache Iceberg 1.7 (snapshot tags: certified, quarantine)
Great Expectations 1.x (semantic validation suite)
dbt (schema tests as the last gate before tagging certified)

2-Week Scope

Week 1: Set up Schema Registry with FULL_TRANSITIVE compatibility on one Kafka topic. Write a Flink job that rejects events failing the registry check and routes them to a quarantine Iceberg branch. Confirm the happy-path producer and a bad producer both behave correctly.

Week 2: Add Great Expectations checks on the Iceberg table after each Flink commit (row count, null rates, value distributions). On pass: apply Iceberg tag certified_<timestamp>. On fail: snapshot stays untagged and an alert fires. Document the contract (schema + semantic assertions) as a YAML file checked into version control.

What does NOT fit: full Soda Core or Monte Carlo integration (multi-week), producer-side SDK (separate project).

Demo Angle

Deploy a "bad producer" that sends events with a null required field. Show the Schema Registry reject it at the wire level. Then deploy a "schema-valid but semantically wrong" producer (all prices = 0). Show Great Expectations catch it at the Iceberg layer and withhold the certified tag. Query main — only certified snapshots are visible. Data quality is enforced at three independent layers.

Stretch Goals

Publish the contract YAML schema as a team standard and submit to Confluent community extensions
Integrate with Slack: post a message when a snapshot fails certification
Explore Confluent's Tableflow as a managed alternative for the schema-enforcement layer

Why It Teaches You Something

Data contracts are the hottest topic in the data engineering space in 2025–2026 (Data + AI Summit, dbt Coalesce, Current conference all led with it). Implementing one end-to-end — not just talking about it — is a strong signal for staff and principal roles.

---

Recommended Starting Order

| Priority | POC | Rationale |

|----------|-----|-----------|

| 1 | POC 4 — DuckDB Local Lakehouse | Fastest to ship (1 day of setup), immediate team value, visceral demo |

| 2 | POC 2 — CDC Compaction Watchdog | Fixes a likely-existing problem, measurable win, operationally credible |

| 3 | POC 5 — Iceberg Incremental dbt | Direct cost impact, uses existing dbt models, low infrastructure risk |

| 4 | POC 3 — WAP Data Quality Gate | Architecture story, compliance angle, moderate complexity |

| 5 | POC 1 — Zero-Restart Schema Evolution | Highest technical depth, best resume signal, requires Flink familiarity |

---

Work POC Ideas — Kafka + Iceberg + Lakehouse Stack

TL;DR (5 bullets)

POC 1 — "Zero-Restart Schema Evolution Pipeline"

Problem Statement

Tech Stack

2-Week Scope

Demo Angle

Stretch Goals

Why It Teaches You Something

POC 2 — "CDC Compaction Watchdog"

Problem Statement

Tech Stack

2-Week Scope

Demo Angle

Stretch Goals

Why It Teaches You Something

POC 3 — "Write-Audit-Publish: Git-Like Data Quality Gates"

Problem Statement

Tech Stack

2-Week Scope

Demo Angle

Stretch Goals

Why It Teaches You Something

POC 4 — "DuckDB Local Lakehouse: Query Production Iceberg in 30 Seconds"

Problem Statement

Tech Stack

2-Week Scope

Demo Angle

Stretch Goals

Why It Teaches You Something

POC 5 — "Iceberg Incremental dbt Models: Kill the Full-Scan Batch Job"

Problem Statement

Tech Stack

2-Week Scope

Demo Angle

Stretch Goals

Why It Teaches You Something

POC 6 — "Kafka Offset-to-Snapshot Lineage Dashboard"

Problem Statement

Tech Stack

2-Week Scope

Demo Angle

Stretch Goals

Why It Teaches You Something

POC 7 — "Multi-Catalog Federation: Query Iceberg + Hive + Postgres in One Trino Query"

Problem Statement

Tech Stack

2-Week Scope

Demo Angle

Stretch Goals

Why It Teaches You Something

POC 8 — "Hot/Cold Tiering: Automatic Iceberg Partition Migration to Cheaper Storage"

Problem Statement

Tech Stack

2-Week Scope

Demo Angle

Stretch Goals

Why It Teaches You Something

POC 9 — "Real-Time Aggregate Materialized View with Flink + Iceberg + Trino"

Problem Statement

Tech Stack

2-Week Scope

Demo Angle

Stretch Goals

Why It Teaches You Something

POC 10 — "Kafka-Native Data Contracts: Schema Registry + Great Expectations + Iceberg Quality Tags"

Problem Statement

Tech Stack

2-Week Scope

Demo Angle

Stretch Goals

Why It Teaches You Something

Recommended Starting Order

Sources Consulted