Data Engineering Interview Prep — Senior & Staff
A 90-day in-depth study program, 30 minutes a day. Each session teaches a topic from first principles so you can reason through any problem in an interview rather than recall a memorized answer. The goal is genuine mastery.
Covers system design, data modeling, ETL architecture, streaming systems, and company-specific patterns.
Meta, Netflix, Google, OpenAI, Anthropic, Airbnb
System Design
Senior and Staff data engineering system design. Use the calendar on the left (60 days published) or open any day below.
- Day 1: The 5-step data engineering system design framework
- Day 2: Functional vs non-functional requirements
- Day 3: Back-of-envelope estimation for data systems
- Day 4: Dimensional modeling fundamentals
- Day 5: Slowly changing dimensions (SCD)
- Day 6: Normalization vs denormalization trade-offs
- Day 7: Data Vault 2.0 and advanced modeling
- Day 8: ETL vs ELT architecture and trade-offs
- Day 9: Batch processing architecture
- Day 10: Stream processing fundamentals
- Day 11: Lambda vs Kappa architecture
- Day 12: CAP theorem and consistency models
- Day 13: Partitioning and sharding strategies
- Day 14: Replication and fault tolerance
- Day 15: Storage deep dive: SQL vs NoSQL
- Day 16: Data warehouse architecture
- Day 17: Data lake and lakehouse architecture
- Day 18: File formats and compression
- Day 19: Data quality frameworks
- Day 20: Data lineage and cataloging
- Day 21: Schema evolution and data contracts
- Day 22: Workflow orchestration patterns
- Day 23: Change data capture (CDC)
- Day 24: Design: real-time analytics dashboard
- Day 25: Design: event logging and telemetry system
- Day 26: Design: data warehouse for e-commerce
- Day 27: API design for data systems
- Day 28: Caching strategies for data systems
- Day 29: Pipeline observability and monitoring
- Day 30: Phase 1 review and self-assessment
- Day 31: Meta Data Infrastructure & Interview Patterns
- Day 32: Design: Meta News Feed Data Pipeline
- Day 33: Netflix Data Infrastructure & Interview Patterns
- Day 34: Design: Netflix Streaming Recommendation Data Pipeline
- Day 35: Google Data Engineering & GCP Deep Dive
- Day 36: Design: Large-Scale Search Analytics Pipeline (Google Style)
- Day 37: OpenAI Data Engineering & AI-Native Pipelines
- Day 38: Design: LLM Training Data Pipeline
- Day 39: Anthropic Data Engineering & Safety-First Design
- Day 40: Design: Distributed Search System for Billion Documents (Anthropic Style)
- Day 41: Activity Schema & Event Modeling
- Day 42: Graph Data Modeling
- Day 43: Time-Series Data Modeling
- Day 44: Exactly-Once Semantics Deep Dive
- Day 45: Stream-Table Duality & Materialized Views
- Day 46: Real-Time Feature Engineering
- Day 47: ML Pipeline Architecture
- Day 48: Embedding & Vector Data Pipelines
- Day 49: A/B Testing Data Infrastructure
- Day 50: Self-Serve Data Platform Architecture
- Day 51: Data Mesh vs Data Fabric
- Day 52: Multi-Tenancy in Data Systems
- Day 53: Data Security & Access Control
- Day 54: PII Handling & Privacy Engineering
- Day 55: Cost Optimization for Data Platforms
- Day 56: Performance Tuning Data Pipelines
- Day 57: Design: Uber/Lyft Surge Pricing Data Pipeline
- Day 58: Design: CDN Analytics Pipeline
- Day 59: Design: Real-Time Fraud Detection System
- Day 60: Phase 2 Comprehensive Review & Self-Assessment