Day 12 — CAP theorem and consistency models

Phase 1: Foundations & Frameworks | Category: Distributed Systems

Why This Matters for Data Engineering Interviews

CAP theorem is one of the first things senior candidates are expected to reason about when defining non-functional requirements. But here’s the differentiator: junior candidates recite the definition; senior candidates apply it to real data pipeline decisions. As one LinkedIn post puts it: “System Design Red Flag: Stopping at the CAP Theorem. Interviewers know you memorized CAP. The pro move: mention the PACELC theorem.”

For data engineers specifically, consistency trade-offs show up in every pipeline design: Should your streaming pipeline prioritize freshness (availability) or correctness (consistency)? Can your dashboard tolerate stale reads? What happens when a Kafka partition leader fails mid-write?

CAP Theorem: The Foundation

Eric Brewer’s theorem states that a distributed system can guarantee at most two of three properties:

C — Consistency: Every read receives the most recent write (or an error). All nodes see the same data at the same time.

A — Availability: Every request to a non-failing node receives a response (not necessarily the latest data).

P — Partition Tolerance: The system continues to operate despite network partitions (messages lost or delayed between nodes).

The Key Simplification

In any real distributed system, network partitions will happen. You can’t opt out of P. So the actual choice is:

During a partition, do you prioritize Consistency or Availability?

That’s it. CAP reduces to a single binary decision under failure.

Network partition occurs between Node A and Node B:

CP choice: Node B stops serving reads/writes until partition heals
    → No stale data, but some requests fail
    → "I'd rather be correct than fast"

AP choice: Node B continues serving from its local state
    → All requests succeed, but may return stale data
    → "I'd rather be available than correct"

CP vs AP: Real Database Examples

Topic	Details
Postgre SQL	(single leader) CPFollowers can’t accept writes during partition from leader; reads may block OLTP, financial systems
Google Spanner	CPUses True Time + Paxos for global strong consistency; may increase latency during partition Global financial, inventory
MongoDB	(default) CPPrimary unavailable → writes fail until new primary elected Document store, transactional
HBase / Bigtable	CPRegion server failure → region unavailable until recovery Large-scale analytics
Cassandra	APContinues serving reads/writes from available replicas; repairs later High-write throughput, IoT, metrics
DynamoDB	AP (default) Eventually consistent reads by default; strong consistency optional User profiles, session stores
CockroachDB	CPRaft consensus; unavailable during partition for affected ranges Distributed SQL
Redis Cluster	APContinues serving from available shards; possible stale reads Caching, session state

Important nuance: These aren’t permanent labels. Many databases offer tunable consistency:

DynamoDB: eventual consistency by default, strong consistency option per-read
Cassandra: tunable via consistency level (ONE, QUORUM, ALL)
MongoDB: tunable write concern and read preference

What to say in interviews: “CAP isn’t a permanent classification — it’s a per-request trade-off. DynamoDB is AP by default but I can request strongly consistent reads. Cassandra with QUORUM reads and writes behaves like a CP system for those operations. The choice depends on the specific access pattern, not just the database.”

Beyond CAP: The PACELC Theorem

CAP only describes behavior during partitions. But partitions are rare. What about the 99.9% of the time when the network is healthy? This is where PACELC shines:

If Partition (P):      Choose Availability (A) or Consistency (C)  Else (E):      Choose Latency (L) or Consistency (C)

The “Else” branch is the critical addition: even without failures, enforcing strong consistency costs latency. Synchronous replication to all nodes before acknowledging a write is slow. Async replication is fast but eventually consistent.

Topic	Details
DynamoDB	PA (available) EL (low latency, eventual) PA/EL
Cassandra	PA (available) EL (low latency, tunable) PA/EL
Spanner	PC (consistent) EC (consistent, higher latency) PC/EC
Postgre SQL	(sync replicas) PC (consistent) EC (consistent, replication lag) PC/EC
MongoDB	PC (consistent) EC (consistent) PC/EC
Cosmos DB	Tunable Tunable (5 levels) Tunable

The senior insight: Most systems spend 99%+ of their time in the “Else” state. The latency-vs-consistency trade-off during normal operation matters far more day-to-day than the availability-vs-consistency trade-off during rare partitions. When an interviewer asks about your database choice, reasoning about EL/EC is more practical than PA/PC.

Consistency Models Spectrum

From strongest to weakest:

1. Strong (Linearizable) Consistency

Every read returns the most recent write, globally
As if there’s a single copy of the data
Cost: high latency (must coordinate across replicas before acknowledging)
Examples: Google Spanner, PostgreSQL (single primary), CockroachDB

When you need it: Financial transactions, inventory counts, seat reservations — anywhere a stale read causes real business damage.

2. Sequential Consistency

All operations appear in some total order, and each client’s operations appear in the order they were issued
Weaker than linearizable: the total order may not match real-time order
Rarely discussed explicitly in interviews, but important to understand the spectrum

3. Causal Consistency

Operations that are causally related are seen in the same order by all nodes
Concurrent (unrelated) operations may be seen in different orders
Example: User A posts a message, User B replies. Everyone sees the post before the reply. But two unrelated posts may appear in different orders on different nodes.
Cost: moderate — track causal dependencies, not full global order

When it’s useful: Social feeds, chat applications, collaborative editing — where cause-and-effect ordering matters but global ordering is overkill.

4. Eventual Consistency

If no new writes occur, all replicas will eventually converge to the same value
No guarantee about how long “eventually” takes (could be milliseconds, could be seconds)
Cost: low latency, high availability
Examples: Cassandra (default), DynamoDB (default), DNS, CDN caches

When it’s acceptable: User profile reads, recommendation feeds, like counts, analytics dashboards — where showing a slightly stale value for a few seconds doesn’t cause harm.

5. Strong Eventual Consistency (SEC)

Like eventual consistency, but guarantees that once all replicas receive the same set of updates, they converge to the same state regardless of the order updates were applied
Achieved via CRDTs (Conflict-Free Replicated Data Types) or operational transformation
Example: Collaborative editors (Google Docs), Riak’s CRDT support

How This Applies to Data Pipeline Design

This is where you differentiate from generic SWE candidates — apply CAP/PACELC to data engineering specifically:

Scenario 1: Streaming Pipeline Writes to Multiple Stores

Kafka → Flink → Write to both Cassandra (serving) AND S3/Iceberg (warehouse)

Question: What happens if the Cassandra write succeeds but the S3 write fails?

Strong consistency approach: Use Flink’s 2-phase commit. Don’t acknowledge the Kafka offset until both writes succeed. Cost: higher latency, pipeline stalls if either sink is slow.
Eventual consistency approach: Write to Cassandra immediately, buffer S3 writes, retry failures. Dashboard sees real-time data; warehouse catches up within minutes. Cost: temporary inconsistency between serving and warehouse.

What to say: “I’d choose eventual consistency here. The serving layer needs low-latency writes for real-time dashboards. The warehouse is for batch analytics where minutes of delay are fine. I’d use Flink checkpointing to guarantee at-least-once delivery to both sinks, with idempotent writes to handle retries.”

Scenario 2: Multi-Region Data Warehouse

US users write to US warehouse replica  EU users write to EU warehouse replica  Both replicas sync asynchronously

Question: A report run in the US shows different numbers than the same report in the EU.

CP approach: All writes go to a single primary (e.g., US). EU reads from a follower. Guarantees consistency but adds cross-Atlantic latency to EU writes.
AP approach: Both regions accept writes independently. Async sync. Reports may temporarily disagree. “Eventually consistent analytics” — acceptable if the dashboard refreshes every 5 minutes anyway.

Scenario 3: Feature Store for ML

Batch pipeline computes features daily → Feature Store  Streaming pipeline updates features in real-time → Feature Store  ML model reads features at inference time

Question: Can the model read a stale feature that hasn’t been updated by the streaming pipeline yet?

If the model uses a stale feature, it might make a slightly worse prediction. For recommendations, this is fine (AP/EL).
For fraud detection, a stale feature could mean missing a fraudulent transaction. Need strong consistency (CP/EC) — wait for the streaming pipeline to update before serving the feature.

The Per-Feature Consistency Pattern

Modern systems don’t make a single consistency choice for the whole system. They make different choices for different features:

FeatureConsistency ChoiceRationaleAccount balanceStrong (CP)Incorrect balance = real money lostLike count on a postEventual (AP)1,003 vs 1,005 likes — nobody noticesSeat reservationStrong (CP)Double-booking = two angry customersContent recommendationsEventual (AP)Slightly stale recs are fineAnalytics dashboardEventual (AP)“Data as of 5 minutes ago” is acceptableAd spend trackingStrong (CP)Inaccurate spend = billing disputesUser profile displayEventual (AP)Showing old profile photo for 5 seconds is fine

Interview phrasing: “I wouldn’t make a blanket consistency choice for the entire system. The payment pipeline needs strong consistency — a stale read on an account balance could cause an overdraft. But the engagement analytics dashboard can tolerate eventual consistency — showing metrics from 30 seconds ago is perfectly acceptable and lets me use a fast, highly available serving layer.”

Interview Questions

Q1: “You’re designing a global analytics platform for Netflix that serves dashboards to teams in the US, EU, and APAC. How do you think about consistency?”

Model Answer: “Analytics dashboards are a classic AP/EL use case. Teams can tolerate viewing data that’s minutes old — nobody makes a split-second business decision based on a dashboard that refreshes every 30 seconds. I’d deploy read replicas in each region for low-latency queries, with async replication from the primary warehouse in US-West. PACELC classification: PA/EL — during a partition, dashboards stay available with potentially stale data; during normal operation, I prioritize low query latency over perfect consistency. The replication lag would be seconds to a few minutes, which is well within the tolerance for analytics use cases. The one exception: if there’s a financial reporting dashboard used for revenue reconciliation, that needs strong consistency — I’d route those queries to the primary rather than a regional replica.”

Q2: “How does CAP theorem apply to a Kafka-based streaming pipeline?”

Model Answer: “Kafka itself is a CP system within a single cluster — it uses ISR (In-Sync Replicas) and requires acknowledgment from replicas before confirming a write with acks=all. During a broker partition, Kafka elects a new leader from the ISR, which may briefly make affected partitions unavailable (CP behavior). For the pipeline consuming from Kafka, the consistency choice applies to the sink: if I’m writing to Cassandra with consistency level ONE, that’s AP — fast writes, eventual consistency. If I’m writing to a transactional database with synchronous commits, that’s CP. The pipeline architecture lets me make different consistency choices at the sink level. For a fraud detection pipeline, I’d choose CP sinks with exactly-once semantics. For a clickstream analytics pipeline, AP sinks with at-least-once delivery and deduplication downstream. The key principle: match the consistency guarantee to the business cost of inconsistency.”

Think About This

You’re in a Google interview. The prompt: “Design the data infrastructure for Google Ads billing. Advertisers are charged based on clicks. The billing system must be accurate, but advertisers also want a real-time spend dashboard.”

Walk through:

What consistency model does the billing system need? (Strong — CP/EC. An incorrect charge is a billing dispute. Use Spanner or a strongly consistent database.)
What consistency model does the real-time spend dashboard need? (Eventual — AP/EL. Showing spend from 30 seconds ago is fine. Use a read replica or materialized view with async refresh.)
Can both be served from the same database? (Yes, but with different read paths. Billing reads from the primary with strong consistency. Dashboard reads from a read replica or cache with eventual consistency. Same data, different consistency per consumer.)
What’s the PACELC classification? (Billing: PC/EC — consistent during partitions and normal operation, accepts higher latency. Dashboard: PA/EL — available during partitions, low latency during normal operation.)

The insight: this is the “per-feature consistency” pattern in action. One system, two consumers, two different consistency requirements. The architecture must support both simultaneously.

Quick Reference

CAP in practice = CP or AP during partition. Network partitions are inevitable, so choose: correctness (CP) or availability (AP).
PACELC extends CAP: Even without partitions, you trade Latency vs Consistency. This is the trade-off that matters 99% of the time. Mention PACELC to stand out.
Consistency spectrum: Strong (linearizable) → Causal → Eventual → Strong Eventual (CRDTs). Choose the weakest model your use case can tolerate — stronger = more expensive.
Per-feature consistency: Don’t pick one model for the whole system. Billing = strong. Dashboards = eventual. Recommendations = eventual. Match consistency to the business cost of being wrong.
For data pipelines: Consistency choice shows up at the sink. Same Kafka topic can feed a CP sink (transactional DB for billing) and an AP sink (Cassandra for dashboards) simultaneously.
Tunable consistency: DynamoDB, Cassandra, Cosmos DB all let you choose per-request. “The database is AP by default but I can request strong consistency for critical reads.”

Tomorrow’s Preview

Day 13: Partitioning & Sharding Strategies — Hash partitioning vs range partitioning, consistent hashing, partition keys in Kafka, DynamoDB, and BigQuery. Handling hot partitions and data skew — one of the most practical distributed systems topics for data engineering interviews.

Day 12: CAP theorem and consistency models