Phase 2: Deep Dives | Category: System Design Practice

The Prompt

“Design the data infrastructure for a ride-sharing surge pricing system. The system must detect demand-supply imbalances at the sub-zone level, update surge multipliers within 30 seconds of imbalance onset, handle 5 million location updates per minute during peak hours, and serve pricing decisions with p99 < 200ms latency.”

This combines geospatial indexing, real-time aggregation, and low-latency serving.

Step 1: Requirements

Functional requirements

  • Ingest real-time location updates from drivers and riders
  • Divide city into zones; aggregate supply (drivers) + demand (requests) per zone
  • Compute demand/supply ratio per zone per time window
  • Trigger surge multipliers when ratio exceeds thresholds
  • Serve current surge multiplier for any location to the pricing API

Non-functional requirements

Scale:         5M updates/min ≈ 83K updates/sec
Latency:       Surge update: < 30s from imbalance onset
              Serving lookup: p99 < 200ms
Availability:  99.99% (pricing failure = no rides)
Consistency:   Eventual OK (1–2 min staleness tolerable)
Precision:     Zone granularity ~0.5–1 km²
Storage:       Current state in Redis; historical in lakehouse
Scope:         Real-time computation + serving; not the ML model itself

Step 2: Scale Estimation

Location updates:
  5M/min ≈ 83K/sec
  Payload ~100 bytes → ~8 MB/sec ingest

Zone cardinality:
  NYC ~800 km² / ~0.5 km² per zone → ~1,600 zones
  Top 10 cities → ~10,000 zones
  Zone state fits easily in memory

Pricing API:
  ~50K/sec peak lookups (illustrative)
  Each lookup: compute zone + Redis GET

Historical storage:
  83K/sec × 100B × 86,400 sec/day ≈ 700 GB/day raw
  Parquet+Zstd ≈ ~117 GB/day
  ~43 TB/year (order-of-magnitude)

Step 3: Geospatial Indexing (Core Choice)

Raw latitude/longitude is continuous and not aggregation-friendly. You need a spatial index.

Common choices:

IndexShapeBest forWeakness
H3HexagonsEven neighbor expansion, hierarchicalPentagon artifacts (rare)
GeohashRectanglesSimple KV stores, string prefixEdge distortion, adjacency artifacts
S2Quad-spherePolygon containmentAdjacency less uniform
QuadtreeRecursive squaresVariable densityHarder distributed sharding

Why H3 fits surge pricing:

H3 (hexagonal hierarchical index):
  - Built by Uber; battle-tested for city zoning + neighbor expansion
  - Convert lat/lng → cell id at a chosen resolution
  - Each hex has 6 equidistant neighbors → smooth adjacency operations

Example resolutions:
  Res 9: ~0.1 km² (street-block)
  Res 7: ~5.2 km² (neighborhood)  ← common for surge computation
  Res 5: ~252 km² (district)

H3 usage:

import h3

def location_to_zone(lat: float, lng: float, resolution: int = 7) -> str:
    return h3.latlng_to_cell(lat, lng, resolution)

zone = location_to_zone(37.7749, -122.4194)
neighbors = h3.grid_disk(zone, k=1)  # zone + 6 neighbors

Step 4: Full Architecture

┌─────────────────────────────────────────────────────────────────────┐
│ EVENT SOURCES                                                       │
│  Driver app: {driver_id, lat, lng, ts, status}                      │
│  Rider app:  {rider_id, pickup_lat, pickup_lng, ts}                 │
│  Ride events: completion → driver becomes available                  │
└──────────────────────────┬──────────────────────────────────────────┘
                           ↓ (Avro/Protobuf + Schema Registry)
┌─────────────────────────────────────────────────────────────────────┐
│ KAFKA (partition by h3_zone_id)                                     │
│  driver_locations (key=h3_zone_id)                                  │
│  rider_requests   (key=h3_zone_id)                                  │
│  ride_events      (key=h3_zone_id)                                  │
│                                                                     │
│ Key insight: partition by zone (not driver_id) so per-zone state     │
│ is local in the stream processor (minimize shuffle).                │
└──────────────────┬──────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────┐
│ FLINK STREAM PROCESSOR                                              │
│  1) H3 enrichment (lat/lng → h3_zone_id)                             │
│  2) Keyed state per zone with sliding windows                        │
│     - available_drivers (last 2 min)                                 │
│     - pending_requests (last 2 min)                                  │
│  3) Compute ratio and surge multiplier                               │
│  4) Emit on change or every 30s heartbeat                            │
└──────────────────┬───────────────────┬──────────────────────────────┘
                   ↓                   ↓
┌──────────────────────┐  ┌──────────────────────────────────────────┐
│ REDIS (Serving)       │  │ S3/Iceberg (Analytics)                   │
│ Key: surge:{zone}     │  │ Bronze: raw updates/events               │
│ Value: multiplier+meta│  │ Silver: zone aggregates                   │
│ TTL: 120s             │  │ Gold: surge patterns + KPIs              │
└───────────────┬───────┘  └──────────────────────────────────────────┘
                ↓ (p99 < 200ms end-to-end)
┌─────────────────────────────────────────────────────────────────────┐
│ PRICING API                                                         │
│  Input: pickup_lat/lng                                               │
│  1) lat/lng → zone id (<1ms CPU)                                     │
│  2) Redis GET multiplier (<5–10ms)                                   │
│  3) Cache miss → return 1.0 (no surge)                               │
└─────────────────────────────────────────────────────────────────────┘

Two sliding windows per zone:

  • Window length: 2 minutes (reduce noise)
  • Slide: 10 seconds (meet the 30-second SLA)

Conceptual (pseudo) Flink:

# Conceptual example (not production-ready code)
driver_stream \
    .map(lambda e: (h3.latlng_to_cell(e.lat, e.lng, 7), e)) \
    .keyBy(lambda x: x[0]) \
    .window(SlidingEventTimeWindows.of(Time.minutes(2), Time.seconds(10))) \
    .aggregate(DriverCountAggregator())

Why sliding (not tumbling):

  • Tumbling 2 minutes → updates only every 2 minutes (too slow)
  • Sliding 10 seconds → updates every 10 seconds (3× headroom)

Handling drivers == 0 and smoothing boundaries:

def compute_surge_multiplier(zone_id: str, drivers: int, requests: int):
    if drivers == 0:
        return {"zone_id": zone_id, "multiplier": MAX_SURGE, "driver_shortage": True}

    ratio = requests / drivers
    multiplier = surge_curve(ratio)  # step function or ML model

    # Smooth with neighbors to avoid sharp zone boundaries
    neighbor_zones = h3.grid_disk(zone_id, k=1)
    neighbor_multipliers = [redis.get(z) for z in neighbor_zones if redis.exists(z)]
    if neighbor_multipliers:
        smoothed = 0.7 * multiplier + 0.3 * mean(neighbor_multipliers)
        multiplier = round(smoothed * 2) / 2  # round to nearest 0.5

    return {"zone_id": zone_id, "multiplier": multiplier, "ratio": ratio}

Step 6: Redis Serving Design

Key schema: surge:{h3_zone_id}
Example:    surge:8728309d7ffffff

Value (JSON):
{
  "multiplier": 2.0,
  "ratio": 1.8,
  "available_drivers": 5,
  "pending_requests": 9,
  "updated_at": "2026-04-13T09:30:45Z",
  "expires_at": "2026-04-13T09:32:45Z"
}

TTL: 120 seconds
  - If Flink stops, keys expire and serving falls back to 1.0
  - Prevents stale surge from persisting indefinitely

Capacity is tiny; Redis clustering is primarily for HA.

Step 7: Analytics Pipeline

Bronze (S3): raw location updates + ride events (Parquet+Zstd; partition by date+city)
Silver (Iceberg): zone aggregates (one row/zone/5-min window)
Gold: surge heatmaps, acceptance rate, elasticity, ops dashboards

Dual write:
  Flink → Redis (serving) and → Iceberg/S3 (analytics)

Point-in-time correctness matters when joining features (demand/supply state) to outcomes (accept/decline).

Step 8: Failure Modes and Mitigations

FailureImpactMitigation
Flink crashUpdates stop; Redis expires → fallback 1.0Checkpoint + fast restart; TTL avoids indefinite stale values
Redis primary failureServing unavailable brieflyAuto-failover; safe fallback to 1.0 during failover
Kafka lagSupply/demand delayedLag monitoring; scale consumers; alert thresholds tied to SLA
Location spoofingFake supply suppresses surgeValidation (speed sanity), auth, anomaly detection
Demand spikeSurge slow if windows too coarse10s slide detects within ~10–20s; can trigger on threshold breach

Step 9: Trade-offs to Articulate

Trade-off 1: Zone granularity (H3 resolution 7 vs 9)

  • Res 7: statistically meaningful counts per zone for ratio
  • Res 9: too sparse (many zones with 0–1 drivers), noisy ratios
  • Use Res 7 for surge; Res 9 for ETA/matching

Trade-off 2: 2-minute vs 5-minute windows

  • Too short (e.g., 30s): noisy, false surges
  • Too long (5m): slow reaction
  • 2m is practical balance; 10s slide meets SLA

Trade-off 3: Redis TTL

  • TTL avoids stale surge persisting during pipeline failures
  • TTL=120s caps worst-case staleness before fallback

Interview Questions

Q1: New Year’s Eve spike (50K requests in 60s downtown). What happens?

Model answer:

  • Burst lands in Kafka by zone partitions.
  • Flink sliding windows reflect spike within 10–20 seconds.
  • Redis updates multipliers within 30 seconds of onset.
  • Neighbor smoothing avoids jarring boundaries and helps nearby zones transition.
  • Incentive loop: surge draws drivers; higher prices ration demand.

Q2: Drivers game surge by waiting outside zones then moving in. Detect?

Model answer:

  • Detect offline via analytics (daily batch):
    • boundary clustering behavior
    • repeated appearance right after surge onset
    • suspiciously stable GPS then sudden move into surge zone
  • Produce a driver-level “surge-chasing score” for Trust & Safety review.

Self-Assessment: 5 Questions

  • Why is H3 better than geohash for surge boundaries?
  • Why partition Kafka by h3_zone_id rather than driver_id?
  • How is the 30-second SLA achieved (2m window, 10s slide)?
  • What happens if Flink crashes for 3 minutes?
  • Why blend neighbor multipliers (k-ring smoothing)?

Quick Reference

  • Geospatial index: H3 res 7 for surge computation.
  • Stream: Kafka partition by zone → Flink keyed state without heavy shuffle.
  • Compute: 2-minute sliding window, 10-second slide; emit on change/heartbeat.
  • Serve: Redis key per zone with TTL; API does lat/lng→zone then GET.
  • Smooth: k-ring neighbor blending to avoid boundary artifacts.
  • Analytics: dual-write zone metrics to Iceberg/S3 for training + ops.

Tomorrow’s Preview

Day 58: Design: CDN Analytics Pipeline — edge logs at massive scale, deduplication, aggregation, and real-time dashboards.