Day 57 — Design: Uber/Lyft Surge Pricing Data Pipeline

Phase 2: Deep Dives | Category: System Design Practice

The Prompt

“Design the data infrastructure for a ride-sharing surge pricing system. The system must detect demand-supply imbalances at the sub-zone level, update surge multipliers within 30 seconds of imbalance onset, handle 5 million location updates per minute during peak hours, and serve pricing decisions with p99 < 200ms latency.”

This combines geospatial indexing, real-time aggregation, and low-latency serving.

Step 1: Requirements

Functional requirements

Ingest real-time location updates from drivers and riders
Divide city into zones; aggregate supply (drivers) + demand (requests) per zone
Compute demand/supply ratio per zone per time window
Trigger surge multipliers when ratio exceeds thresholds
Serve current surge multiplier for any location to the pricing API

Non-functional requirements

Scale:         5M updates/min ≈ 83K updates/sec
Latency:       Surge update: < 30s from imbalance onset
              Serving lookup: p99 < 200ms
Availability:  99.99% (pricing failure = no rides)
Consistency:   Eventual OK (1–2 min staleness tolerable)
Precision:     Zone granularity ~0.5–1 km²
Storage:       Current state in Redis; historical in lakehouse
Scope:         Real-time computation + serving; not the ML model itself

Step 2: Scale Estimation

Location updates:
  5M/min ≈ 83K/sec
  Payload ~100 bytes → ~8 MB/sec ingest

Zone cardinality:
  NYC ~800 km² / ~0.5 km² per zone → ~1,600 zones
  Top 10 cities → ~10,000 zones
  Zone state fits easily in memory

Pricing API:
  ~50K/sec peak lookups (illustrative)
  Each lookup: compute zone + Redis GET

Historical storage:
  83K/sec × 100B × 86,400 sec/day ≈ 700 GB/day raw
  Parquet+Zstd ≈ ~117 GB/day
  ~43 TB/year (order-of-magnitude)

Step 3: Geospatial Indexing (Core Choice)

Raw latitude/longitude is continuous and not aggregation-friendly. You need a spatial index.

Common choices:

Index	Shape	Best for	Weakness
H3	Hexagons	Even neighbor expansion, hierarchical	Pentagon artifacts (rare)
Geohash	Rectangles	Simple KV stores, string prefix	Edge distortion, adjacency artifacts
S2	Quad-sphere	Polygon containment	Adjacency less uniform
Quadtree	Recursive squares	Variable density	Harder distributed sharding

Why H3 fits surge pricing:

H3 (hexagonal hierarchical index):
  - Built by Uber; battle-tested for city zoning + neighbor expansion
  - Convert lat/lng → cell id at a chosen resolution
  - Each hex has 6 equidistant neighbors → smooth adjacency operations

Example resolutions:
  Res 9: ~0.1 km² (street-block)
  Res 7: ~5.2 km² (neighborhood)  ← common for surge computation
  Res 5: ~252 km² (district)

H3 usage:

import h3

def location_to_zone(lat: float, lng: float, resolution: int = 7) -> str:
    return h3.latlng_to_cell(lat, lng, resolution)

zone = location_to_zone(37.7749, -122.4194)
neighbors = h3.grid_disk(zone, k=1)  # zone + 6 neighbors

Step 4: Full Architecture

┌─────────────────────────────────────────────────────────────────────┐
│ EVENT SOURCES                                                       │
│  Driver app: {driver_id, lat, lng, ts, status}                      │
│  Rider app:  {rider_id, pickup_lat, pickup_lng, ts}                 │
│  Ride events: completion → driver becomes available                  │
└──────────────────────────┬──────────────────────────────────────────┘
                           ↓ (Avro/Protobuf + Schema Registry)
┌─────────────────────────────────────────────────────────────────────┐
│ KAFKA (partition by h3_zone_id)                                     │
│  driver_locations (key=h3_zone_id)                                  │
│  rider_requests   (key=h3_zone_id)                                  │
│  ride_events      (key=h3_zone_id)                                  │
│                                                                     │
│ Key insight: partition by zone (not driver_id) so per-zone state     │
│ is local in the stream processor (minimize shuffle).                │
└──────────────────┬──────────────────────────────────────────────────┘
                   ↓
┌─────────────────────────────────────────────────────────────────────┐
│ FLINK STREAM PROCESSOR                                              │
│  1) H3 enrichment (lat/lng → h3_zone_id)                             │
│  2) Keyed state per zone with sliding windows                        │
│     - available_drivers (last 2 min)                                 │
│     - pending_requests (last 2 min)                                  │
│  3) Compute ratio and surge multiplier                               │
│  4) Emit on change or every 30s heartbeat                            │
└──────────────────┬───────────────────┬──────────────────────────────┘
                   ↓                   ↓
┌──────────────────────┐  ┌──────────────────────────────────────────┐
│ REDIS (Serving)       │  │ S3/Iceberg (Analytics)                   │
│ Key: surge:{zone}     │  │ Bronze: raw updates/events               │
│ Value: multiplier+meta│  │ Silver: zone aggregates                   │
│ TTL: 120s             │  │ Gold: surge patterns + KPIs              │
└───────────────┬───────┘  └──────────────────────────────────────────┘
                ↓ (p99 < 200ms end-to-end)
┌─────────────────────────────────────────────────────────────────────┐
│ PRICING API                                                         │
│  Input: pickup_lat/lng                                               │
│  1) lat/lng → zone id (<1ms CPU)                                     │
│  2) Redis GET multiplier (<5–10ms)                                   │
│  3) Cache miss → return 1.0 (no surge)                               │
└─────────────────────────────────────────────────────────────────────┘

Step 5: Flink Stateful Aggregation (Deep Dive)

Two sliding windows per zone:

Window length: 2 minutes (reduce noise)
Slide: 10 seconds (meet the 30-second SLA)

Conceptual (pseudo) Flink:

# Conceptual example (not production-ready code)
driver_stream \
    .map(lambda e: (h3.latlng_to_cell(e.lat, e.lng, 7), e)) \
    .keyBy(lambda x: x[0]) \
    .window(SlidingEventTimeWindows.of(Time.minutes(2), Time.seconds(10))) \
    .aggregate(DriverCountAggregator())

Why sliding (not tumbling):

Tumbling 2 minutes → updates only every 2 minutes (too slow)
Sliding 10 seconds → updates every 10 seconds (3× headroom)

Handling drivers == 0 and smoothing boundaries:

def compute_surge_multiplier(zone_id: str, drivers: int, requests: int):
    if drivers == 0:
        return {"zone_id": zone_id, "multiplier": MAX_SURGE, "driver_shortage": True}

    ratio = requests / drivers
    multiplier = surge_curve(ratio)  # step function or ML model

    # Smooth with neighbors to avoid sharp zone boundaries
    neighbor_zones = h3.grid_disk(zone_id, k=1)
    neighbor_multipliers = [redis.get(z) for z in neighbor_zones if redis.exists(z)]
    if neighbor_multipliers:
        smoothed = 0.7 * multiplier + 0.3 * mean(neighbor_multipliers)
        multiplier = round(smoothed * 2) / 2  # round to nearest 0.5

    return {"zone_id": zone_id, "multiplier": multiplier, "ratio": ratio}

Step 6: Redis Serving Design

Key schema: surge:{h3_zone_id}
Example:    surge:8728309d7ffffff

Value (JSON):
{
  "multiplier": 2.0,
  "ratio": 1.8,
  "available_drivers": 5,
  "pending_requests": 9,
  "updated_at": "2026-04-13T09:30:45Z",
  "expires_at": "2026-04-13T09:32:45Z"
}

TTL: 120 seconds
  - If Flink stops, keys expire and serving falls back to 1.0
  - Prevents stale surge from persisting indefinitely

Capacity is tiny; Redis clustering is primarily for HA.

Step 7: Analytics Pipeline

Bronze (S3): raw location updates + ride events (Parquet+Zstd; partition by date+city)
Silver (Iceberg): zone aggregates (one row/zone/5-min window)
Gold: surge heatmaps, acceptance rate, elasticity, ops dashboards

Dual write:
  Flink → Redis (serving) and → Iceberg/S3 (analytics)

Point-in-time correctness matters when joining features (demand/supply state) to outcomes (accept/decline).

Step 8: Failure Modes and Mitigations

Failure	Impact	Mitigation
Flink crash	Updates stop; Redis expires → fallback 1.0	Checkpoint + fast restart; TTL avoids indefinite stale values
Redis primary failure	Serving unavailable briefly	Auto-failover; safe fallback to 1.0 during failover
Kafka lag	Supply/demand delayed	Lag monitoring; scale consumers; alert thresholds tied to SLA
Location spoofing	Fake supply suppresses surge	Validation (speed sanity), auth, anomaly detection
Demand spike	Surge slow if windows too coarse	10s slide detects within ~10–20s; can trigger on threshold breach

Step 9: Trade-offs to Articulate

Trade-off 1: Zone granularity (H3 resolution 7 vs 9)

Res 7: statistically meaningful counts per zone for ratio
Res 9: too sparse (many zones with 0–1 drivers), noisy ratios
Use Res 7 for surge; Res 9 for ETA/matching

Trade-off 2: 2-minute vs 5-minute windows

Too short (e.g., 30s): noisy, false surges
Too long (5m): slow reaction
2m is practical balance; 10s slide meets SLA

Trade-off 3: Redis TTL

TTL avoids stale surge persisting during pipeline failures
TTL=120s caps worst-case staleness before fallback

Interview Questions

Q1: New Year’s Eve spike (50K requests in 60s downtown). What happens?

Model answer:

Burst lands in Kafka by zone partitions.
Flink sliding windows reflect spike within 10–20 seconds.
Redis updates multipliers within 30 seconds of onset.
Neighbor smoothing avoids jarring boundaries and helps nearby zones transition.
Incentive loop: surge draws drivers; higher prices ration demand.

Q2: Drivers game surge by waiting outside zones then moving in. Detect?