Phase 2: Deep Dives | Category: System Design Practice
The Prompt
“Design the data infrastructure for a ride-sharing surge pricing system. The system must detect demand-supply imbalances at the sub-zone level, update surge multipliers within 30 seconds of imbalance onset, handle 5 million location updates per minute during peak hours, and serve pricing decisions with p99 < 200ms latency.”
This combines geospatial indexing, real-time aggregation, and low-latency serving.
Step 1: Requirements
Functional requirements
- Ingest real-time location updates from drivers and riders
- Divide city into zones; aggregate supply (drivers) + demand (requests) per zone
- Compute demand/supply ratio per zone per time window
- Trigger surge multipliers when ratio exceeds thresholds
- Serve current surge multiplier for any location to the pricing API
Non-functional requirements
Scale: 5M updates/min ≈ 83K updates/sec
Latency: Surge update: < 30s from imbalance onset
Serving lookup: p99 < 200ms
Availability: 99.99% (pricing failure = no rides)
Consistency: Eventual OK (1–2 min staleness tolerable)
Precision: Zone granularity ~0.5–1 km²
Storage: Current state in Redis; historical in lakehouse
Scope: Real-time computation + serving; not the ML model itself
Step 2: Scale Estimation
Location updates:
5M/min ≈ 83K/sec
Payload ~100 bytes → ~8 MB/sec ingest
Zone cardinality:
NYC ~800 km² / ~0.5 km² per zone → ~1,600 zones
Top 10 cities → ~10,000 zones
Zone state fits easily in memory
Pricing API:
~50K/sec peak lookups (illustrative)
Each lookup: compute zone + Redis GET
Historical storage:
83K/sec × 100B × 86,400 sec/day ≈ 700 GB/day raw
Parquet+Zstd ≈ ~117 GB/day
~43 TB/year (order-of-magnitude)
Step 3: Geospatial Indexing (Core Choice)
Raw latitude/longitude is continuous and not aggregation-friendly. You need a spatial index.
Common choices:
| Index | Shape | Best for | Weakness |
|---|---|---|---|
| H3 | Hexagons | Even neighbor expansion, hierarchical | Pentagon artifacts (rare) |
| Geohash | Rectangles | Simple KV stores, string prefix | Edge distortion, adjacency artifacts |
| S2 | Quad-sphere | Polygon containment | Adjacency less uniform |
| Quadtree | Recursive squares | Variable density | Harder distributed sharding |
Why H3 fits surge pricing:
H3 (hexagonal hierarchical index):
- Built by Uber; battle-tested for city zoning + neighbor expansion
- Convert lat/lng → cell id at a chosen resolution
- Each hex has 6 equidistant neighbors → smooth adjacency operations
Example resolutions:
Res 9: ~0.1 km² (street-block)
Res 7: ~5.2 km² (neighborhood) ← common for surge computation
Res 5: ~252 km² (district)
H3 usage:
import h3
def location_to_zone(lat: float, lng: float, resolution: int = 7) -> str:
return h3.latlng_to_cell(lat, lng, resolution)
zone = location_to_zone(37.7749, -122.4194)
neighbors = h3.grid_disk(zone, k=1) # zone + 6 neighbors
Step 4: Full Architecture
┌─────────────────────────────────────────────────────────────────────┐
│ EVENT SOURCES │
│ Driver app: {driver_id, lat, lng, ts, status} │
│ Rider app: {rider_id, pickup_lat, pickup_lng, ts} │
│ Ride events: completion → driver becomes available │
└──────────────────────────┬──────────────────────────────────────────┘
↓ (Avro/Protobuf + Schema Registry)
┌─────────────────────────────────────────────────────────────────────┐
│ KAFKA (partition by h3_zone_id) │
│ driver_locations (key=h3_zone_id) │
│ rider_requests (key=h3_zone_id) │
│ ride_events (key=h3_zone_id) │
│ │
│ Key insight: partition by zone (not driver_id) so per-zone state │
│ is local in the stream processor (minimize shuffle). │
└──────────────────┬──────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────────────┐
│ FLINK STREAM PROCESSOR │
│ 1) H3 enrichment (lat/lng → h3_zone_id) │
│ 2) Keyed state per zone with sliding windows │
│ - available_drivers (last 2 min) │
│ - pending_requests (last 2 min) │
│ 3) Compute ratio and surge multiplier │
│ 4) Emit on change or every 30s heartbeat │
└──────────────────┬───────────────────┬──────────────────────────────┘
↓ ↓
┌──────────────────────┐ ┌──────────────────────────────────────────┐
│ REDIS (Serving) │ │ S3/Iceberg (Analytics) │
│ Key: surge:{zone} │ │ Bronze: raw updates/events │
│ Value: multiplier+meta│ │ Silver: zone aggregates │
│ TTL: 120s │ │ Gold: surge patterns + KPIs │
└───────────────┬───────┘ └──────────────────────────────────────────┘
↓ (p99 < 200ms end-to-end)
┌─────────────────────────────────────────────────────────────────────┐
│ PRICING API │
│ Input: pickup_lat/lng │
│ 1) lat/lng → zone id (<1ms CPU) │
│ 2) Redis GET multiplier (<5–10ms) │
│ 3) Cache miss → return 1.0 (no surge) │
└─────────────────────────────────────────────────────────────────────┘
Step 5: Flink Stateful Aggregation (Deep Dive)
Two sliding windows per zone:
- Window length: 2 minutes (reduce noise)
- Slide: 10 seconds (meet the 30-second SLA)
Conceptual (pseudo) Flink:
# Conceptual example (not production-ready code)
driver_stream \
.map(lambda e: (h3.latlng_to_cell(e.lat, e.lng, 7), e)) \
.keyBy(lambda x: x[0]) \
.window(SlidingEventTimeWindows.of(Time.minutes(2), Time.seconds(10))) \
.aggregate(DriverCountAggregator())
Why sliding (not tumbling):
- Tumbling 2 minutes → updates only every 2 minutes (too slow)
- Sliding 10 seconds → updates every 10 seconds (3× headroom)
Handling drivers == 0 and smoothing boundaries:
def compute_surge_multiplier(zone_id: str, drivers: int, requests: int):
if drivers == 0:
return {"zone_id": zone_id, "multiplier": MAX_SURGE, "driver_shortage": True}
ratio = requests / drivers
multiplier = surge_curve(ratio) # step function or ML model
# Smooth with neighbors to avoid sharp zone boundaries
neighbor_zones = h3.grid_disk(zone_id, k=1)
neighbor_multipliers = [redis.get(z) for z in neighbor_zones if redis.exists(z)]
if neighbor_multipliers:
smoothed = 0.7 * multiplier + 0.3 * mean(neighbor_multipliers)
multiplier = round(smoothed * 2) / 2 # round to nearest 0.5
return {"zone_id": zone_id, "multiplier": multiplier, "ratio": ratio}
Step 6: Redis Serving Design
Key schema: surge:{h3_zone_id}
Example: surge:8728309d7ffffff
Value (JSON):
{
"multiplier": 2.0,
"ratio": 1.8,
"available_drivers": 5,
"pending_requests": 9,
"updated_at": "2026-04-13T09:30:45Z",
"expires_at": "2026-04-13T09:32:45Z"
}
TTL: 120 seconds
- If Flink stops, keys expire and serving falls back to 1.0
- Prevents stale surge from persisting indefinitely
Capacity is tiny; Redis clustering is primarily for HA.
Step 7: Analytics Pipeline
Bronze (S3): raw location updates + ride events (Parquet+Zstd; partition by date+city)
Silver (Iceberg): zone aggregates (one row/zone/5-min window)
Gold: surge heatmaps, acceptance rate, elasticity, ops dashboards
Dual write:
Flink → Redis (serving) and → Iceberg/S3 (analytics)
Point-in-time correctness matters when joining features (demand/supply state) to outcomes (accept/decline).
Step 8: Failure Modes and Mitigations
| Failure | Impact | Mitigation |
|---|---|---|
| Flink crash | Updates stop; Redis expires → fallback 1.0 | Checkpoint + fast restart; TTL avoids indefinite stale values |
| Redis primary failure | Serving unavailable briefly | Auto-failover; safe fallback to 1.0 during failover |
| Kafka lag | Supply/demand delayed | Lag monitoring; scale consumers; alert thresholds tied to SLA |
| Location spoofing | Fake supply suppresses surge | Validation (speed sanity), auth, anomaly detection |
| Demand spike | Surge slow if windows too coarse | 10s slide detects within ~10–20s; can trigger on threshold breach |
Step 9: Trade-offs to Articulate
Trade-off 1: Zone granularity (H3 resolution 7 vs 9)
- Res 7: statistically meaningful counts per zone for ratio
- Res 9: too sparse (many zones with 0–1 drivers), noisy ratios
- Use Res 7 for surge; Res 9 for ETA/matching
Trade-off 2: 2-minute vs 5-minute windows
- Too short (e.g., 30s): noisy, false surges
- Too long (5m): slow reaction
- 2m is practical balance; 10s slide meets SLA
Trade-off 3: Redis TTL
- TTL avoids stale surge persisting during pipeline failures
- TTL=120s caps worst-case staleness before fallback
Interview Questions
Q1: New Year’s Eve spike (50K requests in 60s downtown). What happens?
Model answer:
- Burst lands in Kafka by zone partitions.
- Flink sliding windows reflect spike within 10–20 seconds.
- Redis updates multipliers within 30 seconds of onset.
- Neighbor smoothing avoids jarring boundaries and helps nearby zones transition.
- Incentive loop: surge draws drivers; higher prices ration demand.
Q2: Drivers game surge by waiting outside zones then moving in. Detect?
Model answer:
- Detect offline via analytics (daily batch):
- boundary clustering behavior
- repeated appearance right after surge onset
- suspiciously stable GPS then sudden move into surge zone
- Produce a driver-level “surge-chasing score” for Trust & Safety review.
Self-Assessment: 5 Questions
- Why is H3 better than geohash for surge boundaries?
- Why partition Kafka by
h3_zone_idrather thandriver_id? - How is the 30-second SLA achieved (2m window, 10s slide)?
- What happens if Flink crashes for 3 minutes?
- Why blend neighbor multipliers (k-ring smoothing)?
Quick Reference
- Geospatial index: H3 res 7 for surge computation.
- Stream: Kafka partition by zone → Flink keyed state without heavy shuffle.
- Compute: 2-minute sliding window, 10-second slide; emit on change/heartbeat.
- Serve: Redis key per zone with TTL; API does lat/lng→zone then GET.
- Smooth: k-ring neighbor blending to avoid boundary artifacts.
- Analytics: dual-write zone metrics to Iceberg/S3 for training + ops.
Tomorrow’s Preview
Day 58: Design: CDN Analytics Pipeline — edge logs at massive scale, deduplication, aggregation, and real-time dashboards.