Day 87 — Debug & Optimize Caching System
Coding problem
| Problem | Debug & Optimize Caching System |
| LeetCode ID(s) | — |
| Difficulty | Medium |
| Pattern | Debug / Design |
| Company tags | OpenAI |
| Suggested time | 30m |
Solution outline (coding)
- Reproduce the bug with minimal input; add logging around cache hits/misses.
- Check TTL, stampede, eviction policy, and key collision handling.
- Optimize hot path after correctness — measure before micro-optimizing.
Time complexity: Varies — debugging session.
Space complexity: Varies.
Show Python solution
class ReviewDay:
"""Practice / review: Debug & Optimize Caching System."""
def practice_plan(self):
return [
"Pick 2–3 problems from this phase; re-solve timed without notes.",
"For each: pattern name, time/space complexity, one alternative approach.",
]
# Input: (your choice of problems from this week or phase)
# Output: a short list of gaps to drill before the next sessionSQL interview practice
1. Interview question
Companies / track: OpenAI
OpenAI / API-scale products: SQL often covers usage logs, experiments, and evaluation pipelines—clarity on keys and grain matters as much as syntax.
What you are asked to write (SQL prompt):
Frame this as metrics work for **OpenAI**-scale surfaces (ads, product, or engagement — as the tables suggest).
cache_events(key, ts, event_type, latency_ms). BigQuery: debug cache by computing hit/miss ratios, stale reads, and P99 latency before/after config change.
Tables implied by the prompt:
cache_events(key, ts, event_type, latency_ms)
Engine: BigQuery — use its date, array, and approximate functions as documented.
2. Solution outline
- Clarify out loud: result grain (one row per what?), join keys, time zone, and any
ORDER BY/LIMIT/ tie-breakers. - Map Debug / Design to SQL: say the relational equivalent (e.g. hash map →
GROUP BY+ key; two pointers → ordered window + filter). - Rates:
SAFE_DIVIDEorNULLIF; define numerator and denominator. - Percentiles: approx vs exact; say so if you use
APPROX_QUANTILES. - Structure: CTEs (
WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).
Show SQL solution (BigQuery)
Main query
SELECT event_type,
COUNTIF(event_type = 'hit') / COUNT(*) AS hit_ratio,
APPROX_QUANTILES(latency_ms, 100)[OFFSET(99)] AS p99
FROM cache_events
GROUP BY event_type;