Day 58 — Snapshot Array
Coding problem
| Problem | Snapshot Array |
| LeetCode ID(s) | LeetCode #1146 |
| Difficulty | Medium |
| Pattern | Bin. Search / Design |
| Company tags | OpenAI |
| Suggested time | 25m |
Solution outline (coding)
- Store an array of values per index; each
setappends(snap_id, value)or uses copy-on-write. snapincrements id;getbinary-searches the history at index for largest snap ≤ target.
Time complexity: O(log S) per get if S snaps per cell; set amortized depending on impl.
Space complexity: O(n · S) worst — history per cell.
Show Python solution
import bisect
from typing import List
class SnapshotArray:
def __init__(self, length: int):
self.arr = [[(0, 0)] for _ in range(length)]
self.snap_id = 0
def set(self, index: int, val: int) -> None:
if self.arr[index][-1][0] == self.snap_id:
self.arr[index][-1] = (self.snap_id, val)
else:
self.arr[index].append((self.snap_id, val))
def snap(self) -> int:
self.snap_id += 1
return self.snap_id - 1
def get(self, index: int, snap_id: int) -> int:
pairs = self.arr[index]
i = bisect.bisect_right(pairs, (snap_id, 10**9)) - 1
return pairs[i][1]
# Input: length 3, set/get/snap pattern
# Output: value at index for historical snapSQL interview practice
1. Interview question
Companies / track: OpenAI
OpenAI / API-scale products: SQL often covers usage logs, experiments, and evaluation pipelines—clarity on keys and grain matters as much as syntax.
What you are asked to write (SQL prompt):
Frame this as metrics work for **OpenAI**-scale surfaces (ads, product, or engagement — as the tables suggest).
snapshots(table_name, snapshot_ts, payload). BigQuery: model SnapshotArray semantics to reconstruct state at time T and compute diff between T1 and T2.
Tables implied by the prompt:
snapshots(table_name, snapshot_ts, payload)
Engine: BigQuery — use its date, array, and approximate functions as documented.
2. Solution outline
- Clarify out loud: result grain (one row per what?), join keys, time zone, and any
ORDER BY/LIMIT/ tie-breakers. - Map Bin. Search / Design to SQL: say the relational equivalent (e.g. hash map →
GROUP BY+ key; two pointers → ordered window + filter). - Arrays / UDFs:
UNNESTand offsets for index; say when logic belongs in SQL vs a UDF. - Structure: CTEs (
WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).
Show SQL solution (BigQuery)
Main query
WITH s1 AS (SELECT * FROM snapshots WHERE snapshot_ts = @t1),
s2 AS (SELECT * FROM snapshots WHERE snapshot_ts = @t2)
SELECT COALESCE(s1.table_name, s2.table_name) AS table_name,
TO_JSON_STRING(s1.payload) AS before_json,
TO_JSON_STRING(s2.payload) AS after_json
FROM s1
FULL OUTER JOIN s2 USING (table_name);