Snapshot Array · SecondChair.dev

Day 58 — Snapshot Array

Coding problem

Problem	Snapshot Array
LeetCode ID(s)	LeetCode #1146
Difficulty	Medium
Pattern	Bin. Search / Design
Company tags	OpenAI
Suggested time	25m

Solution outline (coding)

Store an array of values per index; each set appends (snap_id, value) or uses copy-on-write.
snap increments id; get binary-searches the history at index for largest snap ≤ target.

Time complexity: O(log S) per get if S snaps per cell; set amortized depending on impl.

Space complexity: O(n · S) worst — history per cell.

Show Python solution

import bisect
from typing import List


class SnapshotArray:
  def __init__(self, length: int):
    self.arr = [[(0, 0)] for _ in range(length)]
    self.snap_id = 0

  def set(self, index: int, val: int) -> None:
    if self.arr[index][-1][0] == self.snap_id:
      self.arr[index][-1] = (self.snap_id, val)
    else:
      self.arr[index].append((self.snap_id, val))

  def snap(self) -> int:
    self.snap_id += 1
    return self.snap_id - 1

  def get(self, index: int, snap_id: int) -> int:
    pairs = self.arr[index]
    i = bisect.bisect_right(pairs, (snap_id, 10**9)) - 1
    return pairs[i][1]


# Input:  length 3, set/get/snap pattern
# Output: value at index for historical snap

SQL interview practice

1. Interview question

Companies / track: OpenAI

OpenAI / API-scale products: SQL often covers usage logs, experiments, and evaluation pipelines—clarity on keys and grain matters as much as syntax.

What you are asked to write (SQL prompt):

Frame this as metrics work for **OpenAI**-scale surfaces (ads, product, or engagement — as the tables suggest).
snapshots(table_name, snapshot_ts, payload). BigQuery: model SnapshotArray semantics to reconstruct state at time T and compute diff between T1 and T2.

Tables implied by the prompt:

snapshots(table_name, snapshot_ts, payload)

Engine: BigQuery — use its date, array, and approximate functions as documented.

2. Solution outline

Clarify out loud: result grain (one row per what?), join keys, time zone, and any ORDER BY / LIMIT / tie-breakers.
Map Bin. Search / Design to SQL: say the relational equivalent (e.g. hash map → GROUP BY + key; two pointers → ordered window + filter).
Arrays / UDFs: UNNEST and offsets for index; say when logic belongs in SQL vs a UDF.
Structure: CTEs (WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).

Show SQL solution (BigQuery)

Main query

WITH s1 AS (SELECT * FROM snapshots WHERE snapshot_ts = @t1),
s2 AS (SELECT * FROM snapshots WHERE snapshot_ts = @t2)
SELECT COALESCE(s1.table_name, s2.table_name) AS table_name,
  TO_JSON_STRING(s1.payload) AS before_json,
  TO_JSON_STRING(s2.payload) AS after_json
FROM s1
FULL OUTER JOIN s2 USING (table_name);

Day 58: Snapshot Array