Day 12 — Longest Substring Without Repeating Characters

Coding problem

ProblemLongest Substring Without Repeating Characters
LeetCode ID(s)LeetCode #3
DifficultyMedium
PatternSliding Window
Company tagsMeta, Netflix
Suggested time20m

Solution outline (coding)

  • Use a sliding window with left L and right R; expand R while characters are unique.
  • When you see a repeat, advance L past the previous index of that character (hash map: char → last index).
  • Maximum window size over the scan is the answer.

Time complexity: O(n) — each index enters and leaves the window at most once.

Space complexity: O(min(n, |Σ|)) — charset size for the map.

Show Python solution
class Solution:
  def lengthOfLongestSubstring(self, s: str) -> int:
    last = {}
    best = 0
    j = 0
    for i, ch in enumerate(s):
      if ch in last and last[ch] >= j:
        j = last[ch] + 1
      last[ch] = i
      best = max(best, i - j + 1)
    return best


# Input:  s = "abcabcbb"
# Output: 3

SQL interview practice

1. Interview question

Companies / track: Meta, Netflix

Meta + Netflix: retention and engagement questions on top of noisy logs—cohorts, streaks, and user-level history.

What you are asked to write (SQL prompt):

Frame this as metrics work for **Meta, Netflix**-scale surfaces (ads, product, or engagement — as the tables suggest).
sessions(user_id, ts, page). BigQuery: define repeated page as same page visited twice within 10 minutes for a user; find longest interval by duration without repeated pages per user.

Tables implied by the prompt:

  • sessions(user_id, ts, page)

Engine: BigQuery — use its date, array, and approximate functions as documented.

2. Solution outline

  • Clarify out loud: result grain (one row per what?), join keys, time zone, and any ORDER BY / LIMIT / tie-breakers.
  • Map Sliding Window to SQL: say the relational equivalent (e.g. hash map → GROUP BY + key; two pointers → ordered window + filter).
  • Filter time first: predicate on DATE(ts) / partition column before heavy joins; state the window in plain English.
  • Sessions / streaks: often LAG/LEAD or gap flags, then aggregate; check boundary dates.
  • Arrays / UDFs: UNNEST and offsets for index; say when logic belongs in SQL vs a UDF.
  • Rates: SAFE_DIVIDE or NULLIF; define numerator and denominator.
  • Structure: CTEs (WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).
Show SQL solution (BigQuery)

Main query

WITH ordered AS (
  SELECT
    user_id,
    ts,
    page,
    LAG(page) OVER (PARTITION BY user_id ORDER BY ts) AS prev_page,
    LAG(ts) OVER (PARTITION BY user_id ORDER BY ts) AS prev_ts
  FROM sessions
),
flags AS (
  SELECT
    user_id,
    ts,
    CASE
      WHEN page = prev_page AND TIMESTAMP_DIFF(ts, prev_ts, MINUTE) <= 10 THEN 1
      ELSE 0
    END AS repeat_within_10m
  FROM ordered
),
grp AS (
  SELECT
    user_id,
    ts,
    SUM(CASE WHEN repeat_within_10m = 0 THEN 1 ELSE 0 END) OVER (PARTITION BY user_id ORDER BY ts) AS segment_id
  FROM flags
)
SELECT user_id, segment_id, TIMESTAMP_DIFF(MAX(ts), MIN(ts), SECOND) AS span_sec
FROM grp
GROUP BY user_id, segment_id;