Day 48 — Word Break

Coding problem

ProblemWord Break
LeetCode ID(s)LeetCode #139
DifficultyMedium
PatternDP + Trie / Hash
Company tagsGoogle, Meta
Suggested time25m

Solution outline (coding)

  • dp[i] = whether s[0:i] is breakable; try every word length ending at i.
  • Check substring against a word set (hash set or trie).
  • Early exit if dp[len(s)] is true.

Time complexity: O(n² · L) naive; better with trie/pruning.

Space complexity: O(n) — boolean/string DP array.

Show Python solution
from typing import List


class Solution:
  def wordBreak(self, s: str, wordDict: List[str]) -> bool:
    words = set(wordDict)
    n = len(s)
    dp = [False] * (n + 1)
    dp[0] = True
    for i in range(1, n + 1):
      for j in range(i):
        if dp[j] and s[j:i] in words:
          dp[i] = True
          break
    return dp[n]


# Input:  s = "leetcode", wordDict = ["leet","code"]
# Output: True

SQL interview practice

1. Interview question

Companies / track: Google, Meta

Meta and Google both lean on ads and product surfaces: event streams, CTR-style rates, windowed metrics, and careful handling of identity joins.

What you are asked to write (SQL prompt):

Frame this as metrics work for **Google, Meta**-scale surfaces (ads, product, or engagement — as the tables suggest).
dictionary words, text(doc_id, content). BigQuery: implement word-break-like segmentation via UDF and compute % of text that cleanly segments.

Tables implied by the prompt:

  • text(doc_id, content)

Engine: BigQuery — use its date, array, and approximate functions as documented.

2. Solution outline

  • Clarify out loud: result grain (one row per what?), join keys, time zone, and any ORDER BY / LIMIT / tie-breakers.
  • Map DP + Trie / Hash to SQL: say the relational equivalent (e.g. hash map → GROUP BY + key; two pointers → ordered window + filter).
  • Arrays / UDFs: UNNEST and offsets for index; say when logic belongs in SQL vs a UDF.
  • Structure: CTEs (WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).
Show SQL solution (BigQuery)

Main query

SELECT doc_id, content,
  -- Prefer JavaScript UDF or batch job for full word_break; SQL flags coverage ratio.
  (LENGTH(content) - LENGTH(REPLACE(LOWER(content), 'leet', ''))) / 4 AS rough_token_hits
FROM text;