Day 3 — Group Anagrams

Coding problem

ProblemGroup Anagrams
LeetCode ID(s)LeetCode #49
DifficultyMedium
PatternHash Map Grouping
Company tagsMeta, Google
Suggested time20m

Solution outline (coding)

  • For each word, build a canonical key: sorted characters, or a 26-length count signature.
  • Use that key to bucket words in a map: key -> list of words.
  • Return all groups with more than one word (or all values, depending on prompt).

Time complexity: O(n · k log k) if sorting each word of length k; O(n · k) with count vectors.

Space complexity: O(n · k) — storing all strings plus the map.

Show Python solution
from collections import defaultdict
from typing import List


class Solution:
  def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
    groups: dict[tuple[str, ...], List[str]] = defaultdict(list)

    for s in strs:
      key = tuple(sorted(s))
      groups[key].append(s)

    return list(groups.values())


# Input:  strs = ["eat","tea","tan","ate","nat","bat"]
# Output: [["bat"],["nat","tan"],["ate","eat","tea"]] (order of groups/within groups may vary)

SQL interview practice

1. Interview question

Companies / track: Meta, Google

Meta and Google both lean on ads and product surfaces: event streams, CTR-style rates, windowed metrics, and careful handling of identity joins.

What you are asked to write (SQL prompt):

Frame this as metrics work for **Meta, Google**-scale surfaces (ads, product, or engagement — as the tables suggest).
queries(id, text, created_at). BigQuery: derive an anagram_key (lowercase, remove spaces, sort chars) using SPLIT/ARRAY/STRING and group keys with count >= 5.

Tables implied by the prompt:

  • queries(id, text, created_at)
  • anagram_key(lowercase, remove spaces, sort chars)

Engine: BigQuery — use its date, array, and approximate functions as documented.

2. Solution outline

  • Clarify out loud: result grain (one row per what?), join keys, time zone, and any ORDER BY / LIMIT / tie-breakers.
  • Map Hash Map Grouping to SQL: say the relational equivalent (e.g. hash map → GROUP BY + key; two pointers → ordered window + filter).
  • Arrays / UDFs: UNNEST and offsets for index; say when logic belongs in SQL vs a UDF.
  • Structure: CTEs (WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).
Show SQL solution (BigQuery)
WITH normalized AS (
  SELECT
    id,
    text,
    LOWER(REGEXP_REPLACE(text, r'\s+', '')) AS text_norm
  FROM
    queries
),
with_chars AS (
  SELECT
    id,
    text,
    ARRAY(SELECT ch FROM UNNEST(SPLIT(text_norm, '')) AS ch ORDER BY ch) AS chars_sorted
  FROM
    normalized
),
anagram_keys AS (
  SELECT
    id,
    text,
    ARRAY_TO_STRING(chars_sorted, '') AS anagram_key
  FROM
    with_chars
),
counts AS (
  SELECT
    anagram_key,
    COUNT(*) AS group_count
  FROM
    anagram_keys
  GROUP BY
    anagram_key
  HAVING
    group_count >= 5
)
SELECT
  q.id,
  q.text,
  q.anagram_key,
  c.group_count
FROM
  anagram_keys AS q
JOIN
  counts AS c
USING (anagram_key)
ORDER BY
  anagram_key,
  q.id;