Design Twitter · SecondChair.dev

Day 62 — Design Twitter

Coding problem

Problem	Design Twitter
LeetCode ID(s)	LeetCode #355
Difficulty	Medium
Pattern	OOP + Heap
Company tags	Meta
Suggested time	25m

Solution outline (coding)

Store per user: list of (timestamp, tweetId); global follow map user → set of user ids.
getNewsFeed: merge k sorted lists with a heap (lazy), or precompute top 10.

Time complexity: O(10 log k) or similar for feed with k follows; post is O(1) append.

Space complexity: O(tweets stored).

Show Python solution

import heapq
import itertools
from typing import List
from collections import defaultdict, deque


class Twitter:
  def __init__(self):
    self.tweets = defaultdict(deque)
    self.follows = defaultdict(set)
    self.counter = itertools.count(step=-1)

  def postTweet(self, userId: int, tweetId: int) -> None:
    self.tweets[userId].appendleft((next(self.counter), tweetId))
    if len(self.tweets[userId]) > 10:
      self.tweets[userId].pop()

  def getNewsFeed(self, userId: int) -> List[int]:
    heap = []
    for uid in {userId, *self.follows[userId]}:
      for pair in list(self.tweets[uid])[:10]:
        heapq.heappush(heap, pair)
        if len(heap) > 10:
          heapq.heappop(heap)
    return [t for _, t in sorted(heap, reverse=True)]

  def follow(self, followerId: int, followeeId: int) -> None:
    if followerId != followeeId:
      self.follows[followerId].add(followeeId)

  def unfollow(self, followerId: int, followeeId: int) -> None:
    self.follows[followerId].discard(followeeId)


# Input:  postTweet, follow, getNewsFeed
# Output: recent tweet ids for user + followees

SQL interview practice

1. Interview question

Companies / track: Meta

Meta: expect event logging, ads / integrity, and social product analytics—deduping, sessionization, and window functions are frequent themes.

What you are asked to write (SQL prompt):

Frame this as metrics work for **Meta**-scale surfaces (ads, product, or engagement — as the tables suggest).
social_feed(user_id, tweet_id, ts). BigQuery: design a materialized view/summary table to support top-N personalized feed queries with reasonable latency and cost.

Tables implied by the prompt:

social_feed(user_id, tweet_id, ts)

Engine: BigQuery — use its date, array, and approximate functions as documented.

2. Solution outline

Clarify out loud: result grain (one row per what?), join keys, time zone, and any ORDER BY / LIMIT / tie-breakers.
Map OOP + Heap to SQL: say the relational equivalent (e.g. hash map → GROUP BY + key; two pointers → ordered window + filter).
Cost: selective columns, partition pruning, avoid SELECT * when tables are huge.
Structure: CTEs (WITH) — one step per CTE; validate on a tiny slice (counts, nulls, duplicates).

Show SQL solution (BigQuery)

Main query

SELECT user_id, ARRAY_AGG(STRUCT(tweet_id, ts) ORDER BY ts DESC LIMIT 20) AS recent_tweets
FROM social_feed
GROUP BY user_id;

Day 62: Design Twitter