
#211 - Design Add and Search Words Data Structure

December 3, 2025


Problem Description

Design a data structure that supports adding new words and finding if a string matches any previously added string.

Implement the WordDictionary class:

  • WordDictionary() Initializes the object.
  • void addWord(word) Adds word to the data structure; it can be matched later.
  • bool search(word) Returns true if any previously added string matches word, and false otherwise. word may contain dots '.', where each dot matches any single letter.

Example:


Input
["WordDictionary","addWord","addWord","addWord","search","search","search","search"]
[[],["bad"],["dad"],["mad"],["pad"],["bad"],[".ad"],["b.."]]
Output
[null,null,null,null,false,true,true,true]

Explanation
WordDictionary wordDictionary = new WordDictionary();
wordDictionary.addWord("bad");
wordDictionary.addWord("dad");
wordDictionary.addWord("mad");
wordDictionary.search("pad"); // return False
wordDictionary.search("bad"); // return True
wordDictionary.search(".ad"); // return True
wordDictionary.search("b.."); // return True

Constraints:

  • 1 <= word.length <= 25
  • word in addWord consists of lowercase English letters.
  • word in search consists of '.' or lowercase English letters.
  • There will be at most 2 dots in word for search queries.
  • At most 10^4 calls will be made to addWord and search.

Solution

Design Add and Search Words Data Structure - Comprehensive Solution

1. Problem Deconstruction

1.1 Rewrite the Problem

Technical Version:
We must implement a WordDictionary class maintaining a dynamic collection of strings S over the alphabet Σ = {a, b, …, z}. The class exposes two operations:

  • addWord(w): Insert string w into S, where |w| ≤ 25 and w ∈ Σ*.
  • search(p): Return true iff ∃s ∈ S such that |s| = |p| and ∀i ∈ [0, |p|−1], (p[i] = s[i]) ∨ (p[i] = '.'), where '.' denotes a wildcard matching any σ ∈ Σ.

Beginner Version:
Create a word bank that stores words. You can add new words, and later check if a word (or pattern) exists. The special twist: when searching, you can use dots (.) to mean “any letter”. For example, if “cat” is stored, searching for “.at” should succeed because the dot can match ‘c’.

Mathematical Formalization:
Let Σ = {a, …, z}, Σ′ = Σ ∪ {'.'}, and S ⊆ Σ*. Define:

  • addWord(w): S ← S ∪ {w}
  • search(p): return ∃s ∈ S such that ∀i ∈ [0, |p|−1], p[i] = '.' or p[i] = s[i]

Constraints imply:

  • |w|, |p| ∈ [1, 25]
  • p contains at most 2 dots for search queries
  • Total operations ≤ 10^4

1.2 Constraint Analysis

1. 1 <= word.length <= 25

  • Limits trie depth to 25, making each operation O(L) with L ≤ 25, effectively constant.
  • Maximum total characters stored: 10^4 × 25 = 2.5 × 10^5, fitting comfortably in memory (a few MB).
  • Edge Cases: Single-letter and max-length words must be handled efficiently; a pattern can only match words of exactly its own length, so length mismatches fail immediately.

2. word in addWord consists of lowercase English letters

  • Alphabet size fixed at 26, enabling array-based trie children of size 26.
  • No special characters to escape, simplifying parsing.
  • Edge Cases: None beyond standard lowercase validation.

3. word in search consist of '.' or lowercase English letters

  • Search patterns are drawn from Σ′, requiring wildcard handling.
  • An exact-match fast path is possible when no dots are present.
  • Edge Cases: An all-dot pattern such as ".." must match every stored word of that length.

4. There will be at most 2 dots in word for search queries

  • Critical optimization hint: worst-case branching factor 26² = 676, manageable.
  • Enables brute-force pattern generation as viable fallback.
  • Edge Cases: Two dots at distant positions may still require checking many combinations.

5. At most 10^4 calls will be made to addWord and search

  • Total operation count is modest, allowing O(M·L) per-search scans, where M is the number of words of a given length.
  • Worst case: all 10^4 operations could be searches with 2 dots → 10^4 × 676 × 25 ≈ 1.69 × 10^8 elementary steps, potentially acceptable.
  • Hidden Edge: Memory management becomes important with many words; trie nodes should be space-efficient.

2. Intuition Scaffolding

2.1 Three Analogies

1. Real-World Metaphor (Library Catalog)
Imagine a library where each book title is stored. addWord is like acquiring a new book and indexing it. search is like a patron asking "Do you have a book whose title matches '_a_'?" (underscores as wildcards). The librarian must check all titles of the correct length, seeing whether the letters match the known positions.

2. Gaming Analogy (Word Puzzle Power-Ups)
In games like Scrabble with blank tiles (wildcards), you build a dictionary of valid words. When you play “C_T”, the game checks if any stored word (CAT, COT, CUT, etc.) matches. The data structure must quickly answer these partial-match queries during timed turns.

3. Mathematical Analogy (Hamming Distance with Wildcards)
Each word is a vector in Z_26^L (one coordinate per position). A search pattern defines constraints: some coordinates fixed, others free. We ask: does the constrained subspace intersect our set S? With ≤2 free coordinates, the subspace contains ≤676 points, making exhaustive search feasible.

2.2 Common Pitfalls (5+ Examples)

  1. Global Min/Max Fallacy
    False Approach: Store only shortest/longest words to bound searches.
    Why it Fails: Matching depends on every character position, not just length; bounding lengths tells you nothing about whether any stored word's letters line up with the pattern. (Length mismatches do fail immediately, but that is a trivial filter, not a search strategy.)

  2. Hash Set with Full Pattern Generation
    False Approach: For each search with k dots, generate all 26^k concrete strings and check the hash set.
    Why it Fails: With k = 2 it is acceptable (676 combinations), but it generalizes poorly. Also, generating every combination on each search is wasteful when a trie can prune early.

  3. Trie with Greedy Dot Matching
    False Approach: When encountering ‘.’, pick any child arbitrarily to continue.
    Why it Fails: The correct match might require exploring multiple branches; greedy choice may miss valid words.

  4. Length-Grouped Linear Scan Without Indexing
    False Approach: Group words by length, then linearly check each against pattern.
    Why it Fails: With 10^4 words of the same length, each search does 10^4 × 25 = 2.5 × 10^5 character comparisons. With 10^4 searches → 2.5 × 10^9 operations, too slow.

  5. Precomputing All Dot Variations
    False Approach: For each added word, generate every pattern it matches (dots substituted in) and store them all in a hash map.
    Why it Fails: A word of length L matches ∑_{k=0}^{L} C(L,k) = 2^L distinct patterns (each position is either its own letter or a dot) → exponential blowup. The quick computation below makes this concrete.
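
Even capped at two dots per pattern the precomputation is heavy, and uncapped it explodes. A quick arithmetic check using only the standard library:

from math import comb

L = 25
print(sum(comb(L, k) for k in range(L + 1)))  # 33554432 == 2**25 patterns for one word
print(1 + L + comb(L, 2))                     # 326 patterns per word if capped at two dots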

3. Approach Encyclopedia

Approach 1: Brute Force Hash Set with Pattern Generation

Core Idea: Store words in hash set. For search without dots, direct lookup. For dots, generate all concrete strings matching pattern and test membership.

Pseudocode:

from itertools import product

class WordDictionary:
    def __init__(self):
        self.words_set = set()
    
    def addWord(self, word):
        self.words_set.add(word)
    
    def search(self, pattern):
        if '.' not in pattern:
            return pattern in self.words_set
        
        dot_positions = [i for i,ch in enumerate(pattern) if ch == '.']
        # Generate all letter combinations for dot positions
        for letters in product('abcdefghijklmnopqrstuvwxyz', repeat=len(dot_positions)):
            candidate = list(pattern)
            for pos, letter in zip(dot_positions, letters):
                candidate[pos] = letter
            if ''.join(candidate) in self.words_set:
                return True
        return False

Complexity Proof:

  • Time:
    • addWord: O(L) for hashing
    • search: Let k = dot count ≤ 2. Generating 26^k candidates, each of length L, gives O(26^k · L). With k ≤ 2 this is O(676L) = O(L), with a large constant factor.
  • Space: O(N·L) storing all N words.

Visualization:

Pattern: ".a."
Dot positions: [0,2]
Generate: aaa, aab, aac, ..., zaz (26^2 = 676 candidates)
Check each against hash set: O(1) per check

Approach 2: Length-Grouped Linear Scan

Core Idea: Group words by length in lists. Search only compares with words of matching length.

Pseudocode:

from collections import defaultdict

class WordDictionary:
    def __init__(self):
        self.words_by_len = defaultdict(list)
    
    def addWord(self, word):
        self.words_by_len[len(word)].append(word)
    
    def search(self, pattern):
        L = len(pattern)
        if L not in self.words_by_len:
            return False
        for word in self.words_by_len[L]:
            match = True
            for i in range(L):
                if pattern[i] != '.' and pattern[i] != word[i]:
                    match = False
                    break
            if match:
                return True
        return False

Complexity Proof:

  • Time:
    • addWord: O(L) to store the word
    • search: O(M·L) where M = number of words of length L. Worst case M = N = 10^4, so 10^4 × 25 = 2.5 × 10^5 character comparisons per search.
  • Space: O(N·L)

Visualization:

Words added: [bad, dad, mad] → Group: {3: [bad, dad, mad]}
Search ".ad":
  Compare with "bad": .vs b? ✓, a vs a ✓, d vs d ✓ → Match found

Approach 3: Trie with Recursive DFS (Optimal)

Core Idea: Trie stores prefixes efficiently. For dots, recursively explore all children at that level.

Pseudocode:

class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

class WordDictionary:
    def __init__(self):
        self.root = TrieNode()
    
    def addWord(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_end = True
    
    def search(self, pattern):
        return self._dfs(pattern, 0, self.root)
    
    def _dfs(self, pattern, idx, node):
        if idx == len(pattern):
            return node.is_end
        
        ch = pattern[idx]
        if ch != '.':
            if ch not in node.children:
                return False
            return self._dfs(pattern, idx+1, node.children[ch])
        else:
            for child in node.children.values():
                if self._dfs(pattern, idx+1, child):
                    return True
            return False

Complexity Proof:

  • Time:
    • addWord: O(L), traversing/creating L nodes
    • search: Worst case is an all-dot pattern, which explores every path of depth L; a full trie would make this O(26^L), catastrophic, but our trie is sparse. With ≤2 dots, the search branches at most 26² ways at the dot levels: O(26² · L) = O(676L).
  • Space: O(N·L) for trie nodes; each node stores up to 26 child references.

Visualization:

Trie:
      root
    /   |   \
   b    d    m
   a    a    a
   d    d    d

Search ".ad":
  At root: '.' → explore children b,d,m
    Child b: pattern "ad" → 'a' matches, go to 'a'
      At 'a': pattern "d" → 'd' matches, go to 'd'
        At 'd': pattern "" → is_end=True ✓

Approach 4: Trie with BFS Iterative Search

Core Idea: Use queue to avoid recursion depth limits (though depth ≤25 makes recursion safe).

Pseudocode:

from collections import deque

def search(self, pattern):
    queue = deque([(self.root, 0)])  # entries are (node, index into pattern)
    while queue:
        node, idx = queue.popleft()
        if idx == len(pattern):
            if node.is_end:
                return True
            continue
        ch = pattern[idx]
        if ch == '.':
            for child in node.children.values():
                queue.append((child, idx+1))
        else:
            if ch in node.children:
                queue.append((node.children[ch], idx+1))
    return False

Complexity: Same as recursive approach.

4. Code Deep Dive

Optimal Solution: Trie with Recursive Search

class TrieNode:
    def __init__(self):
        # Dictionary mapping character to child node
        # Alternative: array of size 26 for faster access but more memory
        self.children = {}
        # Boolean marking if a word ends at this node
        self.is_end = False

class WordDictionary:
    def __init__(self):
        # Root node of trie (empty string)
        self.root = TrieNode()

    def addWord(self, word: str) -> None:
        """Inserts word into trie"""
        node = self.root
        for char in word:
            # Create child node if path doesn't exist
            if char not in node.children:
                node.children[char] = TrieNode()
            # Move deeper into trie
            node = node.children[char]
        # Mark completion of word
        node.is_end = True

    def search(self, word: str) -> bool:
        """Searches for word or pattern with dots"""
        return self._search_helper(word, 0, self.root)
    
    def _search_helper(self, pattern: str, idx: int, node: TrieNode) -> bool:
        # Base case: processed all characters
        if idx == len(pattern):
            return node.is_end  # Must be exact word end
        
        char = pattern[idx]
        
        # Case 1: concrete letter
        if char != '.':
            if char not in node.children:
                return False  # No matching path
            # Recursively search deeper
            return self._search_helper(pattern, idx + 1, node.children[char])
        
        # Case 2: wildcard '.'
        # Explore all possible children
        for child_node in node.children.values():
            if self._search_helper(pattern, idx + 1, child_node):
                return True
        # No child produced a match
        return False

Code Walkthrough:

  1. TrieNode.__init__: Each node contains a dictionary (hash map) for children. Using a dict instead of a size-26 array saves space when the trie is sparse.
  2. WordDictionary.__init__: Initializes an empty trie whose root node represents the empty string.
  3. addWord:
    • node = self.root: start traversal at the root.
    • The loop over characters ensures the path exists, creating child nodes as needed.
    • node.is_end = True: marks the terminal node after the last character.
  4. search: Public method that initiates the recursive search at index 0 from the root node.
  5. _search_helper:
    • Base case: pattern fully consumed; returns true only if the current node marks a word ending.
    • Concrete character case: if no matching child exists, fail immediately (pruning).
    • Wildcard case: iterate through all children, recursing into each; returns early on the first success.

Edge Case Handling:

  • Empty pattern: Not allowed by constraints (length ≥1), but if called, returns root.is_end (false unless empty string added).
  • Pattern longer than any word: Recursion will eventually fail when no child exists for concrete character, or when dots have no children to explore.
  • Multiple dots: Handled by recursive exploration at each dot.
  • Duplicate words: addWord recreates existing path; is_end set to True (idempotent).
  • Very long words (25 chars): Recursion depth 25 is safe (Python recursion limit typically 1000).
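
A few quick assertions exercising these edge cases (a minimal sketch using the class above, not an exhaustive test suite):

wd = WordDictionary()
wd.addWord("a" * 25)             # max-length word: depth-25 recursion
assert wd.search("a" * 25)
assert wd.search("." * 25)       # all-dot probe (beyond the ≤2-dot constraint, but handled)
assert not wd.search("a" * 24)   # length mismatch fails

wd.addWord("cat")
wd.addWord("cat")                # duplicate insert is idempotent
assert wd.search("c.t")
assert not wd.search("c..t")     # pattern longer than any stored word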

Example Walkthrough (from problem statement):

addWord("bad"):
  root → 'b' → 'a' → 'd' (is_end=True)
addWord("dad"):
  root → 'd' → 'a' → 'd' (is_end=True)
addWord("mad"): similar
search("pad"):
  root.children has 'b','d','m' but not 'p' → immediate false
search("bad"):
  root→'b'→'a'→'d', is_end=True → true
search(".ad"):
  At root: char='.' → explore all children:
    child 'b': recurse with "ad" from node 'b':
      'a' matches child 'a', recurse with "d":
        'd' matches child 'd', end, is_end=True → return true
search("b.."):
  root→'b' (concrete), then pattern "..":
    At node 'b': char='.' → explore children ('a' only):
      child 'a': pattern ".":
        char='.' → explore children ('d'):
          child 'd': pattern "", is_end=True → true

5. Complexity War Room

Hardware-Aware Analysis

  • Memory: With 2.5 × 10^5 total characters, each trie node stores:
    • Python object overhead: ~56 bytes
    • Dictionary: ~72 bytes plus per-entry costs
    • Each entry: 8-byte key pointer + 8-byte value pointer
    • Rough estimate: ~150 bytes/node × 250k nodes ≈ 37.5 MB (acceptable)
  • Cache Performance: Trie nodes are allocated non-contiguously, causing pointer chasing. Depth ≤25 means ≤25 cache misses per traversal.
  • Operation Count: Worst case, 10^4 searches with 2 dots each:
    Each search explores ≤676 branches × 25 steps = 16,900 node visits.
    Total: ≈1.69 × 10^8 operations.
    CPython manages on the order of 10^7 such operations/second → ~16 seconds in the absolute worst case; in practice the trie is sparse and pruning keeps real workloads far below this.
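
To sanity-check that estimate, a micro-benchmark sketch (the workload shape here is an assumption on my part; with random words the trie is very sparse, so absolute numbers will vary by machine and should land well under the worst-case bound):

import random
import string
import time

random.seed(42)

wd = WordDictionary()
words = [''.join(random.choices(string.ascii_lowercase, k=25)) for _ in range(10_000)]
for w in words:
    wd.addWord(w)

patterns = []
for w in random.sample(words, 1_000):
    chars = list(w)
    i, j = random.sample(range(25), 2)   # two distinct wildcard positions
    chars[i] = chars[j] = '.'
    patterns.append(''.join(chars))

start = time.perf_counter()
hits = sum(wd.search(p) for p in patterns)
elapsed = time.perf_counter() - start
print(f"{hits}/1000 patterns matched in {elapsed:.2f}s")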

Industry Comparison Table

Approach                 Time (add)  Time (search)         Space    Readability  Interview Viability
Brute Force + Hash Set   O(L)        O(26^k · L), k ≤ 2    O(N·L)   9/10         ✅ Good for k ≤ 2
Length-Grouped Linear    O(L)        O(M·L)                O(N·L)   8/10         ❌ Fails for large M
Trie + Recursive DFS     O(L)        O(26^k · L), pruned   O(N·L)   7/10         ✅ Recommended
Trie + Iterative BFS     O(L)        O(26^k · L)           O(N·L)   6/10         ✅ Avoids recursion

Decision: Trie with recursive DFS offers the best trade-off: efficient pruning, a clean implementation, and optimal handling of the constraints.

6. Pro Mode Extras

Variants Section

Variant 1: Unlimited Dots
Remove the ≤2-dot constraint. The trie solution still works, but the worst case grows toward O(26^L) for an all-dot pattern against a dense trie. Mitigation: pre-group words by length and keep one trie per length, so each search only touches candidates of the right length; branching at a dot is also bounded by the children actually present, not by 26. A minimal sketch of the per-length wrapper follows.
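
A minimal sketch of that wrapper, reusing the WordDictionary class above (the name LengthGroupedDictionary is just illustrative):

class LengthGroupedDictionary:
    def __init__(self):
        self.tries = {}  # word length -> WordDictionary

    def addWord(self, word: str) -> None:
        self.tries.setdefault(len(word), WordDictionary()).addWord(word)

    def search(self, pattern: str) -> bool:
        trie = self.tries.get(len(pattern))      # only the matching-length trie is searched
        return trie is not None and trie.search(pattern)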

Variant 2: Support Delete Operation
Add delete(word) method. Implement with reference counting: each node tracks how many words use it. On delete, decrement counts and prune nodes with count 0.
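
A minimal sketch of reference-counted deletion, under two assumptions: words are not inserted twice, and delete is only called on words known to be present (a production version would verify both):

class CountedTrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False
        self.count = 0  # how many stored words pass through this node

class DeletableWordDictionary:
    def __init__(self):
        self.root = CountedTrieNode()

    def addWord(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, CountedTrieNode())
            node.count += 1
        node.is_end = True

    def delete(self, word: str) -> None:
        # Assumes word was added exactly once and not yet deleted.
        node = self.root
        for ch in word:
            child = node.children[ch]
            child.count -= 1
            if child.count == 0:        # no surviving word uses this subtree
                del node.children[ch]   # prune it and stop descending
                return
            node = child
        node.is_end = False             # word was a prefix of other stored words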

Variant 3: Prefix Search (Auto-complete)
Add startsWith(prefix) returning all words with given prefix. Traverse to prefix node, then DFS collect all words in subtree.
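
A minimal sketch of such a method on the trie above (starts_with is a hypothetical addition, not part of the LC 211 interface):

def starts_with(self, prefix: str) -> list:
    """Return every stored word beginning with prefix."""
    node = self.root
    for ch in prefix:                  # descend to the node for the prefix
        if ch not in node.children:
            return []
        node = node.children[ch]

    results = []
    def collect(n, suffix):            # DFS the subtree, accumulating suffixes
        if n.is_end:
            results.append(prefix + suffix)
        for ch, child in n.children.items():
            collect(child, suffix + ch)
    collect(node, "")
    return results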

Variant 4: Regular Expression Support
Allow patterns with * (zero or more), + (one or more). Requires converting to NFA/DFA or backtracking search with memoization.

Interview Cheat Sheet

Key Points to Mention:

  1. Always analyze time/space complexity first.
  2. Discuss trade-offs: hash set vs trie vs length-grouping.
  3. Highlight how the ≤2-dot constraint enables specific optimizations (e.g., brute-force pattern generation).
  4. Consider both recursive and iterative implementations.

Common Interview Questions:

  • “How would you handle case-insensitive words?”
  • “What if words could contain digits or hyphens?”
  • “How to make it thread-safe?”

Optimization Opportunities:

  1. Store words of the same length in separate tries to reduce branching.
  2. For ≤2 dots, generate all concrete patterns and check a hash set.
  3. Use a fixed array of 26 children per node if memory is not a concern (see the sketch below).
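
A minimal sketch of optimization 3, trading memory for index-based child access (ArrayTrieNode and add_word are illustrative names):

class ArrayTrieNode:
    __slots__ = ("children", "is_end")   # avoid per-instance dict overhead

    def __init__(self):
        self.children = [None] * 26      # index 0 = 'a', ..., 25 = 'z'
        self.is_end = False

def add_word(root: ArrayTrieNode, word: str) -> None:
    node = root
    for ch in word:
        i = ord(ch) - ord('a')           # constant-time slot lookup, no hashing
        if node.children[i] is None:
            node.children[i] = ArrayTrieNode()
        node = node.children[i]
    node.is_end = True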

Final Implementation Choice:
Trie with recursive DFS offers the best balance of simplicity and performance for the given constraints.