#208 - Implement Trie (Prefix Tree)
July 29, 2025
Implement Trie (Prefix Tree)
- Difficulty: Medium
- Topics: Hash Table, String, Design, Trie
- Link: https://leetcode.com/problems/implement-trie-prefix-tree/
Problem Description
A trie (pronounced as “try”) or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.
Implement the Trie class:
Trie()
Initializes the trie object.void insert(String word)
Inserts the stringword
into the trie.boolean search(String word)
Returnstrue
if the stringword
is in the trie (i.e., was inserted before), andfalse
otherwise.boolean startsWith(String prefix)
Returnstrue
if there is a previously inserted stringword
that has the prefixprefix
, andfalse
otherwise.
Example 1:
Input
["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
[[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
Output
[null, null, true, false, true, null, true]
Explanation
Trie trie = new Trie();
trie.insert("apple");
trie.search("apple"); // return True
trie.search("app"); // return False
trie.startsWith("app"); // return True
trie.insert("app");
trie.search("app"); // return True
Constraints:
1 <= word.length, prefix.length <= 2000
word
andprefix
consist only of lowercase English letters.- At most
3 * 104
calls in total will be made toinsert
,search
, andstartsWith
.
Solution
1. Problem Deconstruction
Technical Rewrite
A trie is a rooted tree structure for storing dynamic string sets with the following formal specifications:
- Node Definition: Each vertex contains:
- A mapping (child nodes, )
- A terminal flag
- Operations:
- Insert():
:
- Search():
:
- startsWith():
:
- Insert():
Beginner Rewrite
Imagine organizing words in a tree where each branch represents a letter. To store “apple”:
- Start at root → ‘a’ → ‘p’ → ‘p’ → ‘l’ → ‘e’ (mark ‘e’ as word end).
Operations: - Insert: Build letter-by-letter branches.
- Search: Follow letters to check if path exists AND last letter is marked as word end.
- startsWith: Follow letters to check if path exists (last letter doesn’t need to be word end).
Mathematical Rewrite
Let be the set of inserted words. Define trie where:
- : root node
- has
- (labeled edges)
Invariant: , path with .
Operations maintain this invariant via edge additions and updates.
Constraint Analysis
- :
- Limits worst-case per-op steps to 2000 → per op acceptable.
- Edge Case: Max-length words amplify memory consumption (node explosion).
- Lowercase letters only ():
- Permits fixed-size child arrays (26 pointers/node).
- Edge Case: Alphabet constraint simplifies child storage but wastes space for sparse nodes.
- Total calls :
- Worst-case: inserts → nodes.
- Edge Case: Memory explosion (12+ GB in Python) if no shared prefixes.
2. Intuition Scaffolding
Analogies
- Real-World (Postal System):
- Root = main post office.
- Child nodes = regional hubs (routed by zip code digits).
- = delivery destination exists.
- Why relevant: Hierarchical prefix matching mirrors zip code lookup.
- Gaming (RPG Skill Tree):
- Nodes = combat moves.
- Path = combo sequence.
- Search = exact combo exists.
- startsWith = partial combo valid.
- Math (Finite Automaton):
- Trie = DFA with states , alphabet .
- = accepting states.
- Search = lands in accepting state.
- startsWith = is a valid path.
Common Pitfalls
- Missing Terminal Flags:
- Insert(“apple”) → Search(“app”) returns incorrectly.
- Shared-Prefix Overwrite:
- Insert(“app”) after “apple” fails to mark ‘p’ as terminal.
- Case Sensitivity:
- Assuming uppercase support when problem specifies lowercase.
- Linear Child Search:
- Using lists instead of dicts/arrays → per char vs .
- Memory Bloat:
- Fixed-size arrays (26 children/node) waste bytes/node.
- Solution: Dictionary for sparse children.
3. Approach Encyclopedia
Brute Force (List Storage)
class BadTrie:
def __init__(self):
self.words = set() # Stores all inserted words
def insert(self, word):
self.words.add(word) # O(1) amortized
def search(self, word):
return word in self.words # O(n) per op
def startsWith(self, prefix):
for w in self.words: # O(n*L)
if w.startswith(prefix):
return True
return False
Complexity:
- Time: Insert , Search , StartsWith
- Space: (store all words)
Why Fail: - , → StartsWith worst-case per call → ops total.
Optimal (Trie with Dict Children)
Visualization (Insert “apple”, “app”):
root: {}
insert("apple"):
root → 'a' → Node1 (children={})
Node1 → 'p' → Node2
Node2 → 'p' → Node3
Node3 → 'l' → Node4
Node4 → 'e' → Node5 (isEnd=True)
insert("app"):
root → 'a' → Node1 → 'p' → Node2 → 'p' → Node3
set Node3.isEnd = True
Pseudocode:
class TrieNode:
children: dict # char → TrieNode
is_end: bool
class Trie:
__init__():
root = TrieNode()
insert(word):
node = root
for c in word:
if c not in node.children:
node.children[c] = TrieNode()
node = node.children[c]
node.is_end = True
search(word):
node = root
for c in word:
if c not in node.children:
return False
node = node.children[c]
return node.is_end
startsWith(prefix):
node = root
for c in prefix:
if c not in node.children:
return False
node = node.children[c]
return True # Path exists
Complexity Proof:
- Time: per op (each char processed once).
- Space: where .
- Worst-case: (no shared prefixes).
- Best-case: (all words share prefix).
4. Code Deep Dive
class TrieNode:
def __init__(self):
self.children = {} # Dict avoids fixed-size array waste
self.is_end = False # Marks word termination
class Trie:
def __init__(self):
self.root = TrieNode() # Dummy root (no char)
def insert(self, word: str) -> None:
node = self.root
for char in word:
# If child missing, create new node
if char not in node.children:
node.children[char] = TrieNode()
node = node.children[char] # Move down
node.is_end = True # Mark final node as word end
def search(self, word: str) -> bool:
node = self.root
for char in word:
if char not in node.children:
return False # Prefix broken
node = node.children[char]
return node.is_end # Must be exact word end
def startsWith(self, prefix: str) -> bool:
node = self.root
for char in prefix:
if char not in node.children:
return False # Prefix broken
node = node.children[char]
return True # Path exists (regardless of is_end)
Edge Case Handling:
- Duplicate Insertions (e.g., insert “app” twice):
insert
retraverses path, setsis_end=True
(idempotent).
- Substring Search (e.g., “app” after “apple”):
- Initial
search("app")
fails (is_end=False
at second ‘p’). - After inserting “app”,
is_end
set → returns true.
- Initial
- Max-Length Words:
- Loop runs 2000 times → no stack overflow (iterative).
5. Complexity War Room
Hardware-Aware Analysis:
- Time: ops × chars = steps → ~0.6 sec (Python).
- Space:
- Worst-case: nodes.
- Per node: Dict overhead ~240B + 1B bool → 241B/node.
- Total: GB (critical in constrained env).
- Optimization: Use
__slots__
or C++ for production.
Industry Comparison Table:
Approach | Time | Space | Readability | Interview Viability |
---|---|---|---|---|
Brute Force (List) | ★★★★★ | ❌ (Fails large ) | ||
Trie (Array) | ★★★☆☆ | ✅ (Alphabet known) | ||
Trie (Dict) | ★★★★☆ | ✅ (Optimal) |
6. Pro Mode Extras
Variants:
- Wildcard Search (LC 211):
- Modify
search
to handle ‘.’ via DFS/BFS on all children.
def search(self, word: str) -> bool: def dfs(node, i): if i == len(word): return node.is_end if word[i] == '.': for child in node.children.values(): if dfs(child, i+1): return True return False else: if word[i] not in node.children: return False return dfs(node.children[word[i]], i+1) return dfs(self.root, 0)
- Modify
- Compressed Trie:
- Merge chain of single-child nodes → store substrings.
- Cuts to .
Interview Cheat Sheet:
- First Mention: “Trie ops are time, space per insertion.”
- Edge Cases: “Handle empty strings? Constraints say .”
- Optimization Path:
- Dict children for memory.
__slots__
to reduce Python overhead.- C++ for pointer control.