#242 - Valid Anagram
Valid Anagram
- Difficulty: Easy
- Topics: Hash Table, String, Sorting
- Link: https://leetcode.com/problems/valid-anagram/
Problem Description
Given two strings s
and t
, return true
if t
is an anagram of s
, and false
otherwise.
Example 1:
Input: s = “anagram”, t = “nagaram”
Output: true
Example 2:
Input: s = “rat”, t = “car”
Output: false
Constraints:
1 <= s.length, t.length <= 5 * 104
s
andt
consist of lowercase English letters.
Follow up: What if the inputs contain Unicode characters? How would you adapt your solution to such a case?
Solution
1. Problem Deconstruction
Rewrite the Problem
-
Technical:
Given two stringss
andt
, determine ift
is a permutation ofs
under these conditions:- (string lengths are equal)
- For every character (alphabet), the frequency
- Return
true
if conditions hold;false
otherwise.
-
Beginner:
“Check if two words have exactly the same letters in the same quantities, ignoring order. For example, ‘listen’ and ‘silent’ are anagrams because both use 1 L, 1 I, 1 S, 1 E, 1 N, 1 T.” -
Mathematical:
Let and be sequences of characters.
Define (frequency of in ).
Then, is an anagram of iff:
Constraint Analysis
- :
- Limits solutions to or better (brute force fails at → operations).
- Implies edge case: Early termination if .
- Lowercase English letters:
- Alphabet size → constant-space frequency counters possible.
- Hidden edge: Non-alphabet characters excluded (simplifies mapping).
- Follow-up (Unicode):
- unbounded → requires dynamic frequency storage (hash tables).
- Edge: Characters outside BMP (4-byte Unicode) must be handled.
2. Intuition Scaffolding
Analogies
- Real-World Metaphor:
“Like verifying two identical sets of Scrabble tiles. Each tile must appear in both sets with identical counts, regardless of arrangement.” - Gaming Analogy:
“In RPG inventory management, check if two loot bags contain the same items in identical quantities, ignoring order.” - Math Analogy:
“Multiset equality: , where denotes the multiset of characters.”
Common Pitfalls
- Length Mismatch Oversight:
Not checking first → wasted computation. - Case Sensitivity:
Forgetting to normalize case (though constraints specify lowercase). - Frequency Underflow:
Decrementing counts below zero without validation → false negatives. - Non-Existent Key Errors:
Accessing uninitialized hash table keys for Unicode characters. - Suboptimal Sorting:
Using sort when counting suffices for fixed alphabets.
3. Approach Encyclopedia
Approach 1: Brute Force (Naive Check)
- Pseudocode:
function isAnagram(s, t): if len(s) != len(t): return False for char in s: if count(char, s) != count(char, t): return False return True
- Complexity Proof:
- Time: → For each of chars in , scan and ( per char).
- Space: (no auxiliary storage).
- Visualization:
s: "abc" | t: "cba" Step 1: Count 'a' in s → 1, in t → 1 → OK Step 2: Count 'b' in s → 1, in t → 1 → OK Step 3: Count 'c' in s → 1, in t → 1 → OK → True
Approach 2: Sorting
- Pseudocode:
function isAnagram(s, t): return sorted(s) == sorted(t) // After length check
- Complexity Proof:
- Time: → Dominated by sorting two strings.
- Space: → Storage for sorted strings (non-in-place sort).
- Visualization:
s: "nagaram" → sorted: ['a','a','a','g','m','n','r'] t: "anagram" → sorted: ['a','a','a','g','m','n','r'] → True
Approach 3: Fixed-Size Frequency Array (Optimal for English)
- Pseudocode:
function isAnagram(s, t): if len(s) != len(t): return False count = [0] * 26 for char in s: count[ord(char) - ord('a')] += 1 for char in t: index = ord(char) - ord('a') count[index] -= 1 if count[index] < 0: // Early exit return False return True
- Complexity Proof:
- Time: → Two linear passes over and .
- Space: → (104 bytes for 26 ints).
- Visualization:
s: "rat" → count: [ a:1, r:1, t:1 ] t: "car" → 'c' → count[2] = 0-1 = -1 → return False
Approach 4: Hash Table (Optimal for Unicode)
- Pseudocode:
function isAnagram(s, t): if len(s) != len(t): return False freq = {} for char in s: freq[char] = freq.get(char, 0) + 1 for char in t: if char not in freq: // Key exists? return False freq[char] -= 1 if freq[char] == 0: del freq[char] // Optional cleanup return len(freq) == 0 // All keys exhausted?
- Complexity Proof:
- Time: → Each pass is per char (hash table operations).
- Space: → = unique chars in (worst-case for Unicode).
- Visualization:
s: "😊a" → freq: { '😊':1, 'a':1 } t: "a😊" → 'a': freq{'😊':1, 'a':0} → '😊': freq{} → len=0 → True
4. Code Deep Dive
Optimal Solution (Fixed Alphabet)
def isAnagram(s: str, t: str) -> bool:
# Edge: Length mismatch (immediate failure)
if len(s) != len(t): # O(1) time
return False
# Initialize frequency array for 'a'-'z'
count = [0] * 26 # O(26) space
# Build frequency map for s
for char in s: # O(n)
idx = ord(char) - ord('a') # Map char to 0-25
count[idx] += 1 # Increment count
# Validate against t
for char in t: # O(n)
idx = ord(char) - ord('a')
count[idx] -= 1 # Decrement count
# Underflow check: t has more 'char' than s
if count[idx] < 0: # Early exit
return False
# All counts balanced (no overflow possible due to length check)
return True
Edge Case Handling
- Example 1 (s=“anagram”, t=“nagaram”):
- Lengths equal (7).
- After processing
s
:a:3, n:1, g:1, r:1, m:1
. - Processing
t
: Decrements preserve counts → no negatives → returnsTrue
.
- Example 2 (s=“rat”, t=“car”):
- Lengths equal (3).
- After
s
:r:1, a:1, t:1
. - First char in
t
:'c'
→count[2] = 0-1 = -1
→ returnsFalse
(line 15).
5. Complexity War Room
Hardware-Aware Analysis
- Fixed-Array (English):
- → 100,000 iterations (2 passes).
- Memory: 26 ints = 208 bytes (Python int=8 bytes) → fits in L1 cache.
- Hash Table (Unicode):
- Worst-case: unique Unicode chars → 50,000 key-value pairs.
- Memory: → fits in L3 cache.
Industry Comparison Table
Approach | Time | Space | Readability | Interview Viability |
---|---|---|---|---|
Brute Force | 10/10 | ❌ Fails | ||
Sorting | 9/10 | ⚠️ Suboptimal | ||
Fixed Array | 8/10 | ✅ Optimal | ||
Hash Table | 7/10 | ✅ Unicode Ready |
6. Pro Mode Extras
Variants
-
Group Anagrams (LC 49):
- Problem: Group words by anagram families.
- Solution: Use sorted word as hash key →
groups = { sorted(word): [words] }
. - Code Snippet:
groups = {} for word in words: key = ''.join(sorted(word)) groups.setdefault(key, []).append(word)
-
Find All Anagrams (LC 438):
- Problem: Find starting indices of all anagrams of
p
ins
. - Solution: Sliding window with frequency array.
- Initialize frequency array for
p
. - Use two pointers to maintain window in
s
. - Update frequency counts incrementally when sliding.
- Initialize frequency array for
- Problem: Find starting indices of all anagrams of
Interview Cheat Sheet
- First Response:
“Check lengths → unequal means immediate failure.” - Complexity Trade-offs:
“For English, fixed array is optimal ( space). For Unicode, hash tables scale.” - Edge Cases:
- Length mismatch
- Empty strings ( → trivially anagrams)
- Identical strings (anagrams)
- Case sensitivity (if constraints didn’t specify lowercase)
- Optimization Insight:
“Early termination on frequency underflow saves unnecessary iterations.”