#76 - Minimum Window Substring
Minimum Window Substring
- Difficulty: Hard
- Topics: Hash Table, String, Sliding Window
- Link: https://leetcode.com/problems/minimum-window-substring/
Problem Description
Given two strings s
and t
of lengths m
and n
respectively, return the minimum window substring of s
such that every character in t
(including duplicates) is included in the window. If there is no such substring, return the empty string ""
.
The testcases will be generated such that the answer is unique.
Example 1:
Input: s = "ADOBECODEBANC", t = "ABC"
Output: "BANC"
Explanation: The minimum window substring "BANC" includes 'A', 'B', and 'C' from string t.
Example 2:
Input: s = "a", t = "a"
Output: "a"
Explanation: The entire string s is the minimum window.
Example 3:
Input: s = "a", t = "aa"
Output: ""
Explanation: Both 'a's from t must be included in the window.
Since the largest window of s only has one 'a', return empty string.
Constraints:
m == s.length
n == t.length
1 <= m, n <= 105
s
andt
consist of uppercase and lowercase English letters.
Follow up: Could you find an algorithm that runs in O(m + n)
time?
Solution
1. Problem Deconstruction
Technical Restatement:
Given strings s
(length m
) and t
(length n
), find the minimal-length contiguous substring w
of s
such that for every character c
in t
, the frequency of c
in w
is ≥ its frequency in t
. If no such w
exists, return ""
. The solution must handle m, n ≤ 1e5
efficiently.
Beginner-Friendly Restatement:
You have a long string s
and a shorter string t
. Find the smallest part of s
that contains all letters from t
in the same or greater quantities. For example, if t
has two 'a’s, the part of s
must have at least two 'a’s. If impossible, return an empty string.
Mathematical Formulation:
Let:
- and be input strings.
- = frequency of in .
- = frequency of in substring .
Find () minimizing such that:
If no such exist, return .
Constraint Analysis:
m, n ≤ 1e5
:- Limits solutions to or better; is infeasible ( operations).
- Edge: When , return
""
immediately (window cannot contain characters).
- Uppercase/lowercase English letters:
- 52 distinct characters → use fixed-size arrays (size 128) for per-operation frequency tracking.
- Answer is unique:
- Simplifies output; no need to handle multiple valid minimal windows.
- :
- Edge: → impossible.
- Edge: has duplicates (e.g., =“aa”) → window must satisfy character counts, not just presence.
2. Intuition Scaffolding
Analogies:
- Real-World Metaphor:
Collecting rare coins (t
) from a conveyor belt (s
). Find the shortest belt segment containing all coins in required quantities. - Gaming Analogy:
Side-scroller game: Pass through a tunnel (s
) and collect power-ups (t
). Find the minimal tunnel segment containing all power-ups in exact quantities. - Math Analogy:
Minimize interval where vector dominates component-wise.
Common Pitfalls:
- Ignoring duplicates: Treating =“aa” as requiring only one ‘a’.
- Suboptimal checks: Verifying window validity by full frequency scan per window () without incremental tracking.
- Premature termination: Stopping after the first valid window without shrinking to find smaller ones.
- Pointer errors: Shrinking the window beyond validity or moving
left
incorrectly. - Overlooking : Not handling the trivial impossibility case early.
3. Approach Encyclopedia
Brute Force (Naive)
Pseudocode:
min_len = INF
for i in range(len(s)):
freq_window = [0]*128
for j in range(i, len(s)):
freq_window[ord(s[j])] += 1
if is_valid(freq_window, t_counts): # Check all c: freq_window[c] >= t_counts[c]
if (j-i+1) < min_len:
min_len = j-i+1
min_start = i
return s[min_start:min_start+min_len] if min_len != INF else ""
Complexity Proof:
- Time: → . For , (infeasible).
- Space: (fixed-size arrays).
Visualization:
s: A D O B E C O D E B A N C
t: A B C
Brute force checks:
[0:0] "A" → ✗
[0:1] "AD" → ✗
...
[0:5] "ADOBEC" → ✓ (len=6)
[1:5] "DOBEC" → ✗
...
Optimal: Sliding Window
Pseudocode:
if n > m: return ""
t_counts = [0]*128; required = 0
for c in t: t_counts[ord(c)] += 1
for count in t_counts:
if count > 0: required += 1 // Distinct chars in t
window_counts = [0]*128
left = formed = 0; min_len = INF
for right in range(m):
c = s[right]; window_counts[ord(c)] += 1
if window_counts[ord(c)] == t_counts[ord(c)]:
formed += 1 // Satisfy condition for c
while formed == required: // Valid window
min_len = min(min_len, right-left+1)
if (right-left+1) == min_len: min_left = left
d = s[left]; window_counts[ord(d)] -= 1
if window_counts[ord(d)] < t_counts[ord(d)]:
formed -= 1 // Break condition
left += 1 // Shrink window
return s[min_left:min_left+min_len] if min_len != INF else ""
Complexity Proof:
- Time: .
- : Compute
t_counts
andrequired
. - : Each character enters/exits window once → operations.
- : Compute
- Space: (two size-128 arrays).
Visualization:
s: A D O B E C O D E B A N C
t: A B C (t_counts: A:1, B:1, C:1; required=3)
Step 0: [A] → formed=1 (A satisfied)
...
Step 5: [ADOBEC] → formed=3 (valid), min_len=6
Shrink: [DOBEC] → formed=2 (A missing) → stop
Step 10: [DOBECODEBA] → formed=2
Step 11: [DOBECODEBAN] → formed=2
Step 12: [DOBECODEBANC] → formed=3 (valid)
Shrink:
[OBECODEBANC] → formed=3 → min_len=11
...
[BANC] → formed=3 → min_len=4
Remove 'B' → formed=2 → stop
Final: "BANC" (len=4)
4. Code Deep Dive
Optimal Solution (Python):
def minWindow(s: str, t: str) -> str:
m, n = len(s), len(t)
if n > m:
return ""
t_counts = [0] * 128
for c in t:
t_counts[ord(c)] += 1
required = sum(1 for count in t_counts if count > 0)
window_counts = [0] * 128
left = formed = 0
min_len = float('inf')
min_left = 0
for right in range(m):
c = s[right]
idx = ord(c)
window_counts[idx] += 1
if window_counts[idx] == t_counts[idx]:
formed += 1
while formed == required and left <= right:
if right - left + 1 < min_len:
min_len = right - left + 1
min_left = left
d = s[left]
idx_d = ord(d)
window_counts[idx_d] -= 1
if window_counts[idx_d] < t_counts[idx_d]:
formed -= 1
left += 1
return "" if min_len == float('inf') else s[min_left:min_left+min_len]
Line-by-Line Annotations:
if n > m: return ""
: Early exit if impossible.t_counts
initialization: Stores frequency of each char int
.required = ...
: Counts distinct characters int
.- Sliding window loop (
right
pointer):window_counts[ord(c)] += 1
: Includes[right]
in window.if window_counts[...] == t_counts[...]: formed += 1
: Tracks when a char’s count is satisfied.
- Inner loop (shrinking
left
pointer):if right-left+1 < min_len
: Update minimal valid window.window_counts[ord(d)] -= 1
: Removes[left]
from window.if window_counts[...] < t_counts[...]: formed -= 1
: Breaks condition if removal invalidates a char.
- Return result: Returns minimal window or
""
.
Edge Case Handling:
- Example 3 (
s="a", t="aa"
):n>m
→ return""
(line 1). - Example 2 (
s="a", t="a"
):right=0
:window_counts['a']=1 == t_counts['a']=1
→formed=1
.- Enter inner loop:
min_len=1
,min_left=0
. - Remove ‘a’:
formed=0
→ return"a"
.
- Duplicate handling: Track exact counts (e.g.,
t="aa"
requireswindow_counts['a']≥2
).
5. Complexity War Room
Hardware-Aware Analysis:
- At , :
- Time: operations (feasible).
- Space: Two 128-integer arrays → 1KB (fits in L1 cache).
Industry Comparison Table:
Approach | Time | Space | Readability | Interview Viability |
---|---|---|---|---|
Brute Force | 9/10 | ❌ (Fails large ) | ||
Sliding Window | 8/10 | ✅ Optimal |
6. Pro Mode Extras
Variants:
- At Most Distinct Characters:
Find minimal window containing ≤ distinct chars (LC 340). Useformed
= count of distinct chars. - With Mismatches:
Minimal window where for all . - Multiple Transactions:
Extend to transactions (LC 123, 188).
Interview Cheat Sheet:
- First Step: Clarify constraints (case sensitivity? empty inputs?).
- Key Insight: Sliding window + frequency tracking +
formed
counter. - Mention: time, space (fixed alphabet).
- Verify: Check and with duplicates.
- Test: Examples 1-3 during coding.