#3 - Longest Substring Without Repeating Characters
Longest Substring Without Repeating Characters
- Difficulty: Medium
- Topics: Hash Table, String, Sliding Window
- Link: https://leetcode.com/problems/longest-substring-without-repeating-characters/
Problem Description
Given a string s
, find the length of the longest substring without duplicate characters.
Example 1:
Input: s = "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.
Example 2:
Input: s = "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.
Example 3:
Input: s = "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Notice that the answer must be a substring, "pwke" is a subsequence and not a substring.
Constraints:
0 <= s.length <= 5 * 104
s
consists of English letters, digits, symbols and spaces.
Solution
1. Problem Deconstruction
Rewrite the Problem
-
Technical Definition
Given a strings
of lengthn
, identify the maximum lengthL
such that there exists a contiguous substrings[i:j]
(where0 ≤ i < j ≤ n
) where all characters in the substring are unique. Formally: -
Beginner Explanation
Find the longest continuous segment in a string where no character repeats. For example, in “apple”, “ap” (length=2) has no repeats, but “app” has a repeated ‘p’. The solution must efficiently scan the string even for very long inputs. -
Mathematical Formulation
Letf(i, j)
be a predicate indicating all characters in substrings[i..j)
are distinct. The goal is to compute:Alternatively, define a function
g(i) = \min \{ j \mid j > i \land \neg f(i, j) \}
. The solution reduces to:
Constraint Analysis
-
0 <= s.length <= 5 * 10⁴
:- Limitation: Rules out O(n²) brute-force approaches (25e⁹ operations worst-case). Requires O(n) or O(n log n).
- Edge Cases:
- Empty string → Output 0.
- Length=1 → Output 1 (trivial unique substring).
- No repeating characters → Entire string length (
L = n
).
-
Character Set (Letters, Digits, Symbols, Spaces):
- Limitation: Finite alphabet size (
|Σ| ≤ 256
for extended ASCII). Enables O(1) space tracking via fixed-size arrays. - Edge Cases:
- Non-letter characters (e.g., “a b c” has valid substring “a b c” if spaces are unique).
- Case sensitivity? Problem states “characters”, so ‘A’ ≠ ‘a’.
- Limitation: Finite alphabet size (
2. Intuition Scaffolding
Analogies
-
Real-World Metaphor (Conveyor Belt Manufacturing)
Imagine inspecting items on a conveyor belt. You need the longest sequence of unique items. When a duplicate appears, you discard all items from the start up to the duplicate. Track the maximum unique sequence observed. -
Gaming Analogy (RPG Quest Items)
Collect consecutive quest items without duplicates. If a duplicate item appears, drop all items collected after its first occurrence. The “inventory size” is the substring length. -
Math Analogy (Interval Injectivity)
Find the longest interval where the character-position mapping is injective. Equivalently, maximizej - i
while maintaining injectivity ofs
restricted to[i, j)
.
Common Pitfalls
- Naive Shrinking: Moving left pointer by 1 (not jump) after duplicate → O(n²) time.
- Global Tracking: Storing global character counts ignores window locality → fails for “abba”.
- Set-Only Approach: Using a set without indices → cannot compute optimal left jump.
- Edge Handling: Forgetting empty string or single-character cases.
- Character Encoding: Assuming only lowercase letters → fails for digits/symbols.
3. Approach Encyclopedia
Approach 1: Brute Force (Triple Nested Loop)
- Pseudocode:
max_len = 0 for start in range(len(s)): seen = set() for end in range(start, len(s)): if s[end] in seen: break seen.add(s[end]) max_len = max(max_len, end - start + 1) return max_len
- Complexity Proof:
- Time: O(n²) worst-case (no duplicates). Sum of arithmetic series:
- Space: O(min(n, |Σ|)) for the set.
- Time: O(n²) worst-case (no duplicates). Sum of arithmetic series:
- Visualization:
s = "abc" Start=0: [a] → [a,b] → [a,b,c] → L=3 Start=1: [b] → [b,c] → L=2 Start=2: [c] → L=1
Approach 2: Sliding Window (Hash Map Tracking)
- Pseudocode:
left = max_len = 0 last_seen = {} # char → last index for right, char in enumerate(s): if char in last_seen and last_seen[char] >= left: left = last_seen[char] + 1 # Jump left to after last duplicate last_seen[char] = right # Update last seen index max_len = max(max_len, right - left + 1) return max_len
- Complexity Proof:
- Time: O(n). Each element processed once.
last_seen
operations O(1). - Space: O(|Σ|) for
last_seen
. Since |Σ| ≤ 256 → O(1).
- Time: O(n). Each element processed once.
- Visualization (
s="abac"
):Step0: left=0, right=0 (a) → last_seen={a:0}, len=1 Step1: left=0, right=1 (b) → last_seen={a:0,b:1}, len=2 Step2: left=0, right=2 (a) → duplicate! left=1 → last_seen={a:2,b:1}, len=2 Step3: left=1, right=3 (c) → last_seen={a:2,b:1,c:3}, len=3
Approach 3: Sliding Window (Fixed-Size Array)
- Pseudocode:
left = max_len = 0 last_index = [-1] * 256 # Initialize with -1 for right in range(len(s)): ascii_val = ord(s[right]) if last_index[ascii_val] >= left: # Duplicate in current window left = last_index[ascii_val] + 1 last_index[ascii_val] = right # Update last index max_len = max(max_len, right - left + 1) return max_len
- Complexity Proof:
- Time: O(n) — single pass, constant-time array access.
- Space: O(1) — fixed-size array (256 integers).
- Visualization (
s="aab"
):Step0: right=0 (a) → last_index[97]=0, left=0, len=1 Step1: right=1 (a) → last_index[97]=0 ≥ left=0 → left=1, update last_index[97]=1, len=1 Step2: right=2 (b) → last_index[98]=-1 → len=2
4. Code Deep Dive
Optimal Solution (Array-Based Sliding Window)
def lengthOfLongestSubstring(s: str) -> int:
n = len(s)
if n == 0: # Edge: empty string
return 0
last_index = [-1] * 256 # Track last occurrence; -1 = not seen
left = max_len = 0
for right in range(n):
char_code = ord(s[right])
# If char seen in current window, jump left to last_index[char] + 1
if last_index[char_code] >= left:
left = last_index[char_code] + 1
last_index[char_code] = right # Update last seen position
current_len = right - left + 1
if current_len > max_len:
max_len = current_len
return max_len
Edge Case Handling
- Example 1 (
s="abcabcbb"
)- At
right=3
(‘a’):last_index[97]=0 ≥ left=0
→left=1
. - Window “bca” at
right=3
→len=3
. - Max remains 3 throughout.
- At
- Example 2 (
s="bbbbb"
)right=0
:last_index[98]=-1
→left=0
,len=1
.right=1
:last_index[98]=0 ≥ left=0
→left=1
,len=1
.- Output=1.
- Example 3 (
s="pwwkew"
)right=2
(‘w’):last_index[119]=1 ≥ left=0
→left=2
.- At
right=4
: Window “wke” →len=3
(max).
5. Complexity War Room
Hardware-Aware Analysis
- Time: O(n) → 5e⁴ operations ≈ 0.1ms (Python), negligible.
- Space: 256 × 4 bytes (int array) = 1KB → Fits in L1 CPU cache.
Industry Comparison Table
Approach | Time | Space | Readability | Interview Viability |
---|---|---|---|---|
Brute Force | O(n²) | O(1) | ★★★★★ | ❌ (Fails n=5e4) |
Sliding Window (Map) | O(n) | O( | Σ | ) |
Array Sliding | O(n) | O(1) | ★★★★☆ | ✅ Best |
6. Pro Mode Extras
Variants
-
At Most K Duplicates (LeetCode 340)
- Track character frequencies. Shrink window when distinct chars > k.
freq = defaultdict(int) left = distinct = max_len = 0 for right, char in enumerate(s): if freq[char] == 0: distinct += 1 freq[char] += 1 while distinct > k: freq[s[left]] -= 1 if freq[s[left]] == 0: distinct -= 1 left += 1 max_len = max(max_len, right - left + 1)
-
Longest Substring with At Least K Repeating (LeetCode 395)
- Divide and conquer on characters with frequency < k.
if len(s) < k: return 0 for char in set(s): if s.count(char) < k: return max(self.longestSubstring(t, k) for t in s.split(char)) return len(s)
Interview Cheat Sheet
- First Mention: Always state time/space complexity upfront.
- Sliding Window Template:
1. Initialize left=0, tracking structure, max_len. 2. Iterate right over [0, n): a. Update tracking with s[right]. b. While window invalid: i. Remove s[left] from tracking. ii. left++. c. Update max_len.
- Test Rigorously: Empty, all unique, all same, “abba”, “tmmzuxt”.