← ~/lc-grind/posts

#424 - Longest Repeating Character Replacement

July 2, 2025

Longest Repeating Character Replacement

Problem Description

You are given a string s and an integer k. You can choose any character of the string and change it to any other uppercase English character. You can perform this operation at most k times.

Return the length of the longest substring containing the same letter you can get after performing the above operations.

Example 1:


Input: s = "ABAB", k = 2
Output: 4
Explanation: Replace the two 'A's with two 'B's or vice versa.

Example 2:


Input: s = "AABABBA", k = 1
Output: 4
Explanation: Replace the one 'A' in the middle with 'B' and form "AABBBBA".
The substring "BBBB" has the longest repeating letters, which is 4.
There may exists other ways to achieve this answer too.

Constraints:

  • 1 <= s.length <= 105
  • s consists of only uppercase English letters.
  • 0 <= k <= s.length

Solution

1. Problem Deconstruction

Technical Rewording

Given a string s (length n) of uppercase English letters and integer k, find the maximum length L of a contiguous substring where the number of characters not matching the most frequent character in that substring is ≤ k. Operations allow converting any character to another uppercase letter at most k times. The solution must efficiently handle constraints: 1 ≤ n ≤ 10^5, 0 ≤ k ≤ n.

Beginner-Friendly Explanation

You have a string of capital letters (e.g., “AABABBA”) and can change up to k letters to any other capital letter. Your goal is to find the longest contiguous segment that can be turned into identical letters (e.g., changing the middle ‘A’ to ‘B’ in “AABABBA” gives “AABBBBA”, creating a “BBBB” segment of length 4).

Mathematical Formulation

Let s[i:j] be a contiguous substring of s (length L = j - i + 1). For some character c in s[i:j], define freq(c) as its frequency. The condition is:

Lmaxcfreq(c)kL - \max_{c} \text{freq}(c) \leq k

Equivalently:

maxcfreq(c)Lk\max_{c} \text{freq}(c) \geq L - k

Maximize L over all possible i, j.

Constraint Analysis

  1. 1 ≤ s.length ≤ 10^5:

    • Eliminates O(n²) brute-force (10¹⁰ operations worst-case). Requires O(n log n) or O(n).
    • Edge: Large input stresses efficiency; algorithms must avoid nested loops.
  2. s is uppercase English letters:

    • Alphabet size |Σ| = 26 (constant). Allows O(n) solutions with fixed-size frequency arrays.
    • Edge: No Unicode/case sensitivity; simplifies hashing.
  3. 0 ≤ k ≤ s.length:

    • k = 0: Substring must be naturally uniform (e.g., “AAA” in “AAB” fails).
    • k ≥ substring length: Whole substring can be converted (output = substring length).
    • Edge: When k = 0, solution reduces to longest unmodified run.

2. Intuition Scaffolding

Analogies

  1. Real-World (Factory Conveyor Belt):
    Boxes of 26 colors move on a belt. You can repaint ≤ k boxes. Find the longest segment where repainting makes all boxes the same color by targeting the most frequent color.

  2. Gaming (RPG Quest):
    A path has 26 symbol types. Use k “shape-shift potions” to make the longest contiguous path uniform. Optimal strategy: target the dominant symbol.

  3. Math (Maximizing Frequency-Density):
    Find interval [i, j] maximizing j - i + 1 such that the frequency-density ( \max \text{freq}(c) / L ) \geq 1 - k/L.

Common Pitfalls

  1. Assuming Global Character Target:
    Different windows may target different characters (e.g., “AABA” targets ‘A’; “BABB” targets ‘B’).

  2. Static Sliding Window Start:
    Valid windows can start anywhere (not just index 0).

  3. Ignoring Frequency Updates:
    Removing/adding characters requires updating frequency counts.

  4. Handling k = 0 Incorrectly:
    Requires contiguous identical characters (no changes allowed).

  5. Outdated Max-Frequency:
    Using a historical max after removals overestimates validity (e.g., window “ABAB” with historical max=3 invalidly satisfies 4-3=1 ≤ k=1).


3. Approach Encyclopedia

Brute Force

Pseudocode:

max_len = 0
for i in range(len(s)):
    freq = [0] * 26
    for j in range(i, len(s)):
        freq[char_index(s[j])] += 1
        max_freq = max(freq)
        if (j - i + 1) - max_freq <= k:
            max_len = max(max_len, j - i + 1)
return max_len

Complexity:

  • Time: O(n²) (n choices for i, n for j).
    • Worst-case: Σ_{i=0}^{n-1} Σ_{j=i}^{n-1} 1 = n(n+1)/2 → O(n²).
  • Space: O(1) (fixed 26-element array).
    Visualization:
s = "AABABBA", k=1
i=0: [A], [AA], [AAB] (invalid: 3-2=1≤1? ✓), [AABA] (✓), [AABAB] (✗)
i=1: [A], [AB] (✓), ... 

Optimal: Sliding Window with Rescanned Max-Frequency

Intuition: Expand window rightwards. If window becomes invalid, shrink leftwards. After each change, rescan frequency array to update current max-frequency.
Pseudocode:

left = 0, max_len = 0, freq = [0]*26
for right in range(len(s)):
    freq[right_char] += 1
    current_max = max(freq)  # O(26)
    while (window_size - current_max) > k:
        freq[left_char] -= 1
        left += 1
        current_max = max(freq)  # O(26)
    max_len = max(max_len, window_size)
return max_len

Complexity Proof:

  • Time: O(n ⋅ |Σ|) = O(26n) = O(n). Each element added/removed once. Each operation scans 26 elements.
  • Space: O(|Σ|) = O(1).
    Visualization:
s = "AABABBA", k=1
Step0: [A]        → max_freq=1, valid (1-1=0≤1)
Step1: [AA]       → max_freq=2, valid (2-2=0≤1)
Step2: [AAB]      → max_freq=2, valid (3-2=1≤1)
Step3: [AABA]     → max_freq=3, valid (4-3=1≤1) → max_len=4
Step4: [AABAB]    → max_freq=3, invalid (5-3=2>1)
  Shrink: [ABAB]  → rescan: max_freq=2 → 4-2=2>1? ✗ → shrink again
          [BAB]   → max_freq=2? ✗ → rescan: max_freq=1 → 3-1=2>1? ✗ → shrink
          [AB]    → valid → continue

4. Code Deep Dive

def characterReplacement(s: str, k: int) -> int:
    n = len(s)
    left = 0
    max_len = 0
    freq = [0] * 26  # Frequency count for 'A' to 'Z'
    
    for right in range(n):
        # Add s[right] to window
        r_idx = ord(s[right]) - ord('A')
        freq[r_idx] += 1
        
        # Compute current max frequency in window (O(26))
        current_max = max(freq)
        
        # Shrink window from left if invalid
        while (right - left + 1) - current_max > k:
            l_idx = ord(s[left]) - ord('A')
            freq[l_idx] -= 1
            left += 1
            current_max = max(freq)  # Rescan after removal
        
        # Update max valid window length
        window_size = right - left + 1
        max_len = max(max_len, window_size)
    
    return max_len

Line-by-Line Annotations:

  • Lines 1-4: Initialize pointers, result, and frequency array.
  • Line 7: Convert character to index (0-25).
  • Line 8: Increment frequency of right character.
  • Line 11: Scan frequency array to find current max (critical for validity check).
  • Lines 14-17: While window is invalid, remove left character, increment left, and rescan max-frequency.
  • Lines 20-21: Update max_len with current valid window size.

Edge Case Handling:

  • k = 0 (Example: s="AAB"):
    • Window shrinks immediately when non-uniform (e.g., [AAB]current_max=2, 3-2=1>0 → shrink to [AB] → invalid → shrink to [B]).
  • k ≥ window size (Example: s="AB", k=2):
    • Entire string is valid; max_len = n (no shrinking occurs).
  • All Identical (Example: s="AAAA", k=0):
    • current_max equals window size; condition always holds → max_len = n.

5. Complexity War Room

Hardware-Aware Analysis:

  • At n = 10^5, frequency scans (26 per operation) use ≈ 2.6M operations (negligible on modern CPUs).
  • Memory: 26 integers (208 bytes), fits in L1 cache.

Industry Comparison:

Approach Time Space Readability Interview Viability
Brute Force O(n²) O(1) 9/10 ❌ (Fails large n)
Sliding Window* O(n) O(1) 8/10 ✅ (Optimal)

*Without rescan: O(n) but may use historical max (advanced).


6. Pro Mode Extras

Variants:

  1. Two Changes Allowed (LC 123 Style): Track separate counts for two transactions, extending to 2D DP.
  2. Circular String: Concatenate s + s and constrain window to length ≤ n.
  3. Lowercase Included: Double alphabet size (52), but still O(n).

Interview Cheat Sheet:

  • First Mention Complexity: “This runs in O(n) time with O(1) space due to fixed alphabet.”
  • Key Insight: “Validity depends on window_length - max_frequency ≤ k; sliding window expands/shrinks to maintain this.”
  • Corner Cases: Always test k=0, k ≥ n, and uniform strings.