
#80 - Remove Duplicates from Sorted Array II

September 8, 2025

Problem Description

Given an integer array nums sorted in non-decreasing order, remove some duplicates in-place such that each unique element appears at most twice. The relative order of the elements should be kept the same.

Since it is impossible to change the length of the array in some languages, you must instead have the result be placed in the first part of the array nums. More formally, if there are k elements after removing the duplicates, then the first k elements of nums should hold the final result. It does not matter what you leave beyond the first k elements.

Return k after placing the final result in the first k slots of nums.

Do not allocate extra space for another array. You must do this by modifying the input array in-place with O(1) extra memory.

Custom Judge:

The judge will test your solution with the following code:


int[] nums = [...]; // Input array
int[] expectedNums = [...]; // The expected answer with correct length

int k = removeDuplicates(nums); // Calls your implementation

assert k == expectedNums.length;
for (int i = 0; i < k; i++) {
    assert nums[i] == expectedNums[i];
}

If all assertions pass, then your solution will be accepted.
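The same harness can be sketched in Python for local testing (the check logic mirrors the Java snippet above; the `judge` function name and its signature are ours):

```python
def judge(remove_duplicates, nums, expected_nums):
    # Call the implementation under test; it must mutate nums in place.
    k = remove_duplicates(nums)
    # The judge only inspects the return value and the first k slots.
    assert k == len(expected_nums)
    for i in range(k):
        assert nums[i] == expected_nums[i]
    return True
```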

Example 1:


Input: nums = [1,1,1,2,2,3]
Output: 5, nums = [1,1,2,2,3,_]
Explanation: Your function should return k = 5, with the first five elements of nums being 1, 1, 2, 2 and 3 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Example 2:


Input: nums = [0,0,1,1,1,1,2,3,3]
Output: 7, nums = [0,0,1,1,2,3,3,_,_]
Explanation: Your function should return k = 7, with the first seven elements of nums being 0, 0, 1, 1, 2, 3 and 3 respectively.
It does not matter what you leave beyond the returned k (hence they are underscores).

Constraints:

  • 1 <= nums.length <= 3 * 10^4
  • -10^4 <= nums[i] <= 10^4
  • nums is sorted in non-decreasing order.

Solution

1. Problem Deconstruction

Technical Restatement
Given a sorted integer array nums in non-decreasing order, implement an in-place algorithm that removes duplicates such that each distinct element appears at most twice. The relative order of elements must be preserved. The solution must modify the input array so that the first k elements contain the valid result, and return k. No auxiliary data structures may be used; space complexity must be O(1).

Beginner-Friendly Explanation
You have a list of numbers that are already sorted from smallest to largest. Your task is to clean up this list so that no number appears more than two times in a row. You must do this by rearranging the list itself without creating a new list. The first part of the list should contain the cleaned-up version, and you need to return how many numbers are in this cleaned-up section. The numbers beyond that point don’t matter.

Mathematical Formulation
Let nums[0..n-1] represent the input array of length n. We seek the largest integer k and a modified array such that:

  1. \forall i \in [0, k-2]: nums[i] \leq nums[i+1] (sorted order preserved)
  2. \forall x \in \mathbb{Z}: \sum_{i=0}^{k-1} \mathbb{1}_{nums[i]=x} \leq 2 (at most two occurrences per value)
  3. The first k elements satisfy conditions 1 and 2, while elements beyond index k-1 are irrelevant.

Constraint Analysis

  • 1 \leq nums.length \leq 3 \times 10^4: The array can be large (up to 30,000 elements), ruling out O(n^2) solutions. Linear or near-linear time complexity is required.
  • -10^4 \leq nums[i] \leq 10^4: Values are bounded integers. This doesn’t directly impact the algorithm but confirms that integer comparisons are efficient.
  • Non-decreasing order: The sorted property is critical. It means duplicates are always consecutive, allowing efficient tracking without hashing or searching.

Hidden Edge Cases

  • All elements identical (e.g., [1,1,1,1]) → Output k = 2
  • Single element → Output k = 1
  • No duplicates (e.g., [1,2,3]) → Output k = 3
  • All elements distinct except one triple → Output k = n - 1
  • Empty array (ruled out here, since the constraint guarantees n \geq 1)

2. Intuition Scaffolding

Analogies

  1. Real-World Metaphor (Stream Filtering): Imagine a conveyor belt of colored balls sorted by color. You’re allowed to let at most two balls of the same color pass through. As the balls come, you decide which ones to keep, physically shifting them forward on the belt. The extra balls fall off the end.
  2. Gaming Analogy (Inventory Management): In a sorted inventory list, you can’t hold more than two of any item. You must compact your inventory by shifting items left, overwriting excess duplicates.
  3. Math Analogy (Sequence Trimming): Given a non-decreasing sequence, define a function f(i) that counts the consecutive occurrences of nums[i] ending at index i. Keep element i only while f(i) \leq 2; skip it once f(i) > 2.

Common Pitfalls

  1. Using Extra Space: Attempting to create a new array violates the in-place requirement.
  2. Global Counting: Counting duplicates for each value globally (instead of consecutively) is inefficient and unnecessary due to sortedness.
  3. Overwriting Too Early: Shifting elements immediately upon finding a duplicate disrupts the array structure and causes loss of data.
  4. Two-Pointer Misuse: Incorrectly updating the slow pointer without tracking occurrence count.
  5. Boundary Errors: Not handling the start and end of the array correctly, leading to off-by-one errors.

3. Approach Encyclopedia

Brute Force (Not Viable)
Inefficiently shift elements for every excess duplicate.
Pseudocode:

function removeDuplicates(nums):
    n = length(nums)
    i = 0
    while i < n:
        count = 1
        j = i + 1
        while j < n and nums[j] == nums[i]:
            count += 1
            if count > 2:
                shift all elements from j to end left by 1
                n -= 1  # Reduce effective length
            else:
                j += 1
        i = j
    return n

Complexity: Time O(n^2) due to shifting. Space O(1).
Visualization:

Initial: [1,1,1,2,2,3]
i=0, count=1 → j=1: count=2 → j=2: count=3 > 2 → shift tail [2,2,3] left by one: [1,1,2,2,3,3]
New effective array: [1,1,2,2,3] → n=5
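The brute force above can be made runnable in Python; slice assignment stands in for the element-by-element shift (the function name is ours):

```python
def remove_duplicates_brute(nums):
    # Delete every third-and-beyond consecutive copy by shifting the tail left.
    n = len(nums)
    i = 0
    while i < n:
        count = 1
        j = i + 1
        while j < n and nums[j] == nums[i]:
            count += 1
            if count > 2:
                # Shift nums[j+1..n-1] left by one slot, dropping the excess copy.
                nums[j:n - 1] = nums[j + 1:n]
                n -= 1  # Reduce effective length
            else:
                j += 1
        i = j
    return n
```

Each shift is O(n), so runs of duplicates drive the worst case to O(n^2), exactly as the complexity analysis below states.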

Optimal Approach (Two Pointers with State Tracking)
Use a slow pointer to write and a fast pointer to read. Track the count of consecutive duplicates.
Pseudocode:

function removeDuplicates(nums):
    slow = 0
    count = 1
    for fast in range(1, len(nums)):
        if nums[fast] == nums[slow]:
            count += 1
        else:
            count = 1
        if count <= 2:
            slow += 1
            nums[slow] = nums[fast]
    return slow + 1

Complexity: Time O(n): single pass. Space O(1): two pointers and a counter.
Visualization:

Step-by-step for [1,1,1,2,2,3]:
slow=0, count=1
fast=1: nums[1]==1 → count=2 ≤2 → slow=1, nums[1]=1
fast=2: nums[2]==1 → count=3 >2 → skip
fast=3: nums[3]==2 → count=1 ≤2 → slow=2, nums[2]=2
fast=4: nums[4]==2 → count=2 ≤2 → slow=3, nums[3]=2
fast=5: nums[5]==3 → count=1 ≤2 → slow=4, nums[4]=3
Result: [1,1,2,2,3] and k=5
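A commonly seen alternative drops the explicit counter: since the first two slots are always kept, comparing each element against the value two slots behind the write position gives the same result in a sorted array. This is a variant sketch, not the article's main solution (the function name is ours):

```python
def remove_duplicates_compact(nums):
    k = 0  # Next write position; nums[:k] is the valid prefix.
    for x in nums:
        # Keep x unless it would be a third consecutive copy:
        # in sorted order, nums[k-2] == x implies nums[k-2] == nums[k-1] == x.
        if k < 2 or x != nums[k - 2]:
            nums[k] = x
            k += 1
    return k
```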

4. Code Deep Dive

Annotated Optimal Solution

def removeDuplicates(nums):
    slow = 0  # Points to the last valid element
    count = 1  # Counts consecutive duplicates for current value
    n = len(nums)
    for fast in range(1, n):
        if nums[fast] == nums[slow]:
            count += 1
        else:
            count = 1  # Reset for new value
        if count <= 2:
            slow += 1
            nums[slow] = nums[fast]
    return slow + 1

Line-by-Line Explanation

  • slow = 0: The first element is always valid.
  • count = 1: The first element has count 1.
  • Loop fast from 1 to n-1: Iterate through each element.
  • if nums[fast] == nums[slow]: Check if current value same as last valid value.
  • count += 1 or count = 1: Update consecutive count.
  • if count <= 2: If valid (at most two duplicates), increment slow and copy value.
  • return slow + 1: Number of valid elements is slow+1.

Edge Case Handling

  • All duplicates [1,1,1,1]:
    fast=1: count=2 → slow=1, copy.
    fast=2: count=3 → skip.
    fast=3: count=4 → skip.
    Returns slow+1=2.
  • Single element [5]: Loop not entered, returns 1.
  • No duplicates [1,2,3]:
    Each fast resets count to 1, so slow increments each time. Returns 3.
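These edge cases can be checked directly against the two-pointer solution (a throwaway harness; the solution is reproduced so the snippet is self-contained, and `check` is our helper name):

```python
def removeDuplicates(nums):
    # Same two-pointer solution as in the annotated section above.
    slow = 0
    count = 1
    for fast in range(1, len(nums)):
        if nums[fast] == nums[slow]:
            count += 1
        else:
            count = 1
        if count <= 2:
            slow += 1
            nums[slow] = nums[fast]
    return slow + 1

def check(nums, expected):
    k = removeDuplicates(nums)
    return k == len(expected) and nums[:k] == expected

# All duplicates, single element, no duplicates.
assert check([1, 1, 1, 1], [1, 1])
assert check([5], [5])
assert check([1, 2, 3], [1, 2, 3])
```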

5. Complexity War Room

Hardware-Aware Analysis

  • The algorithm runs in O(n) time with O(1) space.
  • For n = 3 × 10^4, the loop has 30,000 iterations. On a modern CPU (~10^9 simple operations/sec), this takes roughly 0.03 ms.
  • Memory: Only three integer variables (slow, fast, count), using ~12 bytes. The array itself (30k elements × 4 bytes = 120 KB) fits comfortably in L2 cache.

Industry Comparison Table

Approach                 Time     Space   Readability   Interview Viability
Brute Force              O(n^2)   O(1)    High          ❌ Fails for large n
Hash Map (extra space)   O(n)     O(n)    Medium        ❌ Violates in-place requirement
Two Pointers             O(n)     O(1)    High          ✅ Optimal

6. Pro Mode Extras

Variants

  • Remove duplicates allowing at most m occurrences:
    Generalize the count check to if count <= m.
  • Remove duplicates from an unsorted array:
    Requires O(n) space for hashing, or O(n log n) time to sort first.
  • Remove duplicates so each element appears at most once (LC 26):
    The same technique with m = 1.
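The first variant generalizes directly: the threshold 2 becomes a parameter. A sketch using the compare-m-slots-back formulation (the function name is ours):

```python
def remove_duplicates_at_most_m(nums, m):
    # Keep each value at most m times in a sorted array.
    k = 0  # Next write position; nums[:k] is the valid prefix.
    for x in nums:
        # In sorted order, nums[k-m] == x means m copies of x are already kept.
        if k < m or x != nums[k - m]:
            nums[k] = x
            k += 1
    return k
```

With m = 2 this is exactly LC 80; with m = 1 it solves LC 26.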

Interview Cheat Sheet

  • Mention: “This is a two-pointer problem with state tracking.”
  • Emphasize: “The sorted property is key to tracking duplicates consecutively.”
  • Watch For: Off-by-one errors in pointer updates.
  • Test: All same elements, two duplicates, no duplicates.