← ~/lc-grind/posts

#435 - Non-overlapping Intervals

June 16, 2025

Non-overlapping Intervals

Problem Description

Given an array of intervals intervals where intervals[i] = [starti, endi], return the minimum number of intervals you need to remove to make the rest of the intervals non-overlapping.

Note that intervals which only touch at a point are non-overlapping. For example, [1, 2] and [2, 3] are non-overlapping.

Example 1:


Input: intervals = [[1,2],[2,3],[3,4],[1,3]]
Output: 1
Explanation: [1,3] can be removed and the rest of the intervals are non-overlapping.

Example 2:


Input: intervals = [[1,2],[1,2],[1,2]]
Output: 2
Explanation: You need to remove two [1,2] to make the rest of the intervals non-overlapping.

Example 3:


Input: intervals = [[1,2],[2,3]]
Output: 0
Explanation: You don't need to remove any of the intervals since they're already non-overlapping.

Constraints:

  • 1 <= intervals.length <= 105
  • intervals[i].length == 2
  • -5 * 104 <= starti < endi <= 5 * 104

Solution

1. Problem Deconstruction

Technical Restatement
Given an array intervals where each element is a closed interval [start_i, end_i], determine the minimum number of interval removals required such that the remaining intervals are pairwise non-overlapping. Two intervals overlap if their intersection is non-empty, but intervals sharing only an endpoint (e.g., [1,2] and [2,3]) are non-overlapping. The solution must efficiently compute this for large inputs (up to 10510^5 intervals).

Beginner-Friendly Explanation
Imagine you have multiple time slots (e.g., meetings, events) that might overlap. Your goal is to cancel as few events as possible so that none of the remaining events overlap in time. Events that end exactly when another starts (like 2 PM to 3 PM followed by 3 PM to 4 PM) are allowed.

Mathematical Formulation
Let S={I1,I2,,In}S = \{I_1, I_2, \ldots, I_n\} be a set of intervals where Ii=[si,ei]I_i = [s_i, e_i] and si<eis_i < e_i. Find the largest subset TST \subseteq S such that for any distinct pair Ij,IkTI_j, I_k \in T:

ejskoreksje_j \leq s_k \quad \text{or} \quad e_k \leq s_j

The solution is nTn - |T|.

Constraint Analysis

  • 1intervals.length1051 \leq \text{intervals.length} \leq 10^5:
    • Rules out O(n2)O(n^2) algorithms (e.g., nested loops would process $\sim$10$^{10}$ operations, infeasible at scale).
    • Requires O(nlogn)O(n \log n) or better time complexity.
  • intervals[i].length=2\text{intervals[i].length} = 2:
    • Ensures uniform input structure; no need to handle jagged arrays.
  • 5×104starti<endi5×104-5 \times 10^4 \leq \text{start}_i < \text{end}_i \leq 5 \times 10^4:
    • Negative values don’t affect correctness but require signed integer handling.
    • Fixed value range implies sorting is feasible (O(nlogn)O(n \log n) dominates value-based constraints).
  • Edge Cases Implied:
    • All intervals identical (e.g., [[1,2],[1,2],[1,2]] → remove n1n-1 intervals).
    • Intervals fully contained within others (e.g., [[1,5],[2,3]] → remove the longer one for optimality).
    • Disjoint intervals (e.g., [[1,2],[3,4]] → no removal needed).

2. Intuition Scaffolding

Analogies

  1. Real-World Metaphor (Conference Room Booking):
    You manage a conference center. Each interval is a booking request. Maximize room utilization (number of bookings) by rejecting the fewest requests. Greedily accept bookings ending earliest to leave more time for others.
  2. Gaming Analogy (Timed Power-Ups):
    In a game, power-ups appear during intervals [si,ei][s_i, e_i]. Collecting one occupies you until eie_i. Maximize power-ups collected by always picking the one expiring soonest.
  3. Math Analogy (Set Theory):
    Maximize a chain of intervals Ii1,Ii2,I_{i_1}, I_{i_2}, \ldots where eiksik+1e_{i_k} \leq s_{i_{k+1}}. This is isomorphic to the “activity selection problem.”

Common Pitfalls

  1. Sorting by start time:
    Leads to suboptimal picks (e.g., [[1,10],[2,3],[3,4]] → keep [1,10], miss [2,3],[3,4]).
  2. Removing longest intervals first:
    Counterexample: [[1,4],[2,3],[3,4]] → removing [1,4] allows keeping two; removing [2,3] forces two removals.
  3. Ignoring endpoint equality:
    Mistakenly using >> instead of \geq for overlap checks breaks valid cases (e.g., [1,2],[2,3]).
  4. Counting overlaps directly:
    Overlap density ≠ minimal removals (e.g., three intervals overlapping at one point require two removals, not one).
  5. Dynamic programming with O(n2)O(n^2):
    Sorting by start and using DP for longest chain fails at n=105n=10^5 (quadratic time).

3. Approach Encyclopedia

Brute Force (Theoretical, Infeasible)

  • Idea: Enumerate all 2n2^n subsets, check if non-overlapping, track largest subset size.
  • Pseudocode:
    max_non_overlap = 0  
    for subset in all_subsets(intervals):  
        if is_non_overlapping(subset):  
            max_non_overlap = max(max_non_overlap, len(subset))  
    return n - max_non_overlap  
    
  • Complexity:
    • Time: O(2nn2)O(2^n \cdot n^2) (subset generation + pairwise overlap checks).
    • Space: O(n)O(n) for recursion/tracking.
  • Visualization (for [[1,2],[2,3]]):
    Subsets:  
      [] → valid → size=0  
      [[1,2]] → valid → size=1  
      [[2,3]] → valid → size=1  
      [[1,2],[2,3]] → valid → size=2  // Maximal  
    Removals = 2 - 2 = 0  
    

Greedy (Optimal)

  • Idea: Sort intervals by end time. Traverse sorted list, select interval if non-overlapping with last selected.
  • Why it Works: Earliest-end intervals maximize remaining time for future selections.
  • Pseudocode:
    sort intervals by end_i           # O(n log n)  
    count = 1                         # Keep first interval  
    last_end = intervals[0][1]  
    for i in range(1, n):  
        start_i, end_i = intervals[i]  
        if start_i >= last_end:        # Non-overlapping?  
            count += 1  
            last_end = end_i  
    return n - count                  # Removals needed  
    
  • Complexity Proof:
    • Sorting: O(nlogn)O(n \log n) (dominant term).
    • Single traversal: O(n)O(n).
    • Total time: O(nlogn)O(n \log n). Space: O(1)O(1) (in-place sort) or O(n)O(n) (if not).
  • Visualization (for [[1,3],[2,3],[3,4]] sorted to [[1,3],[2,3],[3,4]]):
    Step 0: last_end = -∞  
    [1,3]: 1 >= -∞ → keep, count=1, last_end=3  
    [2,3]: 2 < 3 → skip  
    [3,4]: 3 >= 3 → keep, count=2, last_end=4  
    Removals = 3 - 2 = 1  
    
    ASCII Diagram:
    Original: [1-----3]  [2-3]    [3---4]  
    Sorted:   [1-----3], [2-3], [3---4]  
    Kept:     [1-----3]           [3---4]  
    

4. Code Deep Dive

Optimal Solution (Python)

def eraseOverlapIntervals(intervals):  
    if not intervals:  
        return 0  
    intervals.sort(key=lambda x: x[1])  # Sort by end time  
    count = 1                           # First interval always kept  
    last_end = intervals[0][1]  
    for i in range(1, len(intervals)):  
        start, end = intervals[i]  
        if start >= last_end:            # Non-overlapping condition  
            count += 1  
            last_end = end  
    return len(intervals) - count       # Removals = total - kept  

Line-by-Line Analysis

  • Line 2: Handle empty input (constraints ensure n1n \geq 1, but defensive).
  • Line 3: Sort by endpoint. Why? Ensures we consider intervals in order of increasing end time.
  • Line 4: Initialize count to 1. Why? At least one interval can be kept.
  • Line 5: last_end tracks the endpoint of the last kept interval.
  • Lines 6-10: Loop through remaining intervals.
    • Line 8: Check if current interval starts after or at last_end. Why? Touching endpoints are allowed.
    • Lines 9-10: Update count and last_end if kept.
  • Line 11: Compute removals as total intervals minus kept count.

Edge Case Handling

  • Example 1 ([[1,2],[2,3],[3,4],[1,3]]):
    Sorted: [[1,2],[2,3],[1,3],[3,4]] (by end: 2,3,3,4).
    Keep [1,2] (count=1, last_end=2) → [2,3] (2≥2, count=2, last_end=3) → skip [1,3] (1<3) → keep [3,4] (3≥3). Removals=4-3=1.
  • Example 2 ([[1,2],[1,2],[1,2]]):
    Sorted: all end at 2. Keep first (count=1, last_end=2) → skip others (1<2). Removals=3-1=2.
  • Contained Intervals ([[1,5],[2,3],[3,4]]):
    Sorted: [[2,3],[3,4],[1,5]] (ends 3,4,5). Keep [2,3] → keep [3,4] (3≥3) → skip [1,5] (1<4). Removals=3-2=1.

5. Complexity War Room

Hardware-Aware Analysis

  • At n=105n=10^5:
    • Sorting: O(nlogn)105×17=1.7×106O(n \log n) \approx 10^5 \times 17 = 1.7 \times 10^6 operations (feasible in <100ms).
    • Memory: \sim 200 KB for intervals (each [start,end][start,end] = 8 bytes; 105×810^5 \times 8 bytes = 800 KB, fits in L2/L3 cache).

Approach Comparison

Approach Time Space Readability Interview Viability
Brute Force O(2nn2)O(2^n \cdot n^2) O(n)O(n) 8/10 ❌ Fails n>20n>20
DP w/ Sorting O(n2)O(n^2) O(n)O(n) 6/10 ❌ Fails n=105n=10^5
Greedy O(nlogn)O(n \log n) O(1)O(1) 9/10 ✅ Optimal

6. Pro Mode Extras

Variants

  1. Weighted Intervals (LC 1235):
    Each interval has a weight. Maximize total weight without overlaps.
    Solution: DP + binary search (after sorting), O(nlogn)O(n \log n).
  2. Multiple Rooms (LC 253):
    Find minimum rooms to host all meetings.
    Solution: Min-heap for end times or “chronological ordering”.
  3. Fixed Removal Locations:
    Return which intervals to remove.
    Solution: Track kept intervals, return complement set.

Interview Cheat Sheet

  • First Mention: “This is the classic interval scheduling problem solvable by greedy.”
  • Key Insight: “Sort by end time, then traverse selecting non-overlapping intervals.”
  • Complexity: Always state time (O(nlogn)O(n \log n)) and space (O(1)O(1)) upfront.
  • Edge Tests: Mention identical intervals and contained intervals.
  • Why Greedy?: “Earliest end maximizes remaining time—mathematically provable by exchange argument.”