#169 - Majority Element
Majority Element
- Difficulty: Easy
- Topics: Array, Hash Table, Divide and Conquer, Sorting, Counting
- Link: https://leetcode.com/problems/majority-element/
Problem Description
Given an array nums
of size n
, return the majority element.
The majority element is the element that appears more than ⌊n / 2⌋
times. You may assume that the majority element always exists in the array.
Example 1:
Input: nums = [3,2,3]
Output: 3
Example 2:
Input: nums = [2,2,1,1,1,2,2]
Output: 2
Constraints:
n == nums.length
1 <= n <= 5 * 104
-109 <= nums[i] <= 109
Follow-up: Could you solve the problem in linear time and in O(1)
space?
Solution
1. Problem Deconstruction
Technical Restatement:
Given an array nums
of length n
, identify the element x
such that the frequency of x
in nums
strictly exceeds . The solution must efficiently determine this element under the guarantee of its existence, optimizing for time and space complexity.
Beginner-Friendly Explanation:
Imagine you have a list of numbers. One number appears in the list more than half the time. Your task is to find that number. For example, if the list has 7 numbers, the winning number must appear at least 4 times. The challenge is to find it as quickly as possible without using too much computer memory.
Mathematical Formulation:
Let be a multiset of size , and let denote the frequency of element in . The majority element satisfies:
This can be equivalently expressed as:
where is the indicator function.
Constraint Analysis:
n == nums.length
: The input size is explicitly defined, allowing precise complexity calculations.1 <= n <= 5 * 10^4
:- Limitation: Rules out O(n²) solutions (e.g., brute force checking all pairs would require ~1.25e9 operations worst-case, which is computationally infeasible).
- Edge Case: Arrays of size 1 must return the single element immediately.
-10^9 <= nums[i] <= 10^9
:- Limitation: Values can be very large or negative, making value-based indexing (e.g., using the number as a direct array index) impossible. Hash maps must be used instead.
- Edge Case: Negative numbers must be handled correctly in hash functions and comparisons.
2. Intuition Scaffolding
Analogies:
-
Real-World Metaphor (Election):
Think of the array as votes in an election. The majority element is the candidate who receives more than half the votes. The Boyer-Moore algorithm is like a ballot counter who notes a candidate’s lead, canceling out opposing votes pairwise. -
Gaming Analogy (Team Selection):
In a game where players choose factions, the majority faction is the one with over half the players. You can simulate a “battle” where two players from different factions cancel each other. The last faction standing is the majority. -
Math Analogy (Frequency Dominance):
The majority element’s frequency minus the sum of all other frequencies is at least 1. This imbalance allows a linear cancellation process to correctly identify the element.
Common Pitfalls:
- Sorting Misinterpretation: Assuming the median is always the majority element. While true, sorting is O(n log n), which is suboptimal.
- Hash Map Overcomplication: Using a hash map counts all frequencies, which is O(n) time but also O(n) space, missing the O(1) space challenge.
- Partial Cancellation Errors: Incorrectly implementing the cancellation logic by not resetting the count properly when candidates change.
- Unverified Candidates: In variants where the majority might not exist, failing to verify the candidate after the cancellation phase.
- Global Frequency Assumption: Assuming the entire array must be processed to identify the majority, whereas local tracking suffices.
3. Approach Encyclopedia
1. Brute Force (Check Every Element)
- Logic: For each element, scan the entire array to count its occurrences. Return the first element whose count exceeds n/2.
- Pseudocode:
for i from 0 to n-1: count = 0 for j from 0 to n-1: if nums[j] == nums[i]: count += 1 if count > n/2: return nums[i]
- Complexity: Time: O(n²) due to nested loops. Space: O(1).
- Visualization:
nums: [3, 2, 3] Check 3: count=2 → 2 > 1.5? ✓ → return 3
2. Sorting and Median Selection
- Logic: After sorting, the majority element must occupy the middle position (and potentially adjacent positions) due to its frequency dominance.
- Pseudocode:
sort(nums) return nums[n // 2] # Integer division for 0-indexing
- Complexity: Time: O(n log n) dominated by sorting. Space: O(1) if sorted in-place.
- Visualization:
Sorted: [2, 3, 3] → median at index 1 is 3
3. Hash Map Frequency Counting
- Logic: Traverse the array, counting occurrences of each element in a hash map. Then, find the key with count > n/2.
- Pseudocode:
count_map = {} for num in nums: count_map[num] = count_map.get(num, 0) + 1 if count_map[num] > n/2: return num
- Complexity: Time: O(n) for single pass. Space: O(n) for storing counts.
- Visualization:
nums: [2,2,1,1,1,2,2] Count: {2:4, 1:3} → return 2 when count reaches 4
4. Boyer-Moore Voting Algorithm (Optimal)
- Logic: Maintain a candidate and a count. Traverse the array: if count is 0, set current element as candidate. If current element equals candidate, increment count; else decrement. The majority element survives the cancellation.
- Pseudocode:
candidate = None count = 0 for num in nums: if count == 0: candidate = num count += (1 if num == candidate else -1) return candidate
- Complexity: Time: O(n) with one pass. Space: O(1).
- Visualization:
nums: [2,2,1,1,1,2,2] Step: [2] (candidate=2, count=1) [2,2] (candidate=2, count=2) [2,2,1] (candidate=2, count=1) [2,2,1,1] (candidate=2, count=0) [2,2,1,1,1] (candidate=1, count=1) [2,2,1,1,1,2] (candidate=1, count=0) [2,2,1,1,1,2,2] (candidate=2, count=1) Return 2
4. Code Deep Dive
Boyer-Moore Implementation (Python):
def majorityElement(nums):
candidate = None
count = 0
for num in nums:
if count == 0:
candidate = num
count += (1 if num == candidate else -1)
return candidate
Line-by-Line Annotations:
candidate = None
: Initializes the potential majority element.count = 0
: Tracks the net advantage of the candidate over other elements.for num in nums
: Iterates through each element exactly once.if count == 0:
: If the net advantage is zero, the current segment has no majority, so reassign candidate.count += (1 if num == candidate else -1)
: Adjusts the net advantage—increment for matches, decrement for mismatches.return candidate
: The algorithm ensures the majority element survives all cancellations.
Edge Case Handling:
- Example 1 (
[3,2,3]
):- Step 1: num=3, count=0 → candidate=3, count=1
- Step 2: num=2, count=1 → count=0 (cancel 3 and 2)
- Step 3: num=3, count=0 → candidate=3, count=1 → return 3
- Example 2 (
[2,2,1,1,1,2,2]
):- The visualization above shows how the candidate changes but ultimately corrects to 2.
- Single Element (
[5]
):- candidate=5, count=1 → returns 5 correctly.
5. Complexity War Room
Hardware-Aware Analysis:
- The Boyer-Moore algorithm uses two integer variables (
candidate
andcount
), consuming ~16 bytes on a 64-bit system. - For n=50,000, the loop has 50,000 iterations, which fits comfortably in L1/L2 cache due to sequential access, ensuring high performance.
Industry Comparison Table:
Approach | Time | Space | Readability | Interview Viability |
---|---|---|---|---|
Brute Force | O(n²) | O(1) | 10/10 | ❌ Fails large n |
Sorting | O(n log n) | O(1) | 9/10 | ✅ Acceptable |
Hash Map | O(n) | O(n) | 10/10 | ✅ Good |
Boyer-Moore | O(n) | O(1) | 7/10 | ✅ Optimal |
6. Pro Mode Extras
Variants:
- Majority Element II (LC 229): Find all elements appearing more than n/3 times.
- Solution: Extend Boyer-Moore with two candidates and two counters.
- Non-Guaranteed Majority: Verify the candidate by a second pass to count occurrences if existence isn’t guaranteed.
Interview Cheat Sheet:
- First Mention: Always state time and space complexity upfront.
- Boyer-Moore Insight: Describe it as “simulating pairwise cancellation between majority and non-majority elements.”
- Validation: Even if the problem assumes existence, note that in real-world scenarios, verification is needed.
- Trade-offs: Hash maps are more intuitive and acceptable unless O(1) space is explicitly required.