Evaluate Division

Difficulty: Medium
Topics: Array, String, Depth-First Search, Breadth-First Search, Union Find, Graph, Shortest Path
Link: https://leetcode.com/problems/evaluate-division/

Problem Description

You are given an array of variable pairs equations and an array of real numbers values, where equations[i] = [Ai, Bi] and values[i] represent the equation Ai / Bi = values[i]. Each Ai or Bi is a string that represents a single variable.

You are also given some queries, where queries[j] = [Cj, Dj] represents the jth query where you must find the answer for Cj / Dj = ?.

Return the answers to all queries. If a single answer cannot be determined, return -1.0.

Note: The input is always valid. You may assume that evaluating the queries will not result in division by zero and that there is no contradiction.

**Note:**The variables that do not occur in the list of equations are undefined, so the answer cannot be determined for them.

Example 1:


Input: equations = [["a","b"],["b","c"]], values = [2.0,3.0], queries = [["a","c"],["b","a"],["a","e"],["a","a"],["x","x"]]
Output: [6.00000,0.50000,-1.00000,1.00000,-1.00000]
Explanation: 
Given: a / b = 2.0, b / c = 3.0
queries are: a / c = ?, b / a = ?, a / e = ?, a / a = ?, x / x = ? 
return: [6.0, 0.5, -1.0, 1.0, -1.0 ]
note: x is undefined => -1.0

Example 2:


Input: equations = [["a","b"],["b","c"],["bc","cd"]], values = [1.5,2.5,5.0], queries = [["a","c"],["c","b"],["bc","cd"],["cd","bc"]]
Output: [3.75000,0.40000,5.00000,0.20000]

Example 3:


Input: equations = [["a","b"]], values = [0.5], queries = [["a","b"],["b","a"],["a","c"],["x","y"]]
Output: [0.50000,2.00000,-1.00000,-1.00000]

Constraints:

1 <= equations.length <= 20
equations[i].length == 2
1 <= Ai.length, Bi.length <= 5
values.length == equations.length
0.0 < values[i] <= 20.0
1 <= queries.length <= 20
queries[i].length == 2
1 <= Cj.length, Dj.length <= 5
Ai, Bi, Cj, Dj consist of lower case English letters and digits.

Solution

1. Problem Deconstruction

Technical Restatement

Given a directed weighted graph representation of variable relationships where:

Nodes represent string variables
Directed edges A → B have weight values[i] representing A/B = values[i]
Implicit reverse edges B → A exist with weight 1/values[i]
Must compute division results for query pairs by finding valid paths and multiplying edge weights
Return -1.0 for undefined variables or disconnected components

Beginner-Friendly Version

Imagine you have a list of conversion rates between different currencies. For example:

“2 dollars = 1 euro” means dollar/euro = 2.0
“3 euros = 1 pound” means euro/pound = 3.0

If someone asks “How many dollars per pound?”, you’d multiply: dollars/euro × euros/pound = 2.0 × 3.0 = 6.0

Some currencies might not be connected (like “yen” not mentioned), so you can’t convert between them.

Mathematical Formulation

Given:

Set of equations: $E = \{(A_i, B_i)\}$ with values $V = \{v_i\}$ where $\frac{A_i}{B_i} = v_i$
Set of queries: $Q = \{(C_j, D_j)\}$

Find: $\frac{C_j}{D_j} = \prod_{k=1}^{m} w_k$ where $w_k$ are edge weights along path $C_j \rightarrow \cdots \rightarrow D_j$

If no path exists or variables undefined: $\frac{C_j}{D_j} = -1$

Constraint Analysis

Constraint	Impact Analysis	Edge Cases
`1 <= equations.length <= 20`	Small enough for exponential algorithms	Minimal equations (1) vs maximum (20)
`equations[i].length == 2`	Consistent bipartite structure	No self-loops in input data
`1 <= Ai.length, Bi.length <= 5`	Short variable names, hash efficiency	Single character vs 5-character names
`values.length == equations.length`	1:1 mapping guaranteed	No missing or extra conversion factors
`0.0 < values[i] <= 20.0`	No zero/negative division issues	Very small (>0) vs large (20.0) values
`1 <= queries.length <= 20`	Query count manageable	Single query optimization possible
`Cj.length, Dj.length <= 5`	Consistent variable naming	Unknown variables easily detectable
Hidden Constraints	Algorithmic Implications	Test Considerations
No division by zero	All denominators valid in queries	Self-query (a/a) always = 1.0
No contradiction	Graph is consistent	All paths between nodes give same result
Variables may be undefined	Must track existence	Queries with undefined variables

2. Intuition Scaffolding

Real-World Metaphor: Currency Exchange Network

Think of variables as different currencies and values as exchange rates. If:

USD/EUR = 0.8 and EUR/GBP = 0.9
Then USD/GBP = USD/EUR × EUR/GBP = 0.8 × 0.9 = 0.72

The graph represents this currency conversion network where finding exchange rates becomes path finding.

Gaming Analogy: Crafting Recipe Progression

In games like Minecraft or Factorio, recipes form dependency chains:

2 Wood → 1 Plank
4 Planks → 1 Crafting Table
Query: “How much Wood per Crafting Table?” = 2 × 4 = 8

Unknown materials represent disconnected components in the crafting graph.

Math Analogy: Matrix of Multiplicative Relationships

Each equation $\frac{A}{B} = v$ creates relationships in a multiplicative group. The problem reduces to solving:

M \cdot \vec{x} = \vec{b}

where M represents the coefficient matrix of the ratio relationships.

Common Pitfalls

Global Min/Max Fallacy: Trying to find “cheapest conversion” rather than exact multiplicative path
Bidirectional Edge Confusion: Forgetting that A/B = v implies B/A = 1/v
Path Existence Assumption: Assuming all variables are connected when they form separate components
Self-Query Oversight: Missing that a/a = 1.0 regardless of equations
Floating Point Precision: Accumulating rounding errors in long multiplication chains
Early Termination: Stopping BFS/DFS at first solution when multiple paths should give same result
Variable Existence Check: Not verifying both query variables exist in the equation set

3. Approach Encyclopedia

Approach 1: Brute Force Path Exploration

Algorithm: BruteForceEvaluateDivision
Input: equations, values, queries
Output: results array

1. Build adjacency list from equations and values
   - For each (A, B, value): add A→B:value and B→A:1/value
   
2. For each query (C, D):
   a. If C or D not in graph: result ← -1.0
   b. Else if C == D: result ← 1.0
   c. Else:
        visited ← set()
        result ← DFS(C, D, 1.0, visited)
        If result not found: return -1.0

3. DFS(current, target, product, visited):
   a. If current == target: return product
   b. Mark current visited
   c. For each neighbor with weight:
        If neighbor not visited:
            result ← DFS(neighbor, target, product × weight, visited)
            If result ≠ -1: return result
   d. Return -1.0

Complexity Proof:

Time: $O(Q \cdot (V + E)^{D})$ $O (Q \cdot (V + E)^{D})$ where D = maximum depth
- Worst case: $O(20 \cdot 40^{20})$ with 20 queries and chain length 20
Space: $O(V + E)$ for graph + $O(D)$ recursion depth

Visualization:

Equations: a/b=2, b/c=3

Graph:
a → b (2.0)    b → a (0.5)
b → c (3.0)    c → b (0.333)

Query a/c:
Start: a (product=1.0)
  → b (1.0×2.0=2.0)
    → c (2.0×3.0=6.0) ✓

Approach 2: BFS with Path Tracking

Algorithm: BFSEvaluateDivision
Input: equations, values, queries
Output: results array

1. Build graph as adjacency list
2. For each query (C, D):
   a. If C or D not in graph: -1.0
   b. If C == D: 1.0
   c. Else:
        queue ← [(C, 1.0)], visited ← {C}
        While queue not empty:
            (node, product) ← dequeue
            If node == D: return product
            For each neighbor with weight:
                If neighbor not visited:
                    visited.add(neighbor)
                    enqueue(neighbor, product × weight)
        Return -1.0

Complexity Proof:

Time: $O(Q \cdot (V + E))$ = $O(20 \cdot 40)$ = $O(800)$ worst case
Space: $O(V + E)$ for graph + $O(V)$ for BFS queue

Visualization:

Query a/c:
Queue: [(a,1.0)]
Process a: neighbors b(2.0) → Queue: [(b,2.0)]
Process b: neighbors a(0.5), c(3.0)
  Skip a (visited), Add c → Queue: [(c,6.0)]
Process c: Found target! Return 6.0

Approach 3: Floyd-Warshall All-Pairs (Optimal)

Algorithm: FloydWarshallEvaluateDivision
Input: equations, values, queries  
Output: results array

1. Collect all unique variables → list nodes
2. Create distance matrix dist[n][n] with:
   - dist[i][i] = 1.0
   - dist[i][j] = known value if edge i→j exists
   - Else dist[i][j] = -1.0 (unknown)
   
3. For k from 0 to n-1:
      For i from 0 to n-1:
          For j from 0 to n-1:
              If dist[i][k] ≠ -1 and dist[k][j] ≠ -1:
                 If dist[i][j] == -1:
                    dist[i][j] = dist[i][k] × dist[k][j]

4. For each query (C, D):
      i = indexOf(C), j = indexOf(D)
      Return dist[i][j] if both exist, else -1.0

Complexity Proof:

Time: $O(V^3 + Q)$ where V ≤ 40 → $O(40^3 + 20)$ = $O(64020)$
Space: $O(V^2)$ = $O(1600)$ for distance matrix

Mathematical Derivation: Floyd-Warshall works because division relationships are transitive: If $\frac{A}{B} = x$ and $\frac{B}{C} = y$ then $\frac{A}{C} = x \times y$

The algorithm computes: $dist[i][j] = \min_k(dist[i][k] \times dist[k][j])$ but since all paths give same result (no contradictions), it’s just path finding.

4. Code Deep Dive

def calcEquation(equations, values, queries):
    """
    Optimal Solution using Floyd-Warshall All-Pairs Shortest Path
    Time: O(V^3 + Q), Space: O(V^2) where V = unique variables
    """
    # Step 1: Collect all variables and create index mapping
    variables = set()
    for A, B in equations:
        variables.add(A)
        variables.add(B)
    
    n = len(variables)
    node_to_idx = {node: i for i, node in enumerate(variables)}
    
    # Step 2: Initialize distance matrix
    dist = [[-1.0] * n for _ in range(n)]
    
    # Self-relationships: a/a = 1.0
    for i in range(n):
        dist[i][i] = 1.0
    
    # Step 3: Populate direct relationships from equations
    for (A, B), value in zip(equations, values):
        i, j = node_to_idx[A], node_to_idx[B]
        dist[i][j] = value       # A/B = value
        dist[j][i] = 1.0 / value # B/A = 1/value
    
    # Step 4: Floyd-Warshall to compute all pairs
    for k in range(n):          # Intermediate node
        for i in range(n):      # Source node  
            for j in range(n):  # Target node
                # If path i→k and k→j exists, compute i→j
                if dist[i][k] != -1.0 and dist[k][j] != -1.0:
                    if dist[i][j] == -1.0:
                        dist[i][j] = dist[i][k] * dist[k][j]
    
    # Step 5: Process queries
    results = []
    for C, D in queries:
        if C not in node_to_idx or D not in node_to_idx:
            results.append(-1.0)  # Undefined variable
        else:
            i, j = node_to_idx[C], node_to_idx[D]
            results.append(dist[i][j])
    
    return results

Line-by-Line Analysis:

Line	Purpose	Edge Case Handling
5-9	Collect unique variables	Handles duplicate variables automatically
12-14	Initialize matrix	Sets unknown relationships to -1.0
17-18	Self-relationships	Ensures a/a = 1.0 for all variables
21-24	Direct edges	Creates bidirectional edges A/B and B/A
27-32	Floyd-Warshall	Only updates unknown paths, preserves known
37-41	Query processing	Checks variable existence before lookup

Edge Case Mapping:

Example 1: [["a","a"],["x","x"]] → Handled by existence check (line 38)
Example 2: [["bc","cd"]] → Multi-character names handled naturally
Example 3: [["a","b"],["b","a"]] → Bidirectional edges ensure consistency

5. Complexity War Room

Hardware-Aware Analysis

Memory Usage: 40 variables → 40×40×8 bytes = 12.8KB (fits in L1 cache)
Cache Performance: Floyd-Warshall has poor locality but small dataset
Parallel Potential: Inner loops parallelizable for larger datasets

Industry Comparison Table

Approach	Time Complexity	Space Complexity	Readability	Interview Viability
Brute Force DFS	$O(Q \cdot (V+E)^D)$	$O(V+E)$	9/10	❌ Exponential worst-case
BFS Path Finding	$O(Q \cdot (V+E))$	$O(V+E)$	8/10	✅ Good for clarity
Union-Find with Weights	$O((E+Q) \cdot \alpha(V))$	$O(V)$	6/10	✅ Optimal but complex
Floyd-Warshall	$O(V^3 + Q)$	$O(V^2)$	7/10	✅ Recommended

Theoretical Bounds

Lower Bound: $\Omega(V^2)$ must store all pairs relationships
Upper Bound: $O(V^3)$ for Floyd-Warshall is optimal for dense graphs
Practical Limit: With V=40, easily handles maximum input size

6. Pro Mode Extras

Algorithm Variants

Variant 1: Multiple Transactions (Weighted Union-Find)

# For problems with frequent updates and queries
parent = {}; weight = {}

def find(x):
    if x not in parent:
        parent[x] = x; weight[x] = 1.0
        return x
    if parent[x] != x:
        root = find(parent[x])
        weight[x] *= weight[parent[x]]
        parent[x] = root
    return parent[x]

def union(x, y, value):
    rootX, rootY = find(x), find(y)
    if rootX != rootY:
        parent[rootX] = rootY
        weight[rootX] = weight[y] * value / weight[x]

Variant 2: Online Queries with Caching

# For streaming queries with memoization
cache = {}
def query(C, D):
    if (C, D) in cache: return cache[(C, D)]
    # Compute and cache result
    # Invalidate cache on new equations

Interview Cheat Sheet

Essential Talking Points:

“This is essentially a graph path finding problem with multiplicative weights”
“Floyd-Warshall gives us O(1) queries after O(V³) preprocessing”
“We must handle bidirectional relationships and disconnected components”
“The no-contradiction guarantee means all paths give the same result”

Common Follow-ups:

“How would you handle contradictions in the equations?”
“What if we could add new equations dynamically?”
“How to minimize floating point error accumulation?”

Optimization Priority:

✅ Check variable existence first
✅ Handle self-queries (a/a = 1.0)
✅ Use Floyd-Warshall for small V
✅ Consider Union-Find for dynamic updates

Red Flags to Avoid:

❌ Not checking for undefined variables
❌ Forgetting bidirectional edges
❌ Using DFS without cycle prevention
❌ Not handling self-query case

#399 - Evaluate Division

Evaluate Division

Problem Description

Solution

1. Problem Deconstruction

Technical Restatement

Beginner-Friendly Version

Mathematical Formulation

Constraint Analysis

2. Intuition Scaffolding

Real-World Metaphor: Currency Exchange Network

Gaming Analogy: Crafting Recipe Progression

Math Analogy: Matrix of Multiplicative Relationships

Common Pitfalls

3. Approach Encyclopedia

Approach 1: Brute Force Path Exploration

Approach 2: BFS with Path Tracking

Approach 3: Floyd-Warshall All-Pairs (Optimal)

4. Code Deep Dive

5. Complexity War Room

Hardware-Aware Analysis

Industry Comparison Table

Theoretical Bounds

6. Pro Mode Extras

Algorithm Variants

Interview Cheat Sheet