← ~/lc-grind/posts

#399 - Evaluate Division

November 26, 2025

Evaluate Division

Problem Description

You are given an array of variable pairs equations and an array of real numbers values, where equations[i] = [Ai, Bi] and values[i] represent the equation Ai / Bi = values[i]. Each Ai or Bi is a string that represents a single variable.

You are also given some queries, where queries[j] = [Cj, Dj] represents the jth query where you must find the answer for Cj / Dj = ?.

Return the answers to all queries. If a single answer cannot be determined, return -1.0.

Note: The input is always valid. You may assume that evaluating the queries will not result in division by zero and that there is no contradiction.

**Note:**The variables that do not occur in the list of equations are undefined, so the answer cannot be determined for them.

Example 1:


Input: equations = [["a","b"],["b","c"]], values = [2.0,3.0], queries = [["a","c"],["b","a"],["a","e"],["a","a"],["x","x"]]
Output: [6.00000,0.50000,-1.00000,1.00000,-1.00000]
Explanation: 
Given: a / b = 2.0, b / c = 3.0
queries are: a / c = ?, b / a = ?, a / e = ?, a / a = ?, x / x = ? 
return: [6.0, 0.5, -1.0, 1.0, -1.0 ]
note: x is undefined => -1.0

Example 2:


Input: equations = [["a","b"],["b","c"],["bc","cd"]], values = [1.5,2.5,5.0], queries = [["a","c"],["c","b"],["bc","cd"],["cd","bc"]]
Output: [3.75000,0.40000,5.00000,0.20000]

Example 3:


Input: equations = [["a","b"]], values = [0.5], queries = [["a","b"],["b","a"],["a","c"],["x","y"]]
Output: [0.50000,2.00000,-1.00000,-1.00000]

Constraints:

  • 1 <= equations.length <= 20
  • equations[i].length == 2
  • 1 <= Ai.length, Bi.length <= 5
  • values.length == equations.length
  • 0.0 < values[i] <= 20.0
  • 1 <= queries.length <= 20
  • queries[i].length == 2
  • 1 <= Cj.length, Dj.length <= 5
  • Ai, Bi, Cj, Dj consist of lower case English letters and digits.

Solution

1. Problem Deconstruction

Technical Restatement

Given a directed weighted graph representation of variable relationships where:

  • Nodes represent string variables
  • Directed edges A → B have weight values[i] representing A/B = values[i]
  • Implicit reverse edges B → A exist with weight 1/values[i]
  • Must compute division results for query pairs by finding valid paths and multiplying edge weights
  • Return -1.0 for undefined variables or disconnected components

Beginner-Friendly Version

Imagine you have a list of conversion rates between different currencies. For example:

  • “2 dollars = 1 euro” means dollar/euro = 2.0
  • “3 euros = 1 pound” means euro/pound = 3.0

If someone asks “How many dollars per pound?”, you’d multiply: dollars/euro × euros/pound = 2.0 × 3.0 = 6.0

Some currencies might not be connected (like “yen” not mentioned), so you can’t convert between them.

Mathematical Formulation

Given:

  • Set of equations: E={(Ai,Bi)}E = \{(A_i, B_i)\} with values V={vi}V = \{v_i\} where AiBi=vi\frac{A_i}{B_i} = v_i
  • Set of queries: Q={(Cj,Dj)}Q = \{(C_j, D_j)\}

Find: CjDj=k=1mwk\frac{C_j}{D_j} = \prod_{k=1}^{m} w_k where wkw_k are edge weights along path CjDjC_j \rightarrow \cdots \rightarrow D_j

If no path exists or variables undefined: CjDj=1\frac{C_j}{D_j} = -1

Constraint Analysis

Constraint Impact Analysis Edge Cases
1 <= equations.length <= 20 Small enough for exponential algorithms Minimal equations (1) vs maximum (20)
equations[i].length == 2 Consistent bipartite structure No self-loops in input data
1 <= Ai.length, Bi.length <= 5 Short variable names, hash efficiency Single character vs 5-character names
values.length == equations.length 1:1 mapping guaranteed No missing or extra conversion factors
0.0 < values[i] <= 20.0 No zero/negative division issues Very small (>0) vs large (20.0) values
1 <= queries.length <= 20 Query count manageable Single query optimization possible
Cj.length, Dj.length <= 5 Consistent variable naming Unknown variables easily detectable
Hidden Constraints Algorithmic Implications Test Considerations
No division by zero All denominators valid in queries Self-query (a/a) always = 1.0
No contradiction Graph is consistent All paths between nodes give same result
Variables may be undefined Must track existence Queries with undefined variables

2. Intuition Scaffolding

Real-World Metaphor: Currency Exchange Network

Think of variables as different currencies and values as exchange rates. If:

  • USD/EUR = 0.8 and EUR/GBP = 0.9
  • Then USD/GBP = USD/EUR × EUR/GBP = 0.8 × 0.9 = 0.72

The graph represents this currency conversion network where finding exchange rates becomes path finding.

Gaming Analogy: Crafting Recipe Progression

In games like Minecraft or Factorio, recipes form dependency chains:

  • 2 Wood → 1 Plank
  • 4 Planks → 1 Crafting Table
  • Query: “How much Wood per Crafting Table?” = 2 × 4 = 8

Unknown materials represent disconnected components in the crafting graph.

Math Analogy: Matrix of Multiplicative Relationships

Each equation AB=v\frac{A}{B} = v creates relationships in a multiplicative group. The problem reduces to solving:

Mx=bM \cdot \vec{x} = \vec{b}

where M represents the coefficient matrix of the ratio relationships.

Common Pitfalls

  1. Global Min/Max Fallacy: Trying to find “cheapest conversion” rather than exact multiplicative path
  2. Bidirectional Edge Confusion: Forgetting that A/B = v implies B/A = 1/v
  3. Path Existence Assumption: Assuming all variables are connected when they form separate components
  4. Self-Query Oversight: Missing that a/a = 1.0 regardless of equations
  5. Floating Point Precision: Accumulating rounding errors in long multiplication chains
  6. Early Termination: Stopping BFS/DFS at first solution when multiple paths should give same result
  7. Variable Existence Check: Not verifying both query variables exist in the equation set

3. Approach Encyclopedia

Approach 1: Brute Force Path Exploration

Algorithm: BruteForceEvaluateDivision
Input: equations, values, queries
Output: results array

1. Build adjacency list from equations and values
   - For each (A, B, value): add A→B:value and B→A:1/value
   
2. For each query (C, D):
   a. If C or D not in graph: result ← -1.0
   b. Else if C == D: result ← 1.0
   c. Else:
        visited ← set()
        result ← DFS(C, D, 1.0, visited)
        If result not found: return -1.0

3. DFS(current, target, product, visited):
   a. If current == target: return product
   b. Mark current visited
   c. For each neighbor with weight:
        If neighbor not visited:
            result ← DFS(neighbor, target, product × weight, visited)
            If result ≠ -1: return result
   d. Return -1.0

Complexity Proof:

  • Time: O(Q(V+E)D)O(Q \cdot (V + E)^{D}) where D = maximum depth
    • Worst case: O(204020)O(20 \cdot 40^{20}) with 20 queries and chain length 20
  • Space: O(V+E)O(V + E) for graph + O(D)O(D) recursion depth

Visualization:

Equations: a/b=2, b/c=3

Graph:
a → b (2.0)    b → a (0.5)
b → c (3.0)    c → b (0.333)

Query a/c:
Start: a (product=1.0)
  → b (1.0×2.0=2.0)
    → c (2.0×3.0=6.0) ✓

Approach 2: BFS with Path Tracking

Algorithm: BFSEvaluateDivision
Input: equations, values, queries
Output: results array

1. Build graph as adjacency list
2. For each query (C, D):
   a. If C or D not in graph: -1.0
   b. If C == D: 1.0
   c. Else:
        queue ← [(C, 1.0)], visited ← {C}
        While queue not empty:
            (node, product) ← dequeue
            If node == D: return product
            For each neighbor with weight:
                If neighbor not visited:
                    visited.add(neighbor)
                    enqueue(neighbor, product × weight)
        Return -1.0

Complexity Proof:

  • Time: O(Q(V+E))O(Q \cdot (V + E)) = O(2040)O(20 \cdot 40) = O(800)O(800) worst case
  • Space: O(V+E)O(V + E) for graph + O(V)O(V) for BFS queue

Visualization:

Query a/c:
Queue: [(a,1.0)]
Process a: neighbors b(2.0) → Queue: [(b,2.0)]
Process b: neighbors a(0.5), c(3.0)
  Skip a (visited), Add c → Queue: [(c,6.0)]
Process c: Found target! Return 6.0

Approach 3: Floyd-Warshall All-Pairs (Optimal)

Algorithm: FloydWarshallEvaluateDivision
Input: equations, values, queries  
Output: results array

1. Collect all unique variables → list nodes
2. Create distance matrix dist[n][n] with:
   - dist[i][i] = 1.0
   - dist[i][j] = known value if edge i→j exists
   - Else dist[i][j] = -1.0 (unknown)
   
3. For k from 0 to n-1:
      For i from 0 to n-1:
          For j from 0 to n-1:
              If dist[i][k] ≠ -1 and dist[k][j] ≠ -1:
                 If dist[i][j] == -1:
                    dist[i][j] = dist[i][k] × dist[k][j]

4. For each query (C, D):
      i = indexOf(C), j = indexOf(D)
      Return dist[i][j] if both exist, else -1.0

Complexity Proof:

  • Time: O(V3+Q)O(V^3 + Q) where V ≤ 40 → O(403+20)O(40^3 + 20) = O(64020)O(64020)
  • Space: O(V2)O(V^2) = O(1600)O(1600) for distance matrix

Mathematical Derivation: Floyd-Warshall works because division relationships are transitive: If AB=x\frac{A}{B} = x and BC=y\frac{B}{C} = y then AC=x×y\frac{A}{C} = x \times y

The algorithm computes: dist[i][j]=mink(dist[i][k]×dist[k][j])dist[i][j] = \min_k(dist[i][k] \times dist[k][j]) but since all paths give same result (no contradictions), it’s just path finding.

4. Code Deep Dive

def calcEquation(equations, values, queries):
    """
    Optimal Solution using Floyd-Warshall All-Pairs Shortest Path
    Time: O(V^3 + Q), Space: O(V^2) where V = unique variables
    """
    # Step 1: Collect all variables and create index mapping
    variables = set()
    for A, B in equations:
        variables.add(A)
        variables.add(B)
    
    n = len(variables)
    node_to_idx = {node: i for i, node in enumerate(variables)}
    
    # Step 2: Initialize distance matrix
    dist = [[-1.0] * n for _ in range(n)]
    
    # Self-relationships: a/a = 1.0
    for i in range(n):
        dist[i][i] = 1.0
    
    # Step 3: Populate direct relationships from equations
    for (A, B), value in zip(equations, values):
        i, j = node_to_idx[A], node_to_idx[B]
        dist[i][j] = value       # A/B = value
        dist[j][i] = 1.0 / value # B/A = 1/value
    
    # Step 4: Floyd-Warshall to compute all pairs
    for k in range(n):          # Intermediate node
        for i in range(n):      # Source node  
            for j in range(n):  # Target node
                # If path i→k and k→j exists, compute i→j
                if dist[i][k] != -1.0 and dist[k][j] != -1.0:
                    if dist[i][j] == -1.0:
                        dist[i][j] = dist[i][k] * dist[k][j]
    
    # Step 5: Process queries
    results = []
    for C, D in queries:
        if C not in node_to_idx or D not in node_to_idx:
            results.append(-1.0)  # Undefined variable
        else:
            i, j = node_to_idx[C], node_to_idx[D]
            results.append(dist[i][j])
    
    return results

Line-by-Line Analysis:

Line Purpose Edge Case Handling
5-9 Collect unique variables Handles duplicate variables automatically
12-14 Initialize matrix Sets unknown relationships to -1.0
17-18 Self-relationships Ensures a/a = 1.0 for all variables
21-24 Direct edges Creates bidirectional edges A/B and B/A
27-32 Floyd-Warshall Only updates unknown paths, preserves known
37-41 Query processing Checks variable existence before lookup

Edge Case Mapping:

  • Example 1: [["a","a"],["x","x"]] → Handled by existence check (line 38)
  • Example 2: [["bc","cd"]] → Multi-character names handled naturally
  • Example 3: [["a","b"],["b","a"]] → Bidirectional edges ensure consistency

5. Complexity War Room

Hardware-Aware Analysis

  • Memory Usage: 40 variables → 40×40×8 bytes = 12.8KB (fits in L1 cache)
  • Cache Performance: Floyd-Warshall has poor locality but small dataset
  • Parallel Potential: Inner loops parallelizable for larger datasets

Industry Comparison Table

Approach Time Complexity Space Complexity Readability Interview Viability
Brute Force DFS O(Q(V+E)D)O(Q \cdot (V+E)^D) O(V+E)O(V+E) 9/10 ❌ Exponential worst-case
BFS Path Finding O(Q(V+E))O(Q \cdot (V+E)) O(V+E)O(V+E) 8/10 ✅ Good for clarity
Union-Find with Weights O((E+Q)α(V))O((E+Q) \cdot \alpha(V)) O(V)O(V) 6/10 ✅ Optimal but complex
Floyd-Warshall O(V3+Q)O(V^3 + Q) O(V2)O(V^2) 7/10 Recommended

Theoretical Bounds

  • Lower Bound: Ω(V2)\Omega(V^2) must store all pairs relationships
  • Upper Bound: O(V3)O(V^3) for Floyd-Warshall is optimal for dense graphs
  • Practical Limit: With V=40, easily handles maximum input size

6. Pro Mode Extras

Algorithm Variants

Variant 1: Multiple Transactions (Weighted Union-Find)

# For problems with frequent updates and queries
parent = {}; weight = {}

def find(x):
    if x not in parent:
        parent[x] = x; weight[x] = 1.0
        return x
    if parent[x] != x:
        root = find(parent[x])
        weight[x] *= weight[parent[x]]
        parent[x] = root
    return parent[x]

def union(x, y, value):
    rootX, rootY = find(x), find(y)
    if rootX != rootY:
        parent[rootX] = rootY
        weight[rootX] = weight[y] * value / weight[x]

Variant 2: Online Queries with Caching

# For streaming queries with memoization
cache = {}
def query(C, D):
    if (C, D) in cache: return cache[(C, D)]
    # Compute and cache result
    # Invalidate cache on new equations

Interview Cheat Sheet

Essential Talking Points:

  1. “This is essentially a graph path finding problem with multiplicative weights”
  2. “Floyd-Warshall gives us O(1) queries after O(V³) preprocessing”
  3. “We must handle bidirectional relationships and disconnected components”
  4. “The no-contradiction guarantee means all paths give the same result”

Common Follow-ups:

  • “How would you handle contradictions in the equations?”
  • “What if we could add new equations dynamically?”
  • “How to minimize floating point error accumulation?”

Optimization Priority:

  1. ✅ Check variable existence first
  2. ✅ Handle self-queries (a/a = 1.0)
  3. ✅ Use Floyd-Warshall for small V
  4. ✅ Consider Union-Find for dynamic updates

Red Flags to Avoid:

  • ❌ Not checking for undefined variables
  • ❌ Forgetting bidirectional edges
  • ❌ Using DFS without cycle prevention
  • ❌ Not handling self-query case