2017-01-09

Iterating two enumerable objects simultaneously

blogentry, programming, c, crackingthecoding

banner

I was solving a question on Cracking the Coding Interview Edition 6 question 4.1

Given a directed graph, design an algorithm to find out whether there is a route between two nodes.

My first attempt was a brute force approach of iterating all nodes in a graph. But then I thought about the possibility of having to deal with thousands or millions of nodes in a graph and my approach would not work.

1public bool HasRouteUsingDfs<T>(Node<T> node1, Node<T> node2)2{3	var nodeValues1 = GetValuesUsingDfs(node1).ToList();4	var nodeValues2 = GetValuesUsingDfs(node2).ToList();5
6	return nodeValues1.Contains(node2.Name) && nodeValues2.Contains(node1.Name);7}

where GetValuesUsingDfs(...)is just a graph iteration function using DFS (Depth-First Search) algorithm.

1private IEnumerable<T> GetValuesUsingDfs<T>(Node<T> node)2{3	var stack = new Stack<Node<T>>();4	stack.Push(node);5
6	while (stack.Count > 0)7	{8		var v = stack.Pop();9		if (!v.IsVisited)10		{11			v.IsVisited = true;12			yield return v.Name;13
14			foreach (Node<T> child in v.Children)15			{16				stack.Push(child);17			}18		}19	}20}

To be able to pull that off, I had to think of a way to run returned enumerators of GetValuesUsingDfs(...)  at the same time. Until now, I've been simply been using two nested for/foreach loops to accomplish such tasks. But ever since I've been conscious about Big O (where using two nested for/foreach would require B(N^2)), I was looking for a different way.

So the optimization technique I thought of was to iterate from both sides and if there is a common node between the two nodes, then there is a route. I can't exactly come up with any other striking idea since I am still reading the book, but I am sure that I will be able to apply more optimizations as I read the book.

Here is the code that utilizes simultaneous iteration.

1public bool HasRouteUsingDfs2(Node<int> node1, Node<int> node2)2{3	var nodes1 = GetValuesUsingDfs(node1);4	var nodes2 = GetValuesUsingDfs(node2);5
6	// Enumerate multiple lists one at a time using Depth-First Search7	using (var n1 = nodes1.GetEnumerator())8	using (var n2 = nodes2.GetEnumerator())9	{10		while (n1.MoveNext() && n2.MoveNext())11		{12			var nodeValue1 = n1.Current;13			var nodeValue2 = n2.Current;14
15			if (nodeValue1 == nodeValue2) return true;16		}17	}18
19	return false;20}

Now the function terminates as soon as a common node between the two nodes is found, which runs much faster for four of my test cases.  I've found simultaneous iteration code from StackOverflow as usual from this question answered by Eren Ersönmez (Answer is here; Eren also created a generic "ZipAll" function that accepts any number of IEnumerables in the answer.)

While four tests using HasRouteUsingDfs ran for 0.06 seconds, optimized version using HasRouteUsingDfs2 ran within 0.001 seconds.

Full source for above codes is available on GitHub. You'd need XUnit to run the tests.

Conclusion

It's not possible to iterate multiple enumerators using "foreach" but can use "GetEnumerator". I've been quite conscious about writing optimized code and this one just rocked since I never knew about this trick.