In the last post, I presented some basic algorithms to operate on the dataflow dependency graph. I will assume now that you have both the flow graph, and the topological dependency list. (Recall the dependency list is a vector where each slot is a set of independent nodes. Understanding this is important.)

The next next algorithm we will discuss is *perform flow*. It takes a series of cells needing update, and follows along the flow graphs updating each, along with all cells which depend on them. The critical aspect of this algorithm is that is does no unnecessary work. No cell is visited twice. No cell is touched if a dependent did not change.

Here it is:

Requirements: A flow graph (fg) A topological dependency list (dl) A set of cells needing update (needed) For each d in dl If needed is empty, break out, you’re done For each cell in the intersection of needed and d Recompute the cell, if it isn’t a source cell If the cell changed add all its neighbors from the flow graph to needed. (Note: source cells always "changed" for this purpose) Remove cell from needed.

This iterates over the topological vector, visiting the cells at each level that need updating. If a cell is updated, those that immediately depend on it are added to the needed list (note, those new cells will never be in the current or an earlier strata of the dependency list, or else it was computed incorrectly). This continues until you run out of needed cells, or complete the dependency list (the latter should not occur if the first does not, although the algorithm does not check).

When you first establish your model, you will need to set all the source cells to their initial value, and the add all of them to the needed list, and run the algorithm. When you update a cell (or cells), set those that directly changed to the needed list, and also run the algorithm.

*Next: I will discuss the Clojure implementation of all of this.*