Computational Graphs

A spreadsheet sits in front of you. Cell A1 holds 5. Cell B1 holds the formula A1 * 2. Cell C1 holds B1 + A1. Change A1 to 7 and the sheet recomputes B1 and C1 in the right order without you lifting a finger. That recalculation engine is a directed acyclic graph — a DAG — and it is the fourth material in the workshop. You have numbers, you have rates of change, you have uncertainty. You did not have a structure that holds them all together. The DAG does. Every neural network framework on earth runs on one underneath.

Robert Wengert wrote the first paper describing this idea in 1964 at the Naval Ordnance Laboratory. He was trying to compute derivatives of large expressions on a computer with a few kilobytes of memory and noticed that if you broke the expression into a graph of small operations, you could walk the graph forward to evaluate it and walk it backward to differentiate it. The idea sat almost unused for 40 years. Then in 2007 a team at MILA in Montreal built Theano, the first practical computational graph library for deep learning. You wrote a Python expression, Theano built a graph, and a compiler turned the graph into fast GPU code. Every framework that came after — TensorFlow in 2015, PyTorch with dynamic graphs in 2017 that won the framework war by letting you build and modify the graph on the fly, JAX in 2018 taking the same idea functional — is the same DAG with a different skin.

A three-cell spreadsheet redrawn as a directed acyclic graph: A1 feeds B1, A1 and B1 feed C1.

A graph is a set of nodes connected by edges. Directed means every edge has an arrow — A points to B, not the other way around. Acyclic means you cannot follow the arrows in a loop and end up back where you started. A spreadsheet enforces this rule. If A1 depended on B1 and B1 depended on A1, neither cell could ever finish computing. The sheet would refuse and flash an error. The same rule applies to neural networks, which is why the data flow always moves one direction from input to output during the forward pass.

Start with a single cell. A cell has a name like A1, a raw value the user typed (a number or a formula string), a computed value after evaluation, and a list of cells whose formulas mention it.

class Cell:
    def __init__(self, name):
        self.name = name
        self.raw = ""
        self.value = 0.0
        self.dependents = set()
 
    def __repr__(self):
        return f"{self.name}={self.value}"

The dependents set is the answer to "if I change, who needs to be told." That set is built by reading each formula and writing down which cells it references. To do that you need a parser. A formula like A1 * 2 + B1 is a string; the engine cannot multiply a string. The parser turns the string into a tiny tree of operations.

Tokenize first. Walk the string character by character and emit a list of tokens — numbers, cell names, and operators.

def tokenize(formula):
    tokens = []
    i = 0
    while i < len(formula):
        c = formula[i]
        if c.isspace():
            i += 1
            continue
        if c in "+-*/()":
            tokens.append(c)
            i += 1
            continue
        if c.isalpha():
            j = i
            while j < len(formula) and formula[j].isalnum():
                j += 1
            tokens.append(formula[i:j])
            i = j
            continue
        if c.isdigit() or c == ".":
            j = i
            while j < len(formula) and (formula[j].isdigit() or formula[j] == "."):
                j += 1
            tokens.append(formula[i:j])
            i = j
            continue
        raise ValueError(f"unknown character: {c}")
    return tokens
 
print(tokenize("A1 * 2 + B1"))

['A1', '*', '2', '+', 'B1']

Five tokens, no spaces, ready to parse. The parser turns those tokens into a tree where each node is either a number, a cell reference, or a binary operation with a left child and a right child. The tree for A1 * 2 + B1 looks like an addition at the root, a multiplication on its left, and the cell B1 on its right.

class Num:
    def __init__(self, value):
        self.value = value
 
    def eval(self, sheet):
        return self.value
 
    def refs(self):
        return set()
 
class Ref:
    def __init__(self, name):
        self.name = name
 
    def eval(self, sheet):
        return sheet.get(self.name).value
 
    def refs(self):
        return {self.name}
 
class BinOp:
    def __init__(self, op, left, right):
        self.op = op
        self.left = left
        self.right = right
 
    def eval(self, sheet):
        a = self.left.eval(sheet)
        b = self.right.eval(sheet)
        if self.op == "+": return a + b
        if self.op == "-": return a - b
        if self.op == "*": return a * b
        if self.op == "/": return a / b
        raise ValueError(self.op)
 
    def refs(self):
        return self.left.refs() | self.right.refs()

Three node types, each with eval (compute the value given the sheet) and refs (return the set of cell names this node touches). Build the tree with a small recursive parser that respects operator precedence — multiply and divide bind tighter than plus and minus.

def parse(tokens):
    pos = [0]
 
    def peek():
        return tokens[pos[0]] if pos[0] < len(tokens) else None
 
    def take():
        tok = tokens[pos[0]]
        pos[0] += 1
        return tok
 
    def atom():
        tok = take()
        if tok == "(":
            node = expr()
            assert take() == ")"
            return node
        if tok[0].isalpha():
            return Ref(tok)
        return Num(float(tok))
 
    def term():
        node = atom()
        while peek() in ("*", "/"):
            op = take()
            node = BinOp(op, node, atom())
        return node
 
    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = take()
            node = BinOp(op, node, term())
        return node
 
    return expr()

Now for the dependency graph itself. The graph holds an edge from cell A to cell B if B's formula mentions A. When A changes, every cell B that depends on A must recompute, then every cell that depends on those, and so on. The order matters — recompute C1 before B1 and C1 will still see the old B1 value. The fix is topological sort, which produces a linear ordering of nodes such that every edge points forward in the list.

Kahn's algorithm from 1962 is the cleanest way to do it. Start with nodes that have no incoming edges (nothing depends on them being computed first). Repeatedly take such a node, add it to the output list, and remove its outgoing edges. If you process every node, you have a valid order. If you get stuck with edges left over, the graph has a cycle and is not a DAG.

def topological_sort(graph, start):
    incoming = {node: 0 for node in graph.reachable_from(start)}
    for node in incoming:
        for child in graph.children_of(node):
            if child in incoming:
                incoming[child] += 1
    ready = [n for n, count in incoming.items() if count == 0]
    order = []
    while ready:
        node = ready.pop(0)
        order.append(node)
        for child in graph.children_of(node):
            if child in incoming:
                incoming[child] -= 1
                if incoming[child] == 0:
                    ready.append(child)
    if len(order) != len(incoming):
        raise ValueError("cycle detected")
    return order

Wire it together in a Spreadsheet that owns all the cells and routes every change through the graph.

class Spreadsheet:
    def __init__(self):
        self.cells = {}
 
    def get(self, name):
        if name not in self.cells:
            self.cells[name] = Cell(name)
        return self.cells[name]
 
    def set_cell(self, name, raw):
        cell = self.get(name)
        for other in self.cells.values():
            other.dependents.discard(name)
        cell.raw = raw
        if raw.startswith("="):
            tree = parse(tokenize(raw[1:]))
            cell.formula = tree
            for ref in tree.refs():
                self.get(ref).dependents.add(name)
        else:
            cell.formula = Num(float(raw))
        order = self.recalc_order(name)
        print(f"recalc order: {' -> '.join(order)}")
        for n in order:
            target = self.get(n)
            old = target.value
            target.value = target.formula.eval(self)
            print(f"  {n}: {old} -> {target.value}")

recalc_order does the topological sort starting from the changed cell and walking through dependents.

Try it on a small sheet. Set A1 to 5, B1 to =A1 * 2, C1 to =B1 + A1, then change A1 to 7. The recompute should fire B1 first, then C1.

recalc order: A1 -> B1 -> C1
  A1: 5.0 -> 7.0
  B1: 10.0 -> 14.0
  C1: 15.0 -> 21.0

A1 moves first because it is the source. B1 doubles. C1 sums the new B1 and the new A1 and lands at 21. Try the wrong order in your head — if C1 ran before B1, it would compute 10 + 7 = 17, off by 4. Topological order is the only safe order.

What stops a user from typing C1 = =A1 and then A1 = =C1? Nothing on the way in. The cycle detector inside topological_sort catches it on the way out — when there are nodes left with non-zero incoming counts, no order exists, and the engine raises. The spreadsheet refuses the edit and explains why.

Kahn's algorithm walking a small DAG: peel off zero-incoming nodes one at a time until a flat order remains.

The interactive loop is short. Read a line, parse name=value, call set_cell, repeat.

sheet = Spreadsheet()
while True:
    line = input("> ").strip()
    if not line: continue
    if line == "quit": break
    name, _, raw = line.partition("=")
    sheet.set_cell(name.strip(), raw.strip())

Type A1=5, then B1==A1*2, then C1==B1+A1, then change A1=7, and watch the chain fire. Type A1==C1 after that and the cycle detector throws.

This pattern is the entire architecture of a deep learning framework. A neural network is a set of operations on tensors — matrix multiply, addition, activation — wired into a DAG. The forward pass walks the graph in topological order from inputs to output, the same way the spreadsheet walks cells in topological order from a changed cell to its furthest dependent. The backward pass walks the same graph in reverse to compute gradients via the chain rule, which is exactly what Wengert sketched in 1964 and Theano shipped in 2007. The cells are tensors. The formulas are layers. The recalculation order is the forward pass. Every framework you will ever touch is some variation of what you just built.

You have all four materials. Time to forge the first part of the brain — a single neuron.