Step 4 — The Evaluator (Tree-Walking Interpreter)

Goal: given the AST, compute the number it represents.

Three Lines Per Node

double Evaluator::visit(NumberExpr& n) { return n.value; }

double Evaluator::visit(BinaryExpr& b) {
    double l = b.lhs->accept(*this);
    double r = b.rhs->accept(*this);
    switch (b.op) {
        case TokenKind::Plus:  return l + r;
        case TokenKind::Minus: return l - r;
        case TokenKind::Star:  return l * r;
        case TokenKind::Slash:
            if (r == 0.0) throw EvalError("division by zero");
            return l / r;
        default: throw EvalError("unknown binary operator");
    }
}

double Evaluator::visit(UnaryExpr& u) {
    double v = u.operand->accept(*this);
    switch (u.op) {
        case TokenKind::Minus: return -v;
        default: throw EvalError("unknown unary operator");
    }
}

That's the entire interpreter for arithmetic. ~25 lines.

Post-Order Traversal

BinaryExpr::visit evaluates both children first, then combines. This is post-order traversal — the only correct order for expression evaluation. (Pre-order would try to "+" before knowing what to "+".)

        BinaryExpr(*)
       /            \
  BinaryExpr(+)  NumberExpr(3)
   /        \
 (1)        (2)

Evaluation order (post-order):
  1.  NumberExpr(1)  → 1
  2.  NumberExpr(2)  → 2
  3.  BinaryExpr(+)  → 1 + 2 = 3
  4.  NumberExpr(3)  → 3
  5.  BinaryExpr(*)  → 3 * 3 = 9

In visit(BinaryExpr&), the two accept(*this) lines recursively evaluate the subtrees. The C++ call stack mirrors the tree shape — a 10-deep expression makes 10 stack frames.

What accept(*this) Does, Step By Step

double l = b.lhs->accept(*this);
  1. b.lhs is std::unique_ptr<Expr> — dereference gets Expr&.
  2. Expr::accept is virtual; dispatched to (say) NumberExpr::accept.
  3. That calls visitor.visit(*this) with *this = NumberExpr&.
  4. Overload resolution picks Evaluator::visit(NumberExpr&).
  5. Returns n.value.

For each node visit there's: 1 virtual call (accept), 1 overload resolution (statically resolved), 1 chain of work. The virtual call is the main cost of tree-walking and is exactly what bytecode VMs eliminate.

Errors Surface Bottom-Up

Division by zero at any depth throws EvalError, which unwinds through every accept/visit frame back to the top-level eval. The CLI catches it and prints a message.

This works because C++ exceptions propagate transparently through virtual calls. Manual error-handling — say, returning a sentinel — would require checking after every recursive call. Exceptions handle the "anywhere in this subtree" case for free.

Modern compilers like Clang largely don't use exceptions in their AST passes (the LLVM project disables them — -fno-exceptions). They use Expected<T> / ErrorOr<T> value-types that force explicit handling. We use exceptions here for simplicity; future labs will switch when the costs/benefits flip.

What Happens For 3 * -4?

Tokens: NUMBER(3) STAR MINUS NUMBER(4) EOF

Parser:

  • parseExprparseTerm
  • parseTermparseFactorNumberExpr(3)
  • sees *, consume
  • parseFactor → sees -, consume → parseFactor recurse → NumberExpr(4)UnaryExpr(-, 4)
  • builds BinaryExpr(*, NumberExpr(3), UnaryExpr(-, NumberExpr(4)))

Evaluation:

  • visit(BinaryExpr*):
    • left = visit(NumberExpr 3) = 3
    • right = visit(UnaryExpr-):
      • operand = visit(NumberExpr 4) = 4
      • return -4
    • return 3 * -4 = -12

Same shape as any other binary op — the unary minus is just one extra level of recursion.

Why Tree-Walking Is Slow

Per node visit:

  • 1 virtual dispatch (~10ns; branch predictor cold the first time)
  • 1 pointer-indirection to the next node (likely cache miss for big trees)
  • C++ function-call overhead (stack frame, saved registers)

For a 1000-instruction expression: ~30-50µs evaluating. A bytecode VM doing the same: ~5µs. Native code: ~1µs.

Numbers vary wildly by language and CPU; the ratio is consistent. We'll measure precisely in cp-06 when we have a bytecode VM to compare against.

Why It's Still Worth Building

For learning, tree-walkers win:

  • Smallest code surface — every concept is local.
  • Easy to debug: print the tree, watch it walk.
  • Easy to add features: new node type → new visit overload.

Tree-walkers are also adequate for many real workloads — config languages, shell scripts, build files, query languages. Performance-critical paths get bytecode (cp-06+); the rest stay tree-walked.

Try It

cd src/cpp
cmake -B build && cmake --build build
./build/eval "1 + 2 * 3"
# 7
./build/eval "((1 + 2) * (3 + 4) - 5) / 2"
# 8
./build/eval "1 / 0"
# division by zero

Next

05-repl-tests-and-cli.md — wire the front-end into a usable CLI + tests.