Step 1 — Why an optimising middle-end

A compiler that emits unoptimised IR straight to a backend gets you correct code; that is all. Everything good about a modern compiler — inlining, escape analysis, vectorisation, devirtualisation — happens in the middle-end, after the frontend has built an IR and before the backend lowers it to machine code.

The single most important pattern

IR_in  →  pass_1  →  IR_intermediate  →  pass_2  →  IR_intermediate' → ...

Each pass is a function IR → IR that preserves the program's observable behaviour. The middle-end's job is to thread enough passes that the IR that comes out the other side is small, dense, and easy for the backend.

A pass manager schedules and re-runs passes. LLVM's new pass manager (2020) tracks per-pass invalidation of analyses; we use a much simpler "run everything to a fixed point" loop. Both designs share the same invariants:

  • Passes do not change observable behaviour (no I/O reordering, no new observable allocations, no new exceptions).
  • Passes are robust: malformed input (e.g. a cjmp on a constant) should be cleaned up, not crashed on.
  • Passes are composable: running pass A then pass B should produce the same IR as running them as a single fused pass would.

Why ordering matters

Our pipeline runs constFold → dce → simplifyCFG. Each unlocks the next:

  • constFold rewrites cjmp t0, bb1, bb2 (with t0 constant) to jmp bb1. The bb2 block is now unreachable — but that fact isn't visible inside constFold.
  • simplifyCFG drops the unreachable block. The temps defined in it are now uses-of-nothing.
  • dce drops those orphaned defs.
  • Some of those defs were the only uses of upstream temps. So we run the whole pipeline again. Repeat to fixed point.

LLVM calls this the "phase-ordering problem" and ships hundreds of passes; finding a good static order is genuinely hard. Our loop is the brute-force version: keep going until nobody changes anything.

What we'll cover

  • Step 2 — How an IR interpreter works and why it's the semantic oracle.
  • Step 3 — SSA: what it is, why every modern compiler uses it, why we don't fully implement it here.
  • Step 4 — Constant folding + propagation, including the single-def temp substitution trick that approximates SSA.
  • Step 5 — Dead-code elimination and why purity matters.
  • Step 6 — CFG simplification.
  • Step 7 — The pass-manager loop and how to think about phase ordering.