Step 6 — CFG simplification

After constFold rewrites a cjmp to a jmp, one of the targets becomes unreachable. The instructions in that block — prints, function calls, everything — are still there in fn.blocks. They take no runtime time (nothing branches to them) but they bloat the IR, confuse later passes, and confuse humans reading the dump.

simplifyCFG is the pass that throws them away.

The algorithm

unordered_set<int> reachable;
vector<int> stack{ fn.blocks[0].id };
while (!stack.empty()) {
    int id = stack.back(); stack.pop_back();
    if (!reachable.insert(id).second) continue;
    // For each successor (jmp / cjmp / ret) of bb id, push onto stack.
}
// Drop blocks not in `reachable`.

Pure flood-fill from the entry block (always bb0). Successors come from inspecting the terminator of each block — the one place in the IR where control flow is encoded explicitly. That's the entire point of the basic-block model: control flow is structural, not embedded in straight-line instructions.

Other things simplifyCFG could do

We only do reachability pruning. Production-grade implementations also:

  • Block merging. If bb1 ends in jmp bb2 and bb2 has only bb1 as predecessor, merge them.
  • Empty-block bypass. If bb1's only instruction is jmp bb2, rewrite every predecessor of bb1 to target bb2 directly.
  • Branch threading. If bb1 ends in cjmp t, bb2, bb3 and bb2 is a known-constant block (e.g. cjmp false, ...), thread directly.
  • Hoisting common code. If both arms of a cjmp begin with the same instruction, hoist it before the branch.

LLVM's SimplifyCFG pass implements all of these and quite a few more.

A subtle correctness rule

Whenever you delete a block, you must also check whether any remaining block referenced it as a successor — and if so, that reference is now invalid. Our reachability flood guarantees this: nothing reaches a deleted block, by definition, so no remaining terminator points to one.

In a more aggressive pass (block merging), you must rewrite phi nodes' incoming-block lists when you merge predecessors. We don't have phi nodes, so we sidestep that landmine. Welcome to mem2reg in cp-10/11.

Renumbering, or not?

We don't renumber block ids after pruning. bb0, bb2, bb4 is fine in the printer — readers know it's a sparse sequence. If we did renumber, every bbT / bbF field in every terminator would need updating. That's fragile and offers no semantic benefit.

LLVM and Cranelift both keep sparse block numbering for the same reason.

Composition with other passes

loop {
    a = constFold(fn);       // may rewrite cjmp → jmp
    b = dce(fn);             // may drop now-unused temps
    c = simplifyCFG(fn);     // may drop now-unreachable blocks
    if (!a && !b && !c) break;
}

The order matters: constFold first (creates work for the others), then dce, then simplifyCFG. Swap any pair and you'll get the same fixed point eventually, but more iterations to reach it.

This kind of phase ordering is one of the long-standing research problems in compiler engineering — see Whitfield & Soffa (1997) for a formal treatment.