Step 7 — Printer and debugging

A pretty-printer is the single highest-leverage tool in any compiler codebase. It is the difference between guessing what your IR looks like and seeing it. Every test in this lab compares the printed IR against expected substrings; every pass in cp-09+ will use the printer to log "before" / "after" snapshots.

The format

fn @<name>(<params>) {
bb0 (<label>):
    <instr>
    <instr>
    ...
bb1 (<label>):
    ...
}

  • Functions are introduced with fn @name(...) (the @ sigil mirrors LLVM globals).
  • Parameters are named operands: %a, %b.
  • Blocks render their label in parens for readability; the integer id is the primary identifier.
  • Instructions are indented four spaces.
  • No blank lines inside a function, one blank line between functions.
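
The layout rules above can be sketched in a few lines of C++. This is only an illustration, not the lab's printer: the Block and Function structs, the printFunction and printModule names, and the ", " parameter separator are all assumptions made for the example, and instructions are taken as already-rendered strings.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified IR shapes used only for this sketch.
struct Block {
    int id;                           // primary identifier, printed as bb<id>
    std::string label;                // readability hint, e.g. "entry"
    std::vector<std::string> instrs;  // instructions, already rendered as text
};

struct Function {
    std::string name;
    std::vector<std::string> params;  // parameter names without the % sigil
    std::vector<Block> blocks;
};

// Render one function: header, block headers, four-space-indented instructions.
std::string printFunction(const Function& fn) {
    std::string out = "fn @" + fn.name + "(";
    for (std::size_t i = 0; i < fn.params.size(); ++i) {
        if (i > 0) out += ", ";  // separator is an assumption
        out += "%" + fn.params[i];
    }
    out += ") {\n";
    for (const Block& bb : fn.blocks) {
        out += "bb" + std::to_string(bb.id) + " (" + bb.label + "):\n";
        for (const std::string& instr : bb.instrs) {
            out += "    " + instr + "\n";
        }
    }
    out += "}\n";
    return out;
}

// Render several functions: one blank line between them, none inside.
std::string printModule(const std::vector<Function>& fns) {
    std::string out;
    for (std::size_t i = 0; i < fns.size(); ++i) {
        if (i > 0) out += "\n";
        out += printFunction(fns[i]);
    }
    return out;
}
```

Building the whole string and returning it, rather than writing straight to a stream, keeps the printer easy to call from tests that compare against expected substrings.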

Operand syntax

Form          Notation
Temp          t<n>
Named         %<name>
Global ref    @<name>
Constant int  42
Constant str  "hello"
Constant nil  nil
None          _

Constants delegate to Value::toString(), the same formatter the cp-07 VM uses for print. That gives us a single source of truth for literal representation.
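
As a rough sketch of how the operand notation might map to code, the example below dispatches on an operand kind. The OperandKind enum, the Operand struct, and printOperand are hypothetical names invented for illustration. The constant case simply returns a pre-rendered literal, standing in for the call to Value::toString() described above.

```cpp
#include <string>

// Hypothetical operand representation; the lab's real IR may differ.
enum class OperandKind { Temp, Named, Global, Const, None };

struct Operand {
    OperandKind kind = OperandKind::None;
    int index = 0;     // used by Temp
    std::string text;  // name for Named/Global; pre-rendered literal for Const
};

// Map each operand form to its printed notation.
std::string printOperand(const Operand& op) {
    switch (op.kind) {
        case OperandKind::Temp:   return "t" + std::to_string(op.index);
        case OperandKind::Named:  return "%" + op.text;
        case OperandKind::Global: return "@" + op.text;
        case OperandKind::Const:  return op.text;  // stand-in for Value::toString()
        case OperandKind::None:   return "_";
    }
    return "_";  // not reached for valid kinds; avoids a missing-return warning
}
```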

Instruction syntax

t0 = add %x, 1                  binop with explicit dst, two srcs
%x  = 1                         move into named local
stg @x, t1                      store to global
t2  = ldg @x                    load from global
print t0                        side-effect, no dst
t0 = call @add(3, 4)            direct call
t0 = call <indirect>(t1)        indirect call (cp-12)
cjmp t0, bb3, bb4               conditional branch
jmp  bb5                        unconditional branch
ret  t0                         return with value
ret                             return without value

We chose = over := because it matches LLVM textual IR and reads more naturally. Comparisons render with mnemonic ops (lt, ge) rather than C-style symbols (<, >=), so the IR lowered from a source statement like print a < b is easy to tell apart from the source text itself.
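
A minimal sketch of the binop form, assuming a hypothetical BinOp enum that covers only the three operators named in this section. The mnemonic and printBinop helpers are illustrative names, and operands are passed in already rendered.

```cpp
#include <string>

// Hypothetical subset of binary operators, limited to those named in the text.
enum class BinOp { Add, Lt, Ge };

// Mnemonic spelling used in printed IR instead of C-style symbols.
const char* mnemonic(BinOp op) {
    switch (op) {
        case BinOp::Add: return "add";
        case BinOp::Lt:  return "lt";
        case BinOp::Ge:  return "ge";
    }
    return "?";  // not reached for valid operators
}

// Render "dst = op lhs, rhs" from already-printed operands.
std::string printBinop(const std::string& dst, BinOp op,
                       const std::string& lhs, const std::string& rhs) {
    return dst + " = " + mnemonic(op) + " " + lhs + ", " + rhs;
}
```

With this shape, a comparison of %a and %b into t1 prints as t1 = lt %a, %b, with no < symbol anywhere in the line.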

Why this matters

When cp-09's mem2reg pass turns

%x = 1
%x = add %x, 1
print %x

into

t10 = 1
t11 = add t10, 1
print t11

we want that diff to be a one-line change in a golden test. String-level printer assertions are a coarse tool but they catch regressions in lowering exactly when humans care about them — when the printed IR changes shape.
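
One way such a substring assertion could look is sketched below. The missingSubstrings helper is a hypothetical name, not part of the lab's test harness. It returns the expected fragments that are absent from the printed IR, so a failing test can report exactly which lines changed shape.

```cpp
#include <string>
#include <vector>

// Return every expected fragment that does not occur in the printed IR.
// An empty result means the golden check passes.
std::vector<std::string> missingSubstrings(const std::string& printed,
                                           const std::vector<std::string>& expected) {
    std::vector<std::string> missing;
    for (const std::string& fragment : expected) {
        if (printed.find(fragment) == std::string::npos) {
            missing.push_back(fragment);
        }
    }
    return missing;
}
```

Run against the mem2reg output shown above, a check for t11 = add t10, 1 and print t11 passes, while a stale expectation of %x = 1 comes back as missing.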

Debugging tactics

  • Pipe through mltac. echo '...' | ./build/mltac is the fastest feedback loop for "what does this lower to?".
  • Look at the unreachable blocks. Stray unreachable: blocks in the output often indicate the lowering forgot to advance to a join block — a sign of a missing setBlock(joinId).
  • Check the label hints. if.cont, while.body, and.join are deliberately chosen to make IR readable in the absence of source lines. If you see unreachable where you expected if.cont, the order of operations is off.
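
The unreachable-block check can also be automated. The sketch below is a hypothetical helper, not an existing mltac feature. It assumes block headers are unindented lines ending in a colon, as in the format section, and collects those whose text mentions unreachable.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Collect block header lines that mention "unreachable".
// Assumes headers are unindented and end with ':'; instructions are indented.
std::vector<std::string> unreachableHeaders(const std::string& printed) {
    std::vector<std::string> hits;
    std::istringstream in(printed);
    std::string line;
    while (std::getline(in, line)) {
        const bool isHeader =
            !line.empty() && line.front() != ' ' && line.back() == ':';
        if (isHeader && line.find("unreachable") != std::string::npos) {
            hits.push_back(line);
        }
    }
    return hits;
}
```

A non-empty result is a prompt to look at the lowering for a missing setBlock(joinId), not proof of a bug on its own.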

The printer has no semantic content — it's pure formatting. But it is the most-read file in the IR layer. Spend time on it. Future you will be grateful.