Step 5 — Lowering statements and control flow

Where expression lowering produced an Operand, statement lowering produces blocks. The recipe is always the same:

  1. allocate the blocks you'll need,
  2. emit a terminator into the current block to enter the structure,
  3. lower the body into the relevant blocks,
  4. emit terminators stitching them together,
  5. set currentBlock to the join block so the next statement continues there.

if / else

        ┌── cjmp (true) ──→ if.then ──jmp──→ if.cont
   pre ─┤                                       ↑
        └── cjmp (false) ─→ if.else ──jmp───────┘

Code:

void Builder::visit(IfStmt& s) {
    Operand cond = eval(*s.cond);

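    // Take each id as soon as the block is created: a reference returned by
    // newBlock() may be invalidated by the next newBlock() call if the
    // function's block storage reallocates.
    int thenId = fn().newBlock("if.then").id;
    int elseId = fn().newBlock(s.elseBranch ? "if.else" : "if.cont").id;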
    emitCondJump(cond, thenId, elseId, s.line);

    setBlock(thenId);
    s.thenBranch->accept(*this);
    bool thenTerm = currentBlockTerminated();

    if (s.elseBranch) {
        auto& cont = fn().newBlock("if.cont");
        int joinId = cont.id;
        if (!thenTerm)            emitJump(joinId, s.line);
        setBlock(elseId);
        s.elseBranch->accept(*this);
        if (!currentBlockTerminated()) emitJump(joinId, s.line);
        setBlock(joinId);
    } else {
        if (!thenTerm) emitJump(elseId, s.line);
        setBlock(elseId);   // elseB *is* the join when no else exists
    }
}

Two subtleties:

  1. When there is no else, elseB doubles as the join block, which is why it is named if.cont in that case. Allocating a separate, empty if.else that only jumps to the join would be harmless, but it adds noise to diff tests; reusing the block gives more natural printed IR.
  2. We only emit the jump to the join if the branch didn't already terminate (e.g. with return). Without this guard you'd get a ret followed by a jmp, which is malformed: a block has exactly one terminator, and it is the block's last instruction.
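
The one-terminator rule is easy to check mechanically. The following is a minimal sketch, assuming a stripped-down IR; Op, Instr, Block, wellFormed and emitJumpIfOpen are hypothetical names for illustration and not the tutorial's real types:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified IR (not the tutorial's real types).
enum class Op { Add, Print, Jump, CondJump, Return };

struct Instr { Op op; };

struct Block {
    std::string name;
    std::vector<Instr> instrs;
};

inline bool isTerminator(Op op) {
    return op == Op::Jump || op == Op::CondJump || op == Op::Return;
}

// True when the block already ends in a terminator.
inline bool terminated(const Block& b) {
    return !b.instrs.empty() && isTerminator(b.instrs.back().op);
}

// A block is well-formed when it is non-empty, its last instruction is a
// terminator, and no earlier instruction is one.
inline bool wellFormed(const Block& b) {
    if (b.instrs.empty()) return false;
    for (std::size_t i = 0; i + 1 < b.instrs.size(); ++i)
        if (isTerminator(b.instrs[i].op)) return false;
    return isTerminator(b.instrs.back().op);
}

// Guarded emit, mirroring `if (!currentBlockTerminated()) emitJump(...)`:
// the jump is appended only when the block is still open.
inline void emitJumpIfOpen(Block& b) {
    if (!terminated(b)) b.instrs.push_back({Op::Jump});
    assert(terminated(b));  // postcondition: the block now ends in a terminator
}
```

With this model, a then-branch that ends in Return is left alone by emitJumpIfOpen, while a branch that ends in Print receives the jump to the join. A block holding Return followed by Jump fails wellFormed, which is exactly the malformed shape the guard prevents.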

while

pre ──jmp──→ while.cond ──cjmp (true)──→ while.body ──jmp──→ while.cond (back-edge)
                  │
                  └──cjmp (false)──→ while.cont

Code:

void Builder::visit(WhileStmt& s) {
    // Take each id as soon as the block is created: a reference returned by
    // newBlock() may be invalidated by the next newBlock() call.
    int condId = fn().newBlock("while.cond").id;
    int bodyId = fn().newBlock("while.body").id;
    int contId = fn().newBlock("while.cont").id;

    emitJump(condId, s.line);            // pre  → cond

    setBlock(condId);
    Operand c = eval(*s.cond);           // (re-evaluated each iteration)
    emitCondJump(c, bodyId, contId, s.line);

    setBlock(bodyId);
    s.body->accept(*this);
    if (!currentBlockTerminated()) emitJump(condId, s.line);  // back-edge

    setBlock(contId);
}

The back-edge is what makes this a loop in CFG terms: an edge whose target dominates its source. cp-09's loop-detection pass will find it.
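
To make the dominance definition concrete, here is a small sketch that computes dominators with the textbook iterative algorithm and reports every edge whose target dominates its source. It is not cp-09's actual pass; the Cfg alias and the functions dominators and backEdges are assumptions made for this example, and blocks are identified by plain integer ids with block 0 as the entry.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CFG encoding: succ[b] lists the successor ids of block b.
// Block 0 is the entry block.
using Cfg = std::vector<std::vector<int>>;

// dom[b][d] == true means "d dominates b".
// Iterative data-flow: dom(entry) = {entry};
// dom(b) = {b} plus the intersection of dom(p) over all predecessors p.
inline std::vector<std::vector<bool>> dominators(const Cfg& succ) {
    const std::size_t n = succ.size();
    if (n == 0) return {};

    std::vector<std::vector<int>> pred(n);
    for (std::size_t b = 0; b < n; ++b)
        for (int s : succ[b]) {
            assert(s >= 0 && static_cast<std::size_t>(s) < n);
            pred[s].push_back(static_cast<int>(b));
        }

    // Start from "everything dominates everything" and shrink to a fixpoint.
    std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
    dom[0].assign(n, false);
    dom[0][0] = true;

    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t b = 1; b < n; ++b) {
            std::vector<bool> next(n, true);
            for (int p : pred[b])
                for (std::size_t d = 0; d < n; ++d)
                    next[d] = next[d] && dom[p][d];
            // A block without predecessors is dominated only by itself.
            if (pred[b].empty()) next.assign(n, false);
            next[b] = true;
            if (next != dom[b]) { dom[b] = next; changed = true; }
        }
    }
    return dom;
}

// Every edge (source, target) whose target dominates its source.
inline std::vector<std::pair<int, int>> backEdges(const Cfg& succ) {
    const auto dom = dominators(succ);
    std::vector<std::pair<int, int>> out;
    for (std::size_t b = 0; b < succ.size(); ++b)
        for (int s : succ[b])
            if (dom[b][s]) out.push_back({static_cast<int>(b), s});
    return out;
}
```

Numbering the while CFG as 0 = pre, 1 = while.cond, 2 = while.body, 3 = while.cont gives the successor lists {{1}, {2, 3}, {1}, {}}. The only edge reported is 2 → 1, the jump from the body back to the condition. The if/else diamond {{1, 2}, {3}, {3}, {}} has no back-edges, since neither branch dominates the join.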

block and scoping

void Builder::visit(BlockStmt& s) {
    beginScope();
    for (auto& st : s.body) st->accept(*this);
    endScope();
}

A block does not introduce its own basic block. Scope and block are orthogonal — a single { } may contain several BBs (because of an embedded if), and a single BB may span several { } (because the inner block had no control flow). Conflating the two is a beginner's mistake worth flagging.

Variables declared inside { } are recorded in a local scope stack used only for the isLocal predicate. No backing storage is emitted; the local is a named operand.
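
A scope stack of this kind can be very small. The sketch below is one possible shape, assuming names are tracked as strings; the ScopeStack class and its declare method are hypothetical, and only beginScope, endScope and isLocal correspond to names used in this step.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical scope tracker; the tutorial's Builder may store this differently.
class ScopeStack {
public:
    void beginScope() { scopes_.emplace_back(); }

    void endScope() {
        assert(!scopes_.empty() && "endScope without matching beginScope");
        scopes_.pop_back();
    }

    // Record a variable declared in the innermost open scope.
    void declare(const std::string& name) {
        assert(!scopes_.empty() && "declare outside any scope");
        scopes_.back().insert(name);
    }

    // True if any open scope, innermost first, declares the name.
    bool isLocal(const std::string& name) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it)
            if (it->count(name) != 0) return true;
        return false;
    }

private:
    std::vector<std::unordered_set<std::string>> scopes_;
};
```

Note that nothing here touches basic blocks: opening or closing a scope changes only which names isLocal accepts, which is the orthogonality described above.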

return

void Builder::visit(ReturnStmt& s) {
    if (ctx().isScript) { error(s.line, "'return' outside a function"); return; }
    Operand v = s.value ? eval(*s.value) : Operand::none();
    emitReturn(v, s.line);
}

We deliberately reject top-level returns here even though the resolver has probably already done so: defence in depth.

print

void Builder::visit(PrintStmt& s) {
    Operand v = eval(*s.expr);
    emit({Op::Print, Operand::none(), {v}, "", -1, -1, s.line});
}

print is the language's only built-in operation with a side effect, so it gets a dedicated opcode rather than going through Call. Treating it as Call @print would be cleaner but would force the interpreter and the LLVM backend to special-case the name later. A dedicated opcode is more honest.
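
The trade-off shows up in whatever consumes the IR. The toy executor below is a sketch under heavy assumptions (a two-opcode IR, integer payloads, and an Instr layout invented for this example; it is not the tutorial's interpreter). It handles both encodings side by side so the name-based special case is visible.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical two-opcode IR, for illustration only.
enum class Op { Print, Call };

struct Instr {
    Op op;
    std::string callee;  // used only by Op::Call
    int value;           // the single argument
};

// Runs the program and returns everything it printed.
inline std::string execute(const std::vector<Instr>& prog) {
    std::ostringstream out;
    for (const Instr& i : prog) {
        switch (i.op) {
        case Op::Print:
            // Dedicated opcode: the opcode alone identifies the operation.
            out << i.value << '\n';
            break;
        case Op::Call:
            // Call @print encoding: the backend has to compare the callee
            // name before it knows this is the built-in.
            if (i.callee == "print") out << i.value << '\n';
            // Any other callee would be dispatched to a user function here.
            break;
        }
    }
    return out.str();
}
```

Both encodings print the same text, but only the Call form needs the string comparison, and every backend would have to repeat it.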

fn declarations

A fn declaration opens a new Function, lowers its body, and queues the function into nestedFns_ to be appended to the module after the script. Nested fns (declared inside another fn) are rejected — they'd require closure capture, which lands in cp-12.
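
The queue-then-append behaviour can be sketched as follows. This is an illustration, not the Builder's code: FnQueue, declare and finish are hypothetical names, Function and Module are reduced to the bare minimum, and appending the queued functions in declaration order is an assumption.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, minimal stand-ins for the real IR containers.
struct Function { std::string name; };
struct Module   { std::vector<Function> functions; };

class FnQueue {
public:
    // Queues a lowered fn. A fn declared inside another fn is rejected,
    // because supporting it would require closure capture.
    bool declare(const std::string& name, bool insideFn) {
        if (insideFn) {
            errors.push_back("nested fn '" + name + "' is not supported");
            return false;
        }
        pending_.push_back(Function{name});
        return true;
    }

    // Called once the script body is lowered: the script goes first,
    // then the queued fns (assumed here to keep declaration order).
    Module finish(Function script) {
        Module m;
        m.functions.push_back(std::move(script));
        for (Function& f : pending_) m.functions.push_back(std::move(f));
        pending_.clear();
        return m;
    }

    std::vector<std::string> errors;

private:
    std::vector<Function> pending_;
};
```

Declaring add and mul at script level and then calling finish with the script function yields a module ordered script, add, mul; an attempted inner fn leaves one entry in errors and never reaches the module.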