Step 5 — Lowering statements and control flow
Where expression lowering produced an Operand, statement lowering
produces blocks. The recipe is always the same:
- allocate the blocks you'll need,
- emit a terminator into the current block to enter the structure,
- lower the body into the relevant blocks,
- emit terminators stitching them together,
- set currentBlock to the join block so the next statement continues there.
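The recipe can be sketched on a toy builder. Everything below (MiniBuilder, Instr, lowerEmptyIfElse) is a hypothetical, stripped-down stand-in for the real Builder: operands and line numbers are dropped, and newBlock returns an id rather than a reference. It only shows the order of the five steps, not the actual implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified IR: the real instructions also carry operands and lines.
enum class Op { Jump, CondJump, Ret, Other };

struct Instr {
    Op op;
    int target;      // Jump target, or CondJump "true" target (-1 if unused)
    int elseTarget;  // CondJump "false" target (-1 if unused)
};

struct Block {
    int id;
    std::string label;
    std::vector<Instr> instrs;
};

struct MiniBuilder {
    std::vector<Block> blocks;
    int current = 0;

    MiniBuilder() { blocks.push_back(Block{0, "entry", {}}); }

    // Returns an id, not a reference: push_back may reallocate the vector.
    int newBlock(const std::string& label) {
        int id = static_cast<int>(blocks.size());
        blocks.push_back(Block{id, label, {}});
        return id;
    }
    void setBlock(int id) { current = id; }

    // A block is terminated once its last instruction is a jump or return.
    bool currentBlockTerminated() const {
        const auto& is = blocks[current].instrs;
        return !is.empty() && is.back().op != Op::Other;
    }
    void emit(Instr i) {
        assert(!currentBlockTerminated() && "block already has a terminator");
        blocks[current].instrs.push_back(i);
    }
    void emitJump(int target) { emit(Instr{Op::Jump, target, -1}); }
    void emitCondJump(int t, int f) { emit(Instr{Op::CondJump, t, f}); }
    void emitReturn() { emit(Instr{Op::Ret, -1, -1}); }
};

// The five-step recipe applied to an if/else whose branches are both empty.
int lowerEmptyIfElse(MiniBuilder& b) {
    int thenId = b.newBlock("if.then");                   // 1. allocate
    int elseId = b.newBlock("if.else");
    int joinId = b.newBlock("if.cont");
    b.emitCondJump(thenId, elseId);                       // 2. enter the structure
    b.setBlock(thenId);                                   // 3. lower bodies (empty here)
    if (!b.currentBlockTerminated()) b.emitJump(joinId);  // 4. stitch
    b.setBlock(elseId);
    if (!b.currentBlockTerminated()) b.emitJump(joinId);
    b.setBlock(joinId);                                   // 5. continue at the join
    return joinId;
}
```

After the call, the builder holds four blocks and the current block is the empty join, ready to receive the next statement.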
if / else
     ┌─── cjmp cond ──→ if.then ──jmp── if.cont
pre ─┤                                   ↑
     └─── cjmp cond ──→ if.else ──jmp────┘
Code:
void Builder::visit(IfStmt& s) {
    Operand cond = eval(*s.cond);
    // Read each id right after allocation: a later newBlock() may invalidate
    // earlier references if block storage can reallocate.
    auto& thenB = fn().newBlock("if.then");
    int thenId = thenB.id;
    auto& elseB = fn().newBlock(s.elseBranch ? "if.else" : "if.cont");
    int elseId = elseB.id;
    emitCondJump(cond, thenId, elseId, s.line);
    setBlock(thenId);
    s.thenBranch->accept(*this);
    bool thenTerm = currentBlockTerminated();
    if (s.elseBranch) {
        auto& cont = fn().newBlock("if.cont");
        int joinId = cont.id;
        if (!thenTerm) emitJump(joinId, s.line);
        setBlock(elseId);
        s.elseBranch->accept(*this);
        if (!currentBlockTerminated()) emitJump(joinId, s.line);
        setBlock(joinId);
    } else {
        if (!thenTerm) emitJump(elseId, s.line);
        setBlock(elseId); // elseB *is* the join when no else exists
    }
}
Two subtleties:
- When there is no else, we reuse elseB as the join (if.cont). Wasting
  a block is harmless but ugly in diff tests, and merging them gives
  the more natural printed IR.
- We only emit the jump to the join if the branch didn't already
  terminate (e.g. with return). Without this guard you'd get a ret
  followed by a jmp, which is malformed: a block may have only one
  terminator.
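The one-terminator rule is easy to check mechanically. The sketch below is an assumption for illustration, not part of the Builder: a block is modelled as a bare list of opcodes, and wellFormed and append are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Add, Print, Jump, CondJump, Ret };  // hypothetical opcode set

inline bool isTerminator(Op op) {
    return op == Op::Jump || op == Op::CondJump || op == Op::Ret;
}

// A block is well formed when its last instruction is a terminator
// and no earlier instruction is one.
bool wellFormed(const std::vector<Op>& block) {
    if (block.empty() || !isTerminator(block.back())) return false;
    for (std::size_t i = 0; i + 1 < block.size(); ++i)
        if (isTerminator(block[i])) return false;
    return true;
}

// Appending after a terminator is a builder bug, so fail loudly in debug builds.
void append(std::vector<Op>& block, Op op) {
    assert((block.empty() || !isTerminator(block.back())) &&
           "block already terminated");
    block.push_back(op);
}
```

A then-branch that ends in return yields {Add, Ret}, which passes. Dropping the guard would produce {Ret, Jump}, which wellFormed rejects.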
while
pre ──jmp── while.cond ──cjmp── while.body ──jmp── while.cond (back-edge)
                │
                └──cjmp── while.cont
void Builder::visit(WhileStmt& s) {
    // As in the if lowering, read each id right after allocation in case a
    // later newBlock() invalidates earlier references.
    auto& condB = fn().newBlock("while.cond");
    int condId = condB.id;
    auto& bodyB = fn().newBlock("while.body");
    int bodyId = bodyB.id;
    auto& contB = fn().newBlock("while.cont");
    int contId = contB.id;
    emitJump(condId, s.line); // pre → cond
    setBlock(condId);
    Operand c = eval(*s.cond); // (re-evaluated each iteration)
    emitCondJump(c, bodyId, contId, s.line);
    setBlock(bodyId);
    s.body->accept(*this);
    if (!currentBlockTerminated()) emitJump(condId, s.line); // back-edge
    setBlock(contId);
}
The back-edge is what makes this a loop in CFG terms: an edge whose target dominates its source. cp-09's loop-detection pass will find it.
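To make the dominance definition concrete, here is a small sketch that finds back-edges with the textbook iterative dominator computation. It is not cp-09's pass. The Cfg alias and both function names are assumptions, block 0 is taken to be the entry, and every block is assumed reachable from it.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Cfg = std::vector<std::vector<int>>;  // succs[b] = successor block ids

// Iterative dominator sets: dom[b][d] is true when block d dominates block b.
std::vector<std::vector<bool>> dominators(const Cfg& succs) {
    assert(!succs.empty() && "need at least the entry block");
    const std::size_t n = succs.size();
    std::vector<std::vector<int>> preds(n);
    for (std::size_t u = 0; u < n; ++u)
        for (int v : succs[u]) preds[v].push_back(static_cast<int>(u));

    // Start from "everything dominates everything", except at the entry.
    std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
    dom[0].assign(n, false);
    dom[0][0] = true;

    for (bool changed = true; changed;) {
        changed = false;
        for (std::size_t b = 1; b < n; ++b) {
            // dom(b) = {b} ∪ intersection of dom(p) over all predecessors p
            std::vector<bool> next(n, true);
            for (int p : preds[b])
                for (std::size_t d = 0; d < n; ++d)
                    next[d] = next[d] && dom[p][d];
            next[b] = true;
            if (next != dom[b]) { dom[b] = next; changed = true; }
        }
    }
    return dom;
}

// An edge u → v is a back-edge when its target v dominates its source u.
std::vector<std::pair<int, int>> backEdges(const Cfg& succs) {
    auto dom = dominators(succs);
    std::vector<std::pair<int, int>> out;
    for (std::size_t u = 0; u < succs.size(); ++u)
        for (int v : succs[u])
            if (dom[u][v]) out.push_back({static_cast<int>(u), v});
    return out;
}
```

Numbering the while blocks pre = 0, while.cond = 1, while.body = 2, while.cont = 3 gives the graph {{1}, {2, 3}, {1}, {}}, and the only edge reported is 2 → 1, the jump from the body back to the condition.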
block and scoping
void Builder::visit(BlockStmt& s) {
    beginScope();
    for (auto& st : s.body) st->accept(*this);
    endScope();
}
A block does not introduce its own basic block. Scope and block
are orthogonal — a single { } may contain several BBs (because of an
embedded if), and a single BB may span several { } (because the
inner block had no control flow). Conflating the two is a beginner's
mistake worth flagging.
Variables declared inside { } are recorded in a local scope stack used
only for the isLocal predicate. No backing storage is emitted; the
local is a named operand.
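A scope stack of this kind can be very small. The class below is a hypothetical sketch: only beginScope, endScope and isLocal are names from the text, while declare and the set-of-names representation are assumptions. It records names and nothing else, matching the point that no storage is emitted.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical scope tracker: names only, no storage slots.
class ScopeStack {
public:
    void beginScope() { scopes_.emplace_back(); }

    void endScope() {
        assert(!scopes_.empty() && "endScope without beginScope");
        scopes_.pop_back();
    }

    void declare(const std::string& name) {
        assert(!scopes_.empty() && "declare outside any scope");
        scopes_.back().insert(name);
    }

    // Innermost-first search; true when any enclosing scope declares the name.
    bool isLocal(const std::string& name) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it)
            if (it->count(name)) return true;
        return false;
    }

private:
    std::vector<std::unordered_set<std::string>> scopes_;
};
```

Nothing here touches basic blocks: entering or leaving a scope changes only what isLocal answers, which is the orthogonality described above.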
return
void Builder::visit(ReturnStmt& s) {
    if (ctx().isScript) { error(s.line, "'return' outside a function"); return; }
    Operand v = s.value ? eval(*s.value) : Operand::none();
    emitReturn(v, s.line);
}
We deliberately reject top-level returns even though the resolver probably already did — defence in depth.
print
void Builder::visit(PrintStmt& s) {
    Operand v = eval(*s.expr);
    emit({Op::Print, Operand::none(), {v}, "", -1, -1, s.line});
}
print is the language's only built-in side effect, so it gets a
dedicated opcode rather than going through Call. Treating it as
Call @print would be cleaner but would force the interpreter and
the LLVM backend to special-case the name later. A dedicated opcode is
more honest.
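What the dedicated opcode buys a consumer can be seen in a toy dispatch loop. This is only an illustration, not the real interpreter: the Instr layout, the integer argument and the run function are all assumptions.

```cpp
#include <sstream>
#include <string>
#include <vector>

enum class Op { Print, Call };  // hypothetical two-opcode subset

struct Instr {
    Op op;
    std::string callee;  // used by Call only
    int arg;             // stand-in for a real operand
};

// Executes the instructions and returns everything that was printed.
std::string run(const std::vector<Instr>& code) {
    std::ostringstream out;
    for (const Instr& i : code) {
        switch (i.op) {
        case Op::Print:
            out << i.arg << '\n';  // the built-in has its own case
            break;
        case Op::Call:
            // User-defined functions would be dispatched here (omitted).
            // With print lowered as Call, this arm would first have to
            // compare i.callee against "print".
            break;
        }
    }
    return out.str();
}
```

The Print arm never inspects a callee name. That string comparison is exactly the special case the dedicated opcode keeps out of the interpreter and the backend.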
fn declarations
A fn declaration opens a new Function, lowers its body, and queues
the function into nestedFns_ to be appended to the module after the
script. Nested fns (declared inside another fn) are rejected — they'd
require closure capture, which lands in cp-12.