02 — Building IR with IRBuilder
cp-16 produced LLVM IR as text: strings concatenated into a .ll file,
which llc then parsed back into an in-memory Module. That round-trip is
fine for AOT (the textual form is great for debugging), but it's slow and lossy
for JIT.
In cp-17, ir_emit.cpp constructs the Module directly:
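LLVMContext ctx;          // owns types and constants
Module mod("test", ctx);  // top-level container for functions and globals
IRBuilder<> b(ctx);       // appends instructions at the current insertion point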
Every IR node is a C++ object. The IRBuilder tracks a current insertion point
(a BasicBlock) and appends instructions to it. Compare:
| operation | textual IR | IRBuilder call |
|---|---|---|
| add two i64 | %t = add i64 %a, %c | b.CreateAdd(a, c) |
| signed less-than | %t = icmp slt i64 %a, %c | b.CreateICmpSLT(a, c) |
| call printf | call i32 (ptr, ...) @printf(...) | b.CreateCall(fn, args) |
| return value | ret i64 %v | b.CreateRet(v) |
| branch | br label %L | b.CreateBr(L) |
| cond br | br i1 %c, label %T, label %F | b.CreateCondBr(c, T, F) |
The Value* that each Create* returns is the IR-level result you splice
into subsequent operations. You're building a directed graph of SSA values,
just with C++ syntax instead of .ll text.
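
To make the "graph of SSA values" idea concrete, here is a minimal sketch that uses only the standard library. ToyValue and ToyBuilder are hypothetical stand-ins, not LLVM's Value and IRBuilder; the only point is that each create call hands back a pointer that later calls take as an operand.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for llvm::Value: a named node that points at its operands.
struct ToyValue {
    std::string name;                       // SSA name without the leading '%'
    std::string opcode;                     // "arg", "add", "icmp slt"
    std::vector<const ToyValue*> operands;  // edges of the SSA graph
};

// Toy stand-in for IRBuilder: owns the nodes and hands back pointers to them.
class ToyBuilder {
public:
    const ToyValue* arg(const std::string& name) {
        return make(name, "arg", {});
    }
    const ToyValue* create_add(const ToyValue* lhs, const ToyValue* rhs,
                               const std::string& name) {
        assert(lhs && rhs);
        return make(name, "add", {lhs, rhs});
    }
    const ToyValue* create_icmp_slt(const ToyValue* lhs, const ToyValue* rhs,
                                    const std::string& name) {
        assert(lhs && rhs);
        return make(name, "icmp slt", {lhs, rhs});
    }

private:
    const ToyValue* make(std::string name, std::string opcode,
                         std::vector<const ToyValue*> operands) {
        nodes_.push_back(ToyValue{std::move(name), std::move(opcode),
                                  std::move(operands)});
        return &nodes_.back();
    }
    // push_back on a deque never moves existing elements, so the pointers
    // handed out earlier stay valid.
    std::deque<ToyValue> nodes_;
};

// Render a two-operand i64 instruction in textual-IR style.
std::string to_text(const ToyValue& v) {
    assert(v.operands.size() == 2);
    return "%" + v.name + " = " + v.opcode + " i64 %" + v.operands[0]->name +
           ", %" + v.operands[1]->name;
}
```

Calling create_add(a, c, "t") and then create_icmp_slt(t, c, "lt") links the second node to the first through the returned pointer. No name lookup is involved, and that is the property the real Value* gives you.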
Locals as alloca slots
We keep the simple model from earlier labs: every local is an alloca slot
named <name>.addr, loaded on read and stored on write. mem2reg (run by the
default LLJIT pipeline) promotes them to SSA registers, so the emitter never
has to track SSA names or phi nodes. Note that mem2reg only considers allocas
in the function's entry block, so that is where the slots should be created.
auto* slot = b.CreateAlloca(i64(), nullptr, "x.addr");  // reserve a stack slot for x
b.CreateStore(value, slot);                             // write: x = value
// later:
auto* v = b.CreateLoad(i64(), slot, "x");               // read: typed load from the slot
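
The sketch below shows the instruction stream the slot model produces for a statement such as x = x + 1. SlotEmitter is a hypothetical, standard-library-only toy that records textual lines (written with opaque-pointer ptr syntax, which is an assumption about the LLVM version); it is not how ir_emit.cpp is written. The thing to notice is that the only per-variable state is the slot, never a "current SSA name".

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy emitter: every local lives in one slot; reads load, writes store.
class SlotEmitter {
public:
    // Declare a local: one alloca named "<name>.addr".
    void declare(const std::string& name) {
        assert(slots_.count(name) == 0 && "local declared twice");
        slots_[name] = "%" + name + ".addr";
        lines_.push_back(slots_[name] + " = alloca i64");
    }
    // Write: store the value into the local's slot.
    void write(const std::string& name, const std::string& value) {
        lines_.push_back("store i64 " + value + ", ptr " + slots_.at(name));
    }
    // Read: load from the slot into a fresh temporary and return its name.
    std::string read(const std::string& name) {
        std::string tmp = fresh();
        lines_.push_back(tmp + " = load i64, ptr " + slots_.at(name));
        return tmp;
    }
    // Add two operands (temporaries or literals) into a fresh temporary.
    std::string add(const std::string& lhs, const std::string& rhs) {
        std::string tmp = fresh();
        lines_.push_back(tmp + " = add i64 " + lhs + ", " + rhs);
        return tmp;
    }
    const std::vector<std::string>& lines() const { return lines_; }

private:
    std::string fresh() { return "%t" + std::to_string(next_tmp_++); }

    std::map<std::string, std::string> slots_;  // local name -> slot name
    std::vector<std::string> lines_;            // emitted instructions
    int next_tmp_ = 0;
};
```

For x = 1; x = x + 1 the emitter calls declare("x"), write("x", "1"), then write("x", add(read("x"), "1")). The loads and stores look wasteful, but they are exactly what a promotion pass like mem2reg is designed to remove.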
Control flow via named basic blocks
For if/while we explicitly create blocks and stitch branches:
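auto* T = BasicBlock::Create(ctx, "then", fn);
auto* E = BasicBlock::Create(ctx, "else", fn);
auto* M = BasicBlock::Create(ctx, "end", fn);
b.CreateCondBr(cond_i1, T, E);
b.SetInsertPoint(T);
// emit `then` body...
if (!b.GetInsertBlock()->getTerminator()) b.CreateBr(M);
b.SetInsertPoint(E);
// emit `else` body...
if (!b.GetInsertBlock()->getTerminator()) b.CreateBr(M);
b.SetInsertPoint(M);  // code after the if continues here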
The terminator check matters: if the then body ended with return, the
block is already terminated and we must NOT append a second terminator (LLVM's
verifier will reject the module). That single rule is responsible for most of
the conditional if (!terminator) br calls in ir_emit.cpp.
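
The rule is small enough to model without LLVM. The sketch below is a hypothetical toy (ToyBlock and branch_if_open are illustration names, and instructions are plain strings) that appends the fall-through branch only when the block is still open.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy basic block: a label plus instructions kept as text.
struct ToyBlock {
    std::string label;
    std::vector<std::string> insts;

    // A block is terminated once its last instruction is a br or a ret.
    bool has_terminator() const {
        if (insts.empty()) return false;
        const std::string& last = insts.back();
        return last.rfind("br ", 0) == 0 || last.rfind("ret ", 0) == 0;
    }
};

// Append "br label %<to>" only if `from` has no terminator yet.
// Returns true when a branch was actually emitted.
bool branch_if_open(ToyBlock& from, const ToyBlock& to) {
    assert(!to.label.empty());
    if (from.has_terminator()) return false;  // e.g. the body ended in a return
    from.insts.push_back("br label %" + to.label);
    return true;
}
```

A then block that already ends in ret i64 %v is left untouched, while a block ending in ordinary arithmetic gets its br label %end. In the real emitter the same decision is made with getTerminator() on the current insert block.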
verifyModule
After emitting we call llvm::verifyModule. If it returns true, the IR is
malformed: dangling references, missing terminators, type mismatches, etc.
We capture the report and surface it as EmitResult::error. This is the
guardrail against bugs in your emitter. Catching a verifier error takes
milliseconds; tracking down a "JIT executed bad machine code" failure is a
debugger session at best.
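
The return convention is easy to get backwards, so here is a toy, standard-library-only sketch of the same shape: a check that returns true when something is broken and fills in a report, wrapped by a function that turns the report into an error result. ToyEmitResult, toy_verify and finish are hypothetical names, and the real EmitResult may look different. With the actual API, the report is typically captured by passing a raw_ostream (for example an llvm::raw_string_ostream) as verifyModule's second argument.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ToyBlock {
    std::string label;
    std::vector<std::string> insts;  // instructions kept as text
};

struct ToyEmitResult {
    bool ok;
    std::string error;  // verifier report when ok is false
};

static bool is_terminator(const std::string& inst) {
    return inst.rfind("br ", 0) == 0 || inst.rfind("ret ", 0) == 0;
}

// Same convention as verifyModule: returns true when the input is BROKEN.
bool toy_verify(const std::vector<ToyBlock>& fn, std::string& report) {
    bool broken = false;
    for (const ToyBlock& bb : fn) {
        if (bb.insts.empty() || !is_terminator(bb.insts.back())) {
            report += "block '" + bb.label + "' has no terminator\n";
            broken = true;
        }
        // Any terminator before the last instruction is also an error.
        for (std::size_t i = 0; i + 1 < bb.insts.size(); ++i) {
            if (is_terminator(bb.insts[i])) {
                report += "block '" + bb.label +
                          "' has a terminator in the middle\n";
                broken = true;
            }
        }
    }
    return broken;
}

// Run the check after emission and surface the report as an error.
ToyEmitResult finish(const std::vector<ToyBlock>& fn) {
    std::string report;
    if (toy_verify(fn, report)) {
        assert(!report.empty());  // a failed check always leaves a message
        return {false, report};
    }
    return {true, ""};
}
```

The toy only checks two rules, whereas the real verifier checks many more, but the control flow around it is the same: verify first, and refuse to hand anything to the JIT when the check reports a problem.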