02 — Building IR with IRBuilder

cp-16 produced LLVM IR as text: strings were concatenated into a .ll file, which llc then parsed back into an in-memory Module. That round-trip is fine for AOT (the textual form is great for debugging), but it is slow and lossy for JIT.

In cp-17, ir_emit.cpp constructs the Module directly:

LLVMContext ctx;
Module      mod("test", ctx);
IRBuilder<> b(ctx);

Every IR node is a C++ object. The IRBuilder tracks a current insertion point (a BasicBlock) and appends instructions to it. Compare:

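| operation        | textual IR                          | IRBuilder call          |
|------------------|-------------------------------------|-------------------------|
| add two i64      | %t = add i64 %a, %c                 | b.CreateAdd(a, c)       |
| signed less-than | %t = icmp slt i64 %a, %c            | b.CreateICmpSLT(a, c)   |
| call printf      | call i32 (ptr, ...) @printf(...)    | b.CreateCall(fn, args)  |
| return value     | ret i64 %v                          | b.CreateRet(v)          |
| branch           | br label %L                         | b.CreateBr(L)           |
| cond br          | br i1 %c, label %T, label %F        | b.CreateCondBr(c, T, F) |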

The Value* that each Create* returns is the IR-level result you splice into subsequent operations. You're building a directed graph of SSA values, just with C++ syntax instead of .ll text.

Locals as alloca slots

We keep the simple model from earlier labs: every local is an alloca slot named <name>.addr, loaded on read and stored on write. mem2reg (run by the JIT's optimization pipeline) then promotes the slots to SSA registers, as long as each alloca is emitted in the function's entry block. This means the emitter never has to track SSA names or phi nodes.

auto* slot = b.CreateAlloca(i64(), nullptr, "x.addr");
b.CreateStore(value, slot);
// later:
auto* v = b.CreateLoad(i64(), slot);

Control flow via named basic blocks

For if/while we explicitly create blocks and stitch branches:

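auto* T = BasicBlock::Create(ctx, "then", fn);
auto* E = BasicBlock::Create(ctx, "else", fn);
auto* M = BasicBlock::Create(ctx, "end",  fn);
b.CreateCondBr(cond_i1, T, E);
b.SetInsertPoint(T);
// emit `then` body...
if (!b.GetInsertBlock()->getTerminator()) b.CreateBr(M);
b.SetInsertPoint(E);
// emit `else` body (same terminator check)...
if (!b.GetInsertBlock()->getTerminator()) b.CreateBr(M);
b.SetInsertPoint(M);  // code after the if continues here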

The terminator check matters: if the then body ended with return, the block is already terminated and we must NOT append a second terminator (LLVM's verifier will reject the module). That single rule is responsible for most of the conditional if (!terminator) br calls in ir_emit.cpp.

verifyModule

After emitting, we call llvm::verifyModule. It returns true when the IR is malformed: dangling references, missing terminators, type mismatches, and so on. We capture the report and surface it as EmitResult::error. This is the guardrail against bugs in your emitter: catching a verifier error takes milliseconds, while tracking down a "JIT executed bad machine code" failure is a debugger session at best.