Step 05 · Emitting LLVM IR
llvm_emit.cpp is the longest file in this lab, ~150 lines, because it covers nine constructs (literal, variable, neg, binop, cmp, call, if, while, return) plus the module preamble. Highlights:
Memory model
We use the classic mem2reg-friendly pattern: every variable gets an
alloca (a stack slot), and every read or write goes through load or store:
%x.addr = alloca i64
store i64 0, ptr %x.addr
%t = load i64, ptr %x.addr
The frontend never has to construct SSA form for mutable variables
itself. LLVM's mem2reg pass at -O1 and above promotes the allocas to
virtual registers and inserts phi nodes where needed. This separation
of concerns (frontend allocates, optimiser promotes) is one of the
most important architectural ideas in modern compilers.
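The pattern above can be sketched as a small text emitter. This is a minimal illustration, assuming the lab writes textual IR to a string stream; the `Emitter` struct, its method names, and the `%tN` temporary naming are hypothetical and not taken from llvm_emit.cpp.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: emits the alloca/store/load pattern as textual IR.
struct Emitter {
    std::ostringstream out;
    int nextTemp = 0;  // counter used to keep temporary names unique

    // Reserve a stack slot for a source-level variable.
    void declareVar(const std::string& name) {
        assert(!name.empty());
        out << "  %" << name << ".addr = alloca i64\n";
    }

    // Write an i64 operand (a constant such as "0" or an SSA value such
    // as "%t0") into the variable's slot.
    void storeVar(const std::string& name, const std::string& value) {
        out << "  store i64 " << value << ", ptr %" << name << ".addr\n";
    }

    // Read the slot into a fresh temporary and return the temporary's name.
    std::string loadVar(const std::string& name) {
        std::string temp = "%t" + std::to_string(nextTemp++);
        out << "  " << temp << " = load i64, ptr %" << name << ".addr\n";
        return temp;
    }
};
```

Calling `declareVar("x")`, `storeVar("x", "0")`, and `loadVar("x")` in that order reproduces the three-line snippet, with the loaded value named `%t0`. One design point worth keeping in mind: mem2reg only promotes allocas that sit in a function's entry block, so a fuller emitter would probably route all `declareVar` output there.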
Comparisons are i64-valued booleans
%cb = icmp slt i64 %a, %b
%v = zext i1 %cb to i64
Everything is i64, including booleans, so the i1 result of icmp is
widened with zext. An if (...) then narrows the value back to i1
via icmp ne i64 %v, 0. Wasteful? Yes. Compatible with the rest of
the language? Also yes. A real bool type would be cheaper but
requires propagating types through every operator.
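The widen-then-narrow round trip might look like this in the emitter. This is a sketch only: the function names `emitLessThan` and `emitCondBranch`, the `id` parameter, and the label names are assumptions made for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: emit a signed less-than and widen its i1 result to
// i64. `id` keeps the temporary names unique within the function.
std::string emitLessThan(std::ostringstream& out, const std::string& lhs,
                         const std::string& rhs, int id) {
    assert(id >= 0);
    std::string bit = "%cb" + std::to_string(id);
    std::string val = "%v" + std::to_string(id);
    out << "  " << bit << " = icmp slt i64 " << lhs << ", " << rhs << "\n";
    out << "  " << val << " = zext i1 " << bit << " to i64\n";
    return val;  // the i64-valued boolean (0 or 1)
}

// Hypothetical helper: narrow an i64 condition back to i1 by comparing it
// against zero, then branch on the result.
void emitCondBranch(std::ostringstream& out, const std::string& cond, int id,
                    const std::string& thenLabel,
                    const std::string& elseLabel) {
    std::string bit = "%c" + std::to_string(id);
    out << "  " << bit << " = icmp ne i64 " << cond << ", 0\n";
    out << "  br i1 " << bit << ", label %" << thenLabel
        << ", label %" << elseLabel << "\n";
}
```

Because the narrowing step tests for "not equal to zero", any non-zero i64 counts as true, which is what lets arbitrary integer expressions appear as conditions without a separate bool type.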
Functions
define i64 @add(i64 %arg0, i64 %arg1) {
entry:
  %a.addr = alloca i64
  store i64 %arg0, ptr %a.addr
  %b.addr = alloca i64
  store i64 %arg1, ptr %b.addr
  ...
  ret i64 0 ; fallback if no explicit return
}
Each parameter arrives as an SSA value (%arg0, %arg1) and is spilled
to an alloca named after the source-level parameter (%a.addr,
%b.addr). After mem2reg these vanish. The trailing ret i64 0
guarantees every function ends with a terminator even if the user
omits return: defensive but not wrong.
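A minimal sketch of the prologue and fallback return, assuming the body statements have already been rendered to IR text. The `emitFunction` name and its signature are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: emit a function whose parameters are spilled to
// allocas. `params` holds the source-level parameter names; `body` is
// already-emitted IR text for the function's statements.
std::string emitFunction(const std::string& name,
                         const std::vector<std::string>& params,
                         const std::string& body) {
    assert(!name.empty());
    std::ostringstream out;

    // Signature: incoming arguments are named %arg0, %arg1, ...
    out << "define i64 @" << name << "(";
    for (std::size_t i = 0; i < params.size(); ++i) {
        if (i > 0) out << ", ";
        out << "i64 %arg" << i;
    }
    out << ") {\nentry:\n";

    // Prologue: one stack slot per parameter, initialised from the argument.
    for (std::size_t i = 0; i < params.size(); ++i) {
        out << "  %" << params[i] << ".addr = alloca i64\n";
        out << "  store i64 %arg" << i << ", ptr %" << params[i] << ".addr\n";
    }

    out << body;
    out << "  ret i64 0\n}\n";  // fallback terminator if the body never returns
    return out.str();
}
```

`emitFunction("add", {"a", "b"}, "")` yields the skeleton shown above minus the elided body. Spilling every parameter keeps parameters and locals on the same load/store path, so the rest of the emitter does not need to distinguish them.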
Module preamble
target triple = "arm64-apple-macosx"
@.fmt = private unnamed_addr constant [6 x i8] c"%lld\0A\00"
declare i32 @printf(ptr, ...)
The hard-coded triple is for the macOS-on-ARM workstation this lab
was developed on. A portable driver would either (a) drop the triple
and let llc pick the host default, or (b) call llvm::sys::getDefaultTargetTriple()
via the LLVM-as-a-library route. We chose the explicit triple because
it documents the assumed target.
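Option (a) is cheap to support in a text emitter. The sketch below assumes a hypothetical `emitPreamble` function in which an empty triple string means "omit the line"; that convention is an illustration, not the lab's actual interface.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: emit the module preamble. An empty `triple` omits
// the target triple line, leaving the choice of target to the tool that
// consumes the IR.
std::string emitPreamble(const std::string& triple) {
    // A double quote inside the triple would break the IR string literal.
    assert(triple.find('"') == std::string::npos);
    std::ostringstream out;
    if (!triple.empty()) {
        out << "target triple = \"" << triple << "\"\n";
    }
    // "%lld", a newline (\0A), and the terminating NUL (\00): 6 bytes.
    out << "@.fmt = private unnamed_addr constant [6 x i8] c\"%lld\\0A\\00\"\n";
    out << "declare i32 @printf(ptr, ...)\n";
    return out.str();
}
```

`emitPreamble("arm64-apple-macosx")` reproduces the three lines above, while `emitPreamble("")` emits only the format string and the printf declaration.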
What we don't do
- SSA construction (mem2reg handles it).
- Register allocation (LLVM backend).
- Instruction selection (LLVM backend).
- Linking object files (clang invokes ld).
We're orchestrating, not reinventing. That's what "use LLVM" buys you.