Step 05 · Emitting LLVM IR

llvm_emit.cpp is the longest file in this lab, ~150 lines, because it covers nine constructs (literal, variable, neg, binop, cmp, call, if, while, return) plus the module preamble. Highlights:

Memory model

We use the classic mem2reg-friendly pattern: every variable is an alloca and every read/write goes through load/store:

%x.addr = alloca i64
store i64 0, ptr %x.addr
%t = load i64, ptr %x.addr

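The frontend never has to compute SSA itself. At -O1 and above, LLVM promotes the allocas to virtual registers and inserts phi nodes where needed (the mem2reg transformation; in the standard optimisation pipelines SROA performs it). Promotion only considers allocas in the function's entry block, so that is where they should be emitted. This separation of concerns (frontend allocates, optimiser promotes) is one of the most important architectural ideas in modern compilers.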
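A minimal sketch of how an emitter that writes textual IR might produce these three instructions. The helper names (emitAlloca, emitStore, emitLoad) are hypothetical and not taken from llvm_emit.cpp; only the .addr suffix and the i64/ptr types come from the listing above.

```cpp
#include <string>

// Hypothetical helpers: each returns one line of textual LLVM IR for an
// i64 local variable whose stack slot is named "<var>.addr".

// Reserve a stack slot for the variable.
std::string emitAlloca(const std::string& var) {
    return "  %" + var + ".addr = alloca i64\n";
}

// Write `value` (a literal such as "0" or a register such as "%t1") to the slot.
std::string emitStore(const std::string& value, const std::string& var) {
    return "  store i64 " + value + ", ptr %" + var + ".addr\n";
}

// Read the slot into a fresh temporary named `tmp`.
std::string emitLoad(const std::string& tmp, const std::string& var) {
    return "  %" + tmp + " = load i64, ptr %" + var + ".addr\n";
}
```

Calling emitAlloca("x"), emitStore("0", "x") and emitLoad("t", "x") in sequence reproduces the listing above, indented by two spaces.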

Comparisons are i64-valued booleans

%cb = icmp slt i64 %a, %b
%v  = zext i1 %cb to i64

Everything is i64, including booleans. An if (...) condition then converts the value back to i1 with icmp ne i64 %v, 0. Wasteful? Yes. Compatible with the rest of the language? Also yes. A real bool type would be cheaper but requires propagating types through every operator.
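The round trip (i1 widened to i64 at the comparison, i64 narrowed back to i1 at the branch) could be sketched like this. The helper names, the ".bit" suffix for the intermediate i1, and the label names are all assumptions made for illustration.

```cpp
#include <string>

// Hypothetical helper: compare two i64 operands and widen the i1 result to
// i64, so the comparison yields an ordinary i64 value named `dst`.
// `pred` is an icmp predicate such as "slt", "eq" or "ne".
std::string emitCmp(const std::string& dst, const std::string& pred,
                    const std::string& lhs, const std::string& rhs) {
    const std::string bit = dst + ".bit";  // name of the intermediate i1
    return "  %" + bit + " = icmp " + pred + " i64 " + lhs + ", " + rhs + "\n"
         + "  %" + dst + " = zext i1 %" + bit + " to i64\n";
}

// Hypothetical helper: convert an i64 "boolean" back to i1 (non-zero means
// true) and branch on it. `tmp` names the i1; the labels omit the leading '%'.
std::string emitCondBr(const std::string& tmp, const std::string& value,
                       const std::string& thenLabel,
                       const std::string& elseLabel) {
    return "  %" + tmp + " = icmp ne i64 " + value + ", 0\n"
         + "  br i1 %" + tmp + ", label %" + thenLabel
         + ", label %" + elseLabel + "\n";
}
```

For example, emitCmp("v", "slt", "%a", "%b") followed by emitCondBr("c", "%v", "then", "else") produces the compare, the zext, the icmp ne against 0, and the conditional branch.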

Functions

define i64 @add(i64 %arg0, i64 %arg1) {
entry:
  %a.addr = alloca i64
  store i64 %arg0, ptr %a.addr
  %b.addr = alloca i64
  store i64 %arg1, ptr %b.addr
  ...
  ret i64 0      ; fallback if no explicit return
}

Each parameter is spilled to an alloca named after the source-level parameter (%arg0 goes to %a.addr, %arg1 to %b.addr). After mem2reg these allocas vanish. The trailing ret i64 0 guarantees every function ends with a terminator even if the user omits return: defensive but not wrong.
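As a sketch of how that prologue and fallback might be generated, here is a hypothetical emitFunction that takes the source-level parameter names and an already-emitted body. The function name and signature are assumptions; the %argN and .addr naming follows the listing above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: wrap `body` (already-emitted IR lines) in a function
// definition, spilling each incoming %argN to a stack slot named after the
// source-level parameter and appending the fallback return.
std::string emitFunction(const std::string& name,
                         const std::vector<std::string>& params,
                         const std::string& body) {
    std::string out = "define i64 @" + name + "(";
    for (std::size_t i = 0; i < params.size(); ++i) {
        if (i > 0) out += ", ";
        out += "i64 %arg" + std::to_string(i);
    }
    out += ") {\nentry:\n";

    // One alloca plus one store per parameter, all in the entry block.
    for (std::size_t i = 0; i < params.size(); ++i) {
        out += "  %" + params[i] + ".addr = alloca i64\n";
        out += "  store i64 %arg" + std::to_string(i)
             + ", ptr %" + params[i] + ".addr\n";
    }

    out += body;
    out += "  ret i64 0\n}\n";  // fallback if no explicit return
    return out;
}
```

With an empty body, emitFunction("add", {"a", "b"}, "") yields the skeleton shown in the listing, minus the elided middle.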

Module preamble

target triple = "arm64-apple-macosx"
@.fmt = private unnamed_addr constant [6 x i8] c"%lld\0A\00"
declare i32 @printf(ptr, ...)

The hard-coded triple is for the macOS-on-ARM workstation this lab was developed on. A portable driver would either (a) drop the triple and let llc pick the host default, or (b) call llvm::sys::getDefaultTargetTriple() via the LLVM-as-a-library route. We chose the explicit triple because it documents the assumed target.
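Option (a) is cheap to support in a text emitter. The sketch below assumes a hypothetical emitPreamble helper in which an empty triple argument means "omit the target triple line"; the format string and printf declaration are copied from the listing above.

```cpp
#include <string>

// Hypothetical helper: build the module preamble. Passing an empty `triple`
// leaves the "target triple" line out so the downstream tool can fall back
// to its host default.
std::string emitPreamble(const std::string& triple) {
    std::string out;
    if (!triple.empty()) {
        out += "target triple = \"" + triple + "\"\n";
    }
    // Raw string literals keep the IR escapes (\0A, \00) byte-for-byte.
    out += R"(@.fmt = private unnamed_addr constant [6 x i8] c"%lld\0A\00")";
    out += "\n";
    out += "declare i32 @printf(ptr, ...)\n";
    return out;
}
```

emitPreamble("arm64-apple-macosx") reproduces the three-line preamble above, and emitPreamble("") emits only the last two lines.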

What we don't do

  • SSA construction (mem2reg handles it).
  • Register allocation (LLVM backend).
  • Instruction selection (LLVM backend).
  • Linking object files (clang invokes ld).

We're orchestrating, not reinventing. That's what "use LLVM" buys you.