Step 5 — Lowering arithmetic and comparisons

Binary integer operations

The mapping is essentially one-to-one with TAC:

TAC Op	LLVM instruction
`Add`	`add i64 %a, %b`
`Sub`	`sub i64 %a, %b`
`Mul`	`mul i64 %a, %b`
`Div`	`sdiv i64 %a, %b`
`Mod`	`srem i64 %a, %b`
`And`	`and i64 %a, %b`
`Or`	`or i64 %a, %b`
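As a rough illustration, the table can be driven by a plain lookup. This is a minimal sketch in Python that emits textual IR; the names `BINOP_TO_LLVM` and `lower_binop` are hypothetical and not part of the tutorial's compiler.

```python
# Hypothetical lookup table mirroring the TAC-to-LLVM mapping above.
BINOP_TO_LLVM = {
    "Add": "add",
    "Sub": "sub",
    "Mul": "mul",
    "Div": "sdiv",  # signed division
    "Mod": "srem",  # signed remainder
    "And": "and",
    "Or": "or",
}


def lower_binop(op: str, dest: str, lhs: str, rhs: str) -> str:
    """Return one line of textual LLVM IR for a binary TAC instruction.

    `dest`, `lhs` and `rhs` are already-formatted operands such as "%v0".
    An unknown `op` raises KeyError.
    """
    mnemonic = BINOP_TO_LLVM[op]
    return f"{dest} = {mnemonic} i64 {lhs}, {rhs}"


print(lower_binop("Div", "%v0", "%a", "%b"))
```

Keeping the mapping in data rather than in a chain of branches makes it easy to audit against the table.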

Signed vs unsigned: we use sdiv and srem (signed) because a MiniLang Number can be negative, so we read every i64 as a signed value. The s/u/f prefix on LLVM arithmetic is a frequent source of bugs:

add — no prefix; signedness doesn't matter (two's complement).
mul — no prefix; same reason.
sdiv / udiv — different result for negative operands.
srem / urem — likewise.
fadd / fmul / fdiv — floating point.
shl — no prefix; lshr (logical) / ashr (arithmetic) for right shift.
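To see why the prefix matters for division but not for addition, here is a small Python model of 64-bit arithmetic on raw bit patterns. It is a sketch only: the helper names are hypothetical, and division by zero and the INT_MIN / -1 case (both undefined behaviour for LLVM's sdiv) are not modelled.

```python
MASK = (1 << 64) - 1  # keep only the low 64 bits


def to_signed(bits: int) -> int:
    """Reinterpret a 64-bit pattern as a two's complement signed integer."""
    bits &= MASK
    return bits - (1 << 64) if bits >> 63 else bits


def add(a: int, b: int) -> int:
    """Wrapping add: the same bits whether operands are signed or unsigned."""
    return (a + b) & MASK


def sdiv(a: int, b: int) -> int:
    """Signed division, truncating toward zero as LLVM's sdiv does."""
    x, y = to_signed(a), to_signed(b)
    q = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        q = -q
    return q & MASK


def udiv(a: int, b: int) -> int:
    """Unsigned division on the raw bit patterns."""
    return (a & MASK) // (b & MASK)


minus_eight = -8 & MASK  # bit pattern of -8 as an i64
print(to_signed(sdiv(minus_eight, 2)))  # -4
print(udiv(minus_eight, 2))             # 9223372036854775804
print(to_signed(add(minus_eight, 3)))   # -5
```

The same bit pattern gives two very different quotients, while the sum is identical under either reading.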

The nsw / nuw flags (no signed wrap / no unsigned wrap) on arithmetic let the optimiser assume overflow does not happen: if it does, the result is a poison value rather than a wrapped one. We don't emit them, to stay conservative, but a real frontend should derive them from the source language's overflow semantics.

Unary

Neg becomes sub i64 0, %a. There is no dedicated integer neg instruction (floating point has fneg, which we don't use here).
Not (boolean negation) becomes icmp eq i64 %a, 0 followed by zext i1 ... to i64.
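The two unary lowerings could be emitted along these lines. This is a sketch under assumptions: `lower_unary` is a hypothetical helper, and the caller is assumed to supply a fresh temporary name for the intermediate i1.

```python
def lower_unary(op: str, dest: str, src: str, tmp: str) -> list[str]:
    """Return the textual LLVM IR lines for a unary TAC instruction.

    `tmp` is a fresh temporary supplied by the caller; it is only used
    by Not, which needs an intermediate i1 before widening to i64.
    """
    if op == "Neg":
        # Integer negation is expressed as 0 - x.
        return [f"{dest} = sub i64 0, {src}"]
    if op == "Not":
        # x == 0 gives an i1; zext widens it back to the uniform i64.
        return [
            f"{tmp} = icmp eq i64 {src}, 0",
            f"{dest} = zext i1 {tmp} to i64",
        ]
    raise ValueError(f"unknown unary op: {op}")


for line in lower_unary("Not", "%v1", "%a", "%t0"):
    print(line)
```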

Comparisons

%v0 = icmp slt i64 %a, %b       ; signed less-than
%v1 = zext i1   %v0 to i64

icmp returns i1. To use the result as our uniform i64 value, we zext (zero-extend) it to i64. If we stored booleans as i1 throughout we wouldn't need the zext, but every arithmetic use of a boolean would then have to widen it to i64 first.

The condition mnemonics:

TAC	LLVM `icmp` cond
`Eq`	`eq`
`Ne`	`ne`
`Lt`	`slt`
`Le`	`sle`
`Gt`	`sgt`
`Ge`	`sge`

The s prefix is for signed comparison. ult, ule, etc. are unsigned. eq and ne have no signed or unsigned variant because they don't need one: bitwise equality is the same either way.
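Putting the predicate table together with the zext step, a comparison lowering might look like the following sketch. `CMP_TO_PRED` and `lower_compare` are hypothetical names, and the caller is assumed to provide a fresh temporary for the i1 result.

```python
# Hypothetical table of signed icmp predicates, one per TAC comparison.
CMP_TO_PRED = {
    "Eq": "eq",
    "Ne": "ne",
    "Lt": "slt",
    "Le": "sle",
    "Gt": "sgt",
    "Ge": "sge",
}


def lower_compare(op: str, dest: str, lhs: str, rhs: str, tmp: str) -> list[str]:
    """Emit icmp (producing an i1 in `tmp`) followed by zext into `dest`."""
    pred = CMP_TO_PRED[op]  # raises KeyError for an unknown op
    return [
        f"{tmp} = icmp {pred} i64 {lhs}, {rhs}",
        f"{dest} = zext i1 {tmp} to i64",
    ]


for line in lower_compare("Lt", "%v1", "%a", "%b", "%v0"):
    print(line)
```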

The zext / trunc dance

icmp always produces i1. Stores and arithmetic want i64, because that is our uniform value type. A conditional branch wants i1 again.

%v0 = icmp slt i64 %a, %b      ; i1
%v1 = zext i1   %v0 to i64     ; i64

; ... later, used as a branch condition: ...
%v2 = icmp ne i64 %v1, 0       ; back to i1
br i1 %v2, label %T, label %F

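This back-and-forth is what you pay for using i64 as the uniform value type. LLVM's instcombine pass folds the common pattern, where %T is the original i1:

icmp ne (zext i1 %T to i64), 0   →   %T

So after opt -O1 the i64 round-trip in a case like this one disappears and the branch uses the original comparison directly.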
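The fold is sound because widening an i1 and then testing it against zero gives back the original bit. A tiny Python model makes that concrete; the function names are hypothetical and only mimic the two instructions, not LLVM's actual pass.

```python
def zext_i1_to_i64(bit: bool) -> int:
    """Model of `zext i1 %T to i64`: false becomes 0, true becomes 1."""
    return 1 if bit else 0


def icmp_ne_zero(value: int) -> bool:
    """Model of `icmp ne i64 %v, 0`."""
    return value != 0


def round_trip(bit: bool) -> bool:
    """i1 -> i64 -> i1, as in the branch sequence above."""
    return icmp_ne_zero(zext_i1_to_i64(bit))


# Exhaustive check over both possible i1 values.
print(all(round_trip(t) == t for t in (False, True)))
```

Since an i1 has only two values, the exhaustive check covers the whole input space of the pattern.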

Why we don't use `fadd` / `fmul`

MiniLang numbers are doubles in the interpreter, but we lower to i64 for simplicity. To handle floats properly:

Pick double as the uniform type instead of i64.
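Replace add → fadd, sub → fsub, mul → fmul, sdiv → fdiv, srem → frem, icmp → fcmp.
fcmp predicates have an ordered/unordered distinction (oeq, ueq, olt, ult, ...) because a comparison involving NaN is neither less, equal, nor greater: ordered predicates yield false when either operand is NaN, unordered ones yield true.
Print with a printf format such as %g or %f (for printf, %lf means the same as %f).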
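The ordered/unordered distinction is easiest to see on a NaN operand. Below is a small Python model of two fcmp predicates; the function names are hypothetical and simply mirror the LLVM predicate mnemonics.

```python
import math


def fcmp_olt(a: float, b: float) -> bool:
    """Ordered less-than: false if either operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        return False
    return a < b


def fcmp_ult(a: float, b: float) -> bool:
    """Unordered-or-less-than: true if either operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        return True
    return a < b


nan = math.nan
print(fcmp_olt(1.0, 2.0), fcmp_ult(1.0, 2.0))  # True True
print(fcmp_olt(nan, 2.0), fcmp_ult(nan, 2.0))  # False True
```

Note that ult means something different here than in icmp, where it is the unsigned integer comparison.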

cp-14 will introduce a tagged value type that handles both i64 and double, with a runtime dispatch on the tag bits.
