Step 5 — Lowering arithmetic and comparisons
Binary integer operations
The mapping is essentially one-to-one with TAC:
| TAC Op | LLVM instruction |
|---|---|
| Add | `add i64 a, b` |
| Sub | `sub i64 a, b` |
| Mul | `mul i64 a, b` |
| Div | `sdiv i64 a, b` |
| Mod | `srem i64 a, b` |
| And | `and i64 a, b` |
| Or | `or i64 a, b` |
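
Because the mapping is a plain lookup, the emitter for binary operations can be very small. The following is a minimal sketch in Python that produces textual IR; the names `BINOP_TO_LLVM` and `lower_binop` are hypothetical and chosen only for illustration, not taken from the actual MiniLang compiler.

```python
# Hypothetical TAC-op -> LLVM opcode table, mirroring the mapping above.
BINOP_TO_LLVM = {
    "Add": "add",
    "Sub": "sub",
    "Mul": "mul",
    "Div": "sdiv",  # signed division
    "Mod": "srem",  # signed remainder
    "And": "and",
    "Or": "or",
}


def lower_binop(op: str, dst: str, lhs: str, rhs: str) -> str:
    """Return one line of textual LLVM IR for a TAC binary operation."""
    opcode = BINOP_TO_LLVM[op]  # KeyError for ops this sketch does not cover
    return f"{dst} = {opcode} i64 {lhs}, {rhs}"


print(lower_binop("Add", "%v0", "%a", "%b"))  # %v0 = add i64 %a, %b
print(lower_binop("Mod", "%v1", "%v0", "7"))  # %v1 = srem i64 %v0, 7
```

Every operand and result is `i64`, so the type never needs to be looked up per instruction.
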
Signed vs unsigned: we use `sdiv` and `srem` (signed) because
MiniLang's Number is meant to behave as a signed value. The `s`/`u`/`f`
prefix on LLVM arithmetic is a frequent source of bugs:
- `add`: no prefix; signedness doesn't matter (two's complement).
- `mul`: no prefix; same reason.
- `sdiv` / `udiv`: different results for negative operands.
- `srem` / `urem`: likewise.
- `fadd` / `fmul` / `fdiv`: floating point.
- `shl`: no prefix; `lshr` (logical) / `ashr` (arithmetic) for right shift.
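
To see why the prefix matters for division but not for addition, it helps to simulate the 64-bit operations by hand. The sketch below models `add`, `sdiv` and `udiv` on raw 64-bit patterns in Python; the helper names are illustrative assumptions, and the cases LLVM leaves undefined (division by zero, and the most negative value divided by -1) are not modelled.

```python
MASK = (1 << 64) - 1  # keep only the low 64 bits


def to_signed(x: int) -> int:
    """Reinterpret a 64-bit pattern as a two's-complement signed value."""
    x &= MASK
    return x - (1 << 64) if x >> 63 else x


def add(a: int, b: int) -> int:
    """add: the result bits are the same for signed and unsigned operands."""
    return (a + b) & MASK


def udiv(a: int, b: int) -> int:
    """udiv: both operands are read as unsigned."""
    return (a & MASK) // (b & MASK)


def sdiv(a: int, b: int) -> int:
    """sdiv: operands are read as signed; the quotient rounds toward zero."""
    sa, sb = to_signed(a), to_signed(b)
    q = abs(sa) // abs(sb)
    if (sa < 0) != (sb < 0):
        q = -q
    return q & MASK


minus_six = -6 & MASK  # bit pattern 0xFFFFFFFFFFFFFFFA

print(to_signed(add(minus_six, 2)))   # -4
print(to_signed(sdiv(minus_six, 2)))  # -3
print(udiv(minus_six, 2))             # 9223372036854775805
```

The same bit pattern gives a sensible answer under `sdiv` and a huge positive number under `udiv`, which is the kind of bug the wrong prefix produces.
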
The nsw / nuw flags (no signed wrap / no unsigned wrap) on
arithmetic let the optimiser assume overflow is impossible. We don't
emit them — being conservative — but a real frontend should track this
from the source language's overflow semantics.
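
As a rough illustration of what tracking overflow semantics could look like, the sketch below attaches `nsw` only when the caller states that signed overflow is undefined in the source language. The function name `lower_add` and the `signed_overflow_is_ub` parameter are hypothetical; our lowering always takes the default and emits no flag.

```python
def lower_add(dst: str, lhs: str, rhs: str,
              signed_overflow_is_ub: bool = False) -> str:
    """Emit an i64 add, with nsw only when the source language permits it."""
    flags = " nsw" if signed_overflow_is_ub else ""
    return f"{dst} = add{flags} i64 {lhs}, {rhs}"


# Conservative default: wrapping add, always safe to emit.
print(lower_add("%v0", "%a", "%b"))
# %v0 = add i64 %a, %b

# Only valid if the source language makes signed overflow undefined.
print(lower_add("%v0", "%a", "%b", signed_overflow_is_ub=True))
# %v0 = add nsw i64 %a, %b
```
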
Unary
- `Neg` becomes `sub i64 0, %a`. There is no dedicated `neg` instruction.
- `Not` (boolean negation) becomes `icmp eq i64 %a, 0` followed by `zext i1 ... to i64`.
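
The two unary cases could be emitted along these lines. This is a minimal sketch: `lower_unary` and the scratch name `%t` for the intermediate `i1` are assumptions made for the example, and a real emitter would ask its temporary allocator for a fresh name instead.

```python
from typing import List


def lower_unary(op: str, dst: str, src: str, tmp: str = "%t") -> List[str]:
    """Return the IR lines for a TAC unary operation on an i64 value."""
    if op == "Neg":
        # No neg instruction exists, so subtract from zero.
        return [f"{dst} = sub i64 0, {src}"]
    if op == "Not":
        # Compare against zero (i1), then widen back to the uniform i64.
        return [
            f"{tmp} = icmp eq i64 {src}, 0",
            f"{dst} = zext i1 {tmp} to i64",
        ]
    raise ValueError(f"unknown unary op: {op}")


print("\n".join(lower_unary("Neg", "%v0", "%a")))
print("\n".join(lower_unary("Not", "%v1", "%a")))
```
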
Comparisons
%v0 = icmp slt i64 %a, %b ; signed less-than
%v1 = zext i1 %v0 to i64
icmp returns i1. To use the result as our uniform i64 value, we
zext (zero-extend) to i64. If we stored booleans as i1
throughout we wouldn't need the zext — but every other operation
would then need to widen back to i64 for arithmetic.
The condition mnemonics:
| TAC | LLVM icmp cond |
|---|---|
| Eq | `eq` |
| Ne | `ne` |
| Lt | `slt` |
| Le | `sle` |
| Gt | `sgt` |
| Ge | `sge` |
The s prefix is for signed comparison. ult, ule, etc. are
unsigned. eq and ne don't have a sign because they don't need
one — bitwise equality is the same either way.
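
Putting the condition table and the `zext` step together, a comparison lowers to exactly two instructions. The sketch below shows one way to emit them; `ICMP_COND`, `TempNames` and `lower_compare` are hypothetical names introduced for this example.

```python
import itertools
from typing import List, Tuple

# TAC comparison -> icmp condition code (signed where it matters).
ICMP_COND = {
    "Eq": "eq", "Ne": "ne",
    "Lt": "slt", "Le": "sle",
    "Gt": "sgt", "Ge": "sge",
}


class TempNames:
    """Hands out fresh SSA names: %v0, %v1, ..."""

    def __init__(self) -> None:
        self._ids = itertools.count()

    def fresh(self) -> str:
        return f"%v{next(self._ids)}"


def lower_compare(op: str, lhs: str, rhs: str,
                  names: TempNames) -> Tuple[str, List[str]]:
    """Return (name of the i64 result, IR lines) for a TAC comparison."""
    cond = ICMP_COND[op]
    flag = names.fresh()  # i1 produced by icmp
    wide = names.fresh()  # i64 produced by zext
    lines = [
        f"{flag} = icmp {cond} i64 {lhs}, {rhs}",
        f"{wide} = zext i1 {flag} to i64",
    ]
    return wide, lines


result, lines = lower_compare("Lt", "%a", "%b", TempNames())
print("\n".join(lines))
```

The caller gets back the name of the widened `i64`, so the rest of the lowering never sees an `i1`.
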
The zext / trunc dance
`icmp` always produces `i1`. Stores and arithmetic always want
`i64`. Branching on a value always wants `i1` again.
%v0 = icmp slt i64 %a, %b ; i1
%v1 = zext i1 %v0 to i64 ; i64
; ... later, used as a branch condition: ...
%v2 = icmp ne i64 %v1, 0 ; back to i1
br i1 %v2, label %T, label %F
This back-and-forth is what you pay for using i64 as the uniform
value type. LLVM's instcombine cleans most of it up:
icmp ne (zext T to i64), 0 → T
So after opt -O1 the i64 round-trip vanishes entirely.
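
The same fold can be approximated in the frontend if it remembers which `i64` values came straight from a `zext`. The following is only a sketch under that assumption: `lower_branch` and the `zext_source` map are hypothetical, and the shortcut applies only when the branch uses the SSA value directly, not when the value has been stored to and reloaded from memory.

```python
from typing import Dict, List


def lower_branch(cond: str, true_label: str, false_label: str,
                 zext_source: Dict[str, str], tmp: str = "%c") -> List[str]:
    """Branch on an i64 value, reusing the original i1 when it is known.

    zext_source maps an i64 name to the i1 it was zero-extended from.
    """
    flag = zext_source.get(cond)
    if flag is not None:
        # icmp ne (zext T to i64), 0  ->  T
        return [f"br i1 {flag}, label %{true_label}, label %{false_label}"]
    # General case: narrow the i64 back to i1 first.
    return [
        f"{tmp} = icmp ne i64 {cond}, 0",
        f"br i1 {tmp}, label %{true_label}, label %{false_label}",
    ]


zext_source = {"%v1": "%v0"}  # %v1 = zext i1 %v0 to i64
print("\n".join(lower_branch("%v1", "T", "F", zext_source)))
print("\n".join(lower_branch("%x", "T", "F", zext_source)))
```

Leaving the cleanup to `instcombine` keeps the emitter simpler, which is why we don't do this ourselves.
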
Why we don't use fadd / fmul
MiniLang numbers are doubles in the interpreter, but we lower to
i64 for simplicity. To handle floats properly:
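- Pick `double` as the uniform type instead of `i64`.
- Replace `add` → `fadd`, `sdiv` → `fdiv`, `icmp` → `fcmp`.
- `fcmp` predicates have an ordered/unordered distinction (`oeq`, `ueq`, `olt`, `ult`, ...) because of NaN: an ordered predicate is false whenever either operand is NaN, and an unordered predicate is true.
- Print with `%g` or `%lf` format.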
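
The ordered/unordered distinction is easy to get wrong, so here is a small Python model of a subset of the `fcmp` predicates. The function `fcmp` below is an illustrative simulation on Python floats, not an LLVM API, and it leaves out `ord`, `uno`, `true` and `false`.

```python
import math


def fcmp(pred: str, a: float, b: float) -> bool:
    """Evaluate an fcmp predicate such as 'oeq' or 'ult' on two floats."""
    kind, relation = pred[0], pred[1:]
    relations = {
        "eq": a == b, "ne": a != b,
        "lt": a < b, "le": a <= b,
        "gt": a > b, "ge": a >= b,
    }
    if kind not in ("o", "u") or relation not in relations:
        raise ValueError(f"unsupported predicate: {pred}")
    unordered = math.isnan(a) or math.isnan(b)
    if kind == "o":
        # Ordered: neither operand is NaN and the relation holds.
        return (not unordered) and relations[relation]
    # Unordered: either operand is NaN or the relation holds.
    return unordered or relations[relation]


nan = math.nan
print(fcmp("oeq", nan, nan))  # False
print(fcmp("ueq", nan, nan))  # True
print(fcmp("olt", 1.0, 2.0))  # True
print(fcmp("ult", 1.0, nan))  # True
```

For a language where any comparison involving NaN should be false, the ordered forms (`oeq`, `olt`, ...) are the natural choice, with `une` for inequality.
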
cp-14 will introduce a tagged value type that handles both i64 and double, with a runtime dispatch on the tag bits.