Step 2 — Operands and instructions
Operand
struct Operand {
enum class Kind { None, Temp, Constant, Named };
Kind kind = Kind::None;
int tempId = -1; // Temp
Value constVal; // Constant
std::string name; // Named (stored without the leading % sigil)
// factories: none(), temp(id), constant(v), named(name)
};
We use a single struct with a Kind tag rather than std::variant to
keep the fields flat and (more importantly) easy to print in a
debugger. The struct is not trivially copyable, since name is a
std::string; the benefit is readability, not layout. When you're
chasing an IR bug at 1 a.m. you want p ins.srcs[0] to show named
fields, not a variant index.
tempId, constVal, and name are independent fields; at most one is
meaningful for any given Kind (none of them for Kind::None). The
factory functions mirror that:
Operand::temp(3); // t3
Operand::constant(Value::makeInt(42)); // immediate
Operand::named("x"); // %x
Operand::none(); // placeholder
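To make the factories concrete, here is a minimal sketch of how they and a printer could be written. The Value stand-in (integers only), the toString helper, and the "_" spelling for None are assumptions made for this example, not the project's actual definitions; the t3 and %x spellings follow the comments above.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the project's Value type; only integers are modeled.
struct Value {
    long long i = 0;
    static Value makeInt(long long v) { Value out; out.i = v; return out; }
};

struct Operand {
    enum class Kind { None, Temp, Constant, Named };
    Kind kind = Kind::None;
    int tempId = -1;      // Temp
    Value constVal;       // Constant
    std::string name;     // Named, stored without the % sigil

    // Each factory sets the tag and exactly one payload field.
    static Operand none() { return Operand{}; }
    static Operand temp(int id) {
        assert(id >= 0 && "temp ids are non-negative");
        Operand o; o.kind = Kind::Temp; o.tempId = id; return o;
    }
    static Operand constant(Value v) {
        Operand o; o.kind = Kind::Constant; o.constVal = v; return o;
    }
    static Operand named(std::string n) {
        Operand o; o.kind = Kind::Named; o.name = std::move(n); return o;
    }
};

// Assumed textual form: t<N> for temps, %name for named operands,
// the bare integer for constants, and "_" for None.
std::string toString(const Operand& o) {
    switch (o.kind) {
        case Operand::Kind::Temp:     return "t" + std::to_string(o.tempId);
        case Operand::Kind::Constant: return std::to_string(o.constVal.i);
        case Operand::Kind::Named:    return "%" + o.name;
        case Operand::Kind::None:     return "_";
    }
    return "?";  // unreachable for valid tags
}
```

Because the tag and the payload live side by side, the printer is one switch with no visitor machinery, which is the debugger-friendliness argument in code form.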
Op — the opcode enum
| Group | Opcodes |
|---|---|
| arithmetic | Add Sub Mul Div Mod Neg |
| comparison | Eq Ne Lt Le Gt Ge |
| logical | Not (and/or are lowered, not opcodes) |
| move/load | Move LoadGlobal StoreGlobal |
| control | Jump CondJump Return |
| effects | Print Call |
Notable design choices:
- No And/Or opcode. Short-circuit semantics demand control flow; we lower them to CondJump (see step 6).
- Move rather than Copy. Same idea as RISC-V or MIPS pseudo-ops: one instruction that says "write the source into the destination, unchanged." The mem2reg pass in cp-09 will eliminate most of these.
- Call is a regular instruction. It has a destination temp (for the return value), an opcode-level callee name in ins.name for direct calls, and operands [callee, arg0, arg1, ...]. Indirect calls (cp-12 closures) will store `<indirect>` in the name and use srcs[0] for the callee operand.
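The Call layout described above can be sketched as a pair of builder functions. The names makeDirectCall, makeIndirectCall, and isIndirectCall are hypothetical, the Operand and Instr types are trimmed to the fields this example needs, and filling the callee slot of a direct call with a Named operand is an assumption rather than something the text specifies.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

enum class Op { Move, Call };  // remaining opcodes omitted

// Trimmed Operand: only temps and named operands are needed here.
struct Operand {
    enum class Kind { None, Temp, Named };
    Kind kind = Kind::None;
    int tempId = -1;
    std::string name;
    static Operand temp(int id) {
        Operand o; o.kind = Kind::Temp; o.tempId = id; return o;
    }
    static Operand named(std::string n) {
        Operand o; o.kind = Kind::Named; o.name = std::move(n); return o;
    }
};

struct Instr {
    Op op = Op::Move;
    Operand dst;
    std::vector<Operand> srcs;
    std::string name;
};

// Direct call: callee name goes in ins.name; srcs = [callee, arg0, arg1, ...].
// Assumption: the callee slot holds a Named operand with the same name.
Instr makeDirectCall(Operand dst, const std::string& callee,
                     const std::vector<Operand>& args) {
    assert(!callee.empty() && "direct calls need a callee name");
    Instr ins;
    ins.op = Op::Call;
    ins.dst = std::move(dst);
    ins.name = callee;
    ins.srcs.push_back(Operand::named(callee));
    ins.srcs.insert(ins.srcs.end(), args.begin(), args.end());
    return ins;
}

// Indirect call: the name is the "<indirect>" marker and srcs[0] is the
// operand that holds the callee value.
Instr makeIndirectCall(Operand dst, Operand calleeValue,
                       const std::vector<Operand>& args) {
    Instr ins;
    ins.op = Op::Call;
    ins.dst = std::move(dst);
    ins.name = "<indirect>";
    ins.srcs.push_back(std::move(calleeValue));
    ins.srcs.insert(ins.srcs.end(), args.begin(), args.end());
    return ins;
}

bool isIndirectCall(const Instr& ins) {
    return ins.op == Op::Call && ins.name == "<indirect>";
}
```

Either way the arguments start at srcs[1], so a pass that only cares about arguments can treat both forms identically.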
Instr
struct Instr {
Op op;
Operand dst; // None if the op produces no value
std::vector<Operand> srcs; // 0..N source operands
std::string name; // global name (LoadGlobal/StoreGlobal) or callee name (Call)
int bbT = -1; // Jump target / CondJump true target
int bbF = -1; // CondJump false target
int line = 0; // source line for diagnostics
};
One struct fits all instruction kinds. The alternative — a
discriminated hierarchy with AddInstr, JumpInstr, CallInstr, ... —
is dogmatically purer, but cripplingly painful to walk in passes. Every
pass would need a giant visitor or a type-switch. A flat struct lets
passes loop over instrs and switch on ins.op.
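As an illustration of that loop-and-switch shape, here is a small sketch of a pass that collects the successor block ids of an instruction list. The function name successorBlocks is hypothetical, and Instr is trimmed to the three fields the pass reads.

```cpp
#include <cassert>
#include <vector>

enum class Op { Add, Move, Jump, CondJump, Return };  // subset of the opcode table

// Trimmed Instr: only the fields this pass reads.
struct Instr {
    Op op = Op::Move;
    int bbT = -1;  // Jump target / CondJump true target
    int bbF = -1;  // CondJump false target
};

// Walks the instructions once and gathers every block id that control
// can transfer to. Non-control opcodes fall through the default case.
std::vector<int> successorBlocks(const std::vector<Instr>& instrs) {
    std::vector<int> out;
    for (const Instr& ins : instrs) {
        switch (ins.op) {
            case Op::Jump:
                assert(ins.bbT >= 0 && "Jump needs a target");
                out.push_back(ins.bbT);
                break;
            case Op::CondJump:
                assert(ins.bbT >= 0 && ins.bbF >= 0 && "CondJump needs two targets");
                out.push_back(ins.bbT);
                out.push_back(ins.bbF);
                break;
            default:
                break;  // arithmetic, moves, Return: no successors
        }
    }
    return out;
}
```

The pass never needs to know which fields an opcode leaves unused; it reads bbT and bbF only in the cases where they are meaningful.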
The cost: each instruction carries fields it doesn't use (bbT and bbF
on an Add, name on a Jump). For TAC at this scale that's a
sub-megabyte overhead even for large programs, and it's close to the
shape MLIR uses (a generic Operation with attributes, results,
operands, successors). We're in good company with this design.
Why constants are inline operands
In some IRs (notably LLVM) constants are first-class Values, distinct
from instructions. We took the simpler route: a constant is just an
Operand::Constant, printed inline. Pros: trivial printer, no constant
pool to manage. Cons: you can't dyn_cast a constant the way you can
in LLVM. For a teaching IR that's the right trade.
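In practice the missing dyn_cast is replaced by a check on the Kind tag. The sketch below shows what that could look like in a tiny constant-folding helper. The Value stand-in (integers only) and the names isConst and foldBinary are assumptions for this example, and integer overflow is ignored.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the project's Value type; only integers are modeled.
struct Value {
    long long i = 0;
    static Value makeInt(long long v) { Value out; out.i = v; return out; }
};

// Trimmed Operand: temps and constants only.
struct Operand {
    enum class Kind { None, Temp, Constant };
    Kind kind = Kind::None;
    int tempId = -1;
    Value constVal;
    static Operand temp(int id) {
        assert(id >= 0 && "temp ids are non-negative");
        Operand o; o.kind = Kind::Temp; o.tempId = id; return o;
    }
    static Operand constant(Value v) {
        Operand o; o.kind = Kind::Constant; o.constVal = v; return o;
    }
};

enum class Op { Add, Mul, Move };  // subset of the opcode table

struct Instr {
    Op op = Op::Move;
    Operand dst;
    std::vector<Operand> srcs;
};

// The tag check that stands in for LLVM-style dyn_cast.
bool isConst(const Operand& o) { return o.kind == Operand::Kind::Constant; }

// Rewrites `dst = c1 op c2` into `dst = Move c3` in place.
// Returns true if the instruction was folded.
bool foldBinary(Instr& ins) {
    if (ins.srcs.size() != 2 || !isConst(ins.srcs[0]) || !isConst(ins.srcs[1])) {
        return false;
    }
    const long long a = ins.srcs[0].constVal.i;
    const long long b = ins.srcs[1].constVal.i;
    long long result = 0;
    switch (ins.op) {
        case Op::Add: result = a + b; break;
        case Op::Mul: result = a * b; break;
        default: return false;  // not a foldable opcode in this sketch
    }
    ins.op = Op::Move;
    ins.srcs = { Operand::constant(Value::makeInt(result)) };
    return true;
}
```

There is no constant pool to update after the fold: the new constant simply replaces the two old operands inside the instruction.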