03 — Builders, Insertion Points, and Op Creation
In LLVM you have IRBuilder<>. In MLIR you have OpBuilder. In cp-18 you
have mlf::Builder. All three serve the same purpose: encapsulate the
"where am I currently inserting?" cursor so op-construction calls can stay
short.
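mlf::Builder b;
b.setInsertionPointToEnd(funcBody);  // append to the function body
// Two constants; create returns the Op*, result(0) is its first Value*.
Value* lhs = b.create("tiny.const", {}, {i64Ty}, {{"value", Attr::integer(6)}})
                 ->result(0);
Value* rhs = b.create("tiny.const", {}, {i64Ty}, {{"value", Attr::integer(7)}})
                 ->result(0);
// Multiply them: operands {lhs, rhs}, one i64 result, no attributes.
b.create("tiny.mul", {lhs, rhs}, {i64Ty});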
Each create allocates an Op, populates its operands/results/attributes,
splices it into the current block at the insertion point, and returns a
raw pointer (ownership rests with the region's opStore).
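To make those four steps concrete, here is a minimal sketch of what a create call could look like. It is not cp-18's actual implementation: the Op, Value, Block, and Builder types below are toy stand-ins, the block-level store stands in for the region's opStore, and result types and attributes are reduced to a result count.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Op;

// One SSA result; def points back at the op that produces it.
struct Value {
    Op* def = nullptr;
};

struct Op {
    std::string name;
    std::vector<Value*> operands;
    std::vector<std::unique_ptr<Value>> results;
    Value* result(std::size_t i) { return results.at(i).get(); }
};

// Toy block (assumption): store owns the ops, order records their sequence.
struct Block {
    std::vector<std::unique_ptr<Op>> store;
    std::list<Op*> order;
};

class Builder {
public:
    void setInsertionPointToEnd(Block* b) {
        block_ = b;
        point_ = b->order.end();
    }

    Op* create(const std::string& name, std::vector<Value*> operands,
               std::size_t numResults) {
        assert(block_ && "set an insertion point first");
        auto op = std::make_unique<Op>();              // 1. allocate
        op->name = name;                               // 2. populate
        op->operands = std::move(operands);
        for (std::size_t i = 0; i < numResults; ++i) {
            op->results.push_back(std::make_unique<Value>());
            op->results.back()->def = op.get();
        }
        Op* raw = op.get();
        block_->store.push_back(std::move(op));        // the store keeps ownership
        block_->order.insert(point_, raw);             // 3. splice before the cursor
        return raw;                                    // 4. hand back a raw pointer
    }

private:
    Block* block_ = nullptr;
    std::list<Op*>::iterator point_;
};
```

Because the caller only ever receives a raw pointer, nothing at the call site has to think about lifetimes; the op lives as long as the store that owns it.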
Three insertion modes
- setInsertionPointToStart(block): prepend new ops.
- setInsertionPointToEnd(block): append new ops (the common case).
- setInsertionPointBefore(op): insert immediately before a known op (the constant folder uses this to splice in the folded const).
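One way to picture the three modes is as three ways of positioning a single list iterator. The sketch below assumes a toy block where an op is just its name; the Cursor class and its insert method are hypothetical, and unlike the real setInsertionPointBefore(op) the toy version also takes the block, because a bare string does not know its parent.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>

using Block = std::list<std::string>;  // toy: an op is just its name

class Cursor {
public:
    void setInsertionPointToStart(Block& b) { block_ = &b; point_ = b.begin(); }
    void setInsertionPointToEnd(Block& b)   { block_ = &b; point_ = b.end(); }

    // Falls back to the end of the block if the op is not found.
    void setInsertionPointBefore(Block& b, const std::string& op) {
        block_ = &b;
        point_ = std::find(b.begin(), b.end(), op);
    }

    // Inserts before the cursor. The cursor keeps pointing at the same
    // element, so successive inserts land in creation order.
    void insert(const std::string& op) {
        assert(block_ && "set an insertion point first");
        block_->insert(point_, op);
    }

private:
    Block* block_ = nullptr;
    Block::iterator point_;
};
```

Starting from a block holding a and c, inserting b before c, then x and y at the start, then z at the end leaves the block as x, y, a, b, c, z. All three modes reduce to "insert before the iterator", which is why one cursor is enough.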
Why insertion-point APIs and not "just append"?
Because rewriters need to insert in the middle. The constant folder finds
a tiny.add op, computes the folded result, and emits a new tiny.const
right before the old add. That fresh const has to land immediately before
the add, not at the end of the block, where it would be defined after the
ops that use it.
Builder bld;
bld.setInsertionPointBefore(op);                                        // point
Op* foldedConst = bld.create("tiny.const", {}, ..., {{"value", ...}});  // create
replaceAllUses(moduleOp, op->result(0), foldedConst->result(0));        // replace
op->parent->eraseOp(op);                                                // erase
This four-step pattern (point, create, replace, erase) is the entire shape of rewrite-based optimisation.
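The same pattern can be run end to end on a toy IR. The sketch below is an illustration under stated assumptions, not cp-18's folder: Op, Value, Block, makeOp, and foldAdds are hypothetical names, every op has exactly one result, the constant lives in a plain integer field instead of an attribute, and use replacement scans a single block instead of a whole module.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <list>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Op;
struct Value { Op* def = nullptr; };

struct Op {
    std::string name;
    std::vector<Value*> operands;
    std::int64_t constant = 0;  // payload, meaningful for tiny.const only
    Value result;               // toy: exactly one result per op
};

using Block = std::list<std::unique_ptr<Op>>;  // the block owns its ops, in order

std::unique_ptr<Op> makeOp(std::string name, std::vector<Value*> operands = {},
                           std::int64_t constant = 0) {
    auto op = std::make_unique<Op>();
    op->name = std::move(name);
    op->operands = std::move(operands);
    op->constant = constant;
    op->result.def = op.get();
    return op;
}

// Folds every tiny.add whose operands are both tiny.const.
// Returns the number of adds folded.
int foldAdds(Block& block) {
    int folded = 0;
    for (auto it = block.begin(); it != block.end();) {
        Op* op = it->get();
        bool foldable = op->name == "tiny.add" && op->operands.size() == 2 &&
                        op->operands[0]->def->name == "tiny.const" &&
                        op->operands[1]->def->name == "tiny.const";
        if (!foldable) { ++it; continue; }

        // point + create: the new const goes immediately before the add
        auto c = makeOp("tiny.const", {},
                        op->operands[0]->def->constant +
                        op->operands[1]->def->constant);
        Value* replacement = &c->result;
        block.insert(it, std::move(c));

        // replace: redirect every use of the add's result
        for (auto& user : block)
            for (Value*& operand : user->operands)
                if (operand == &op->result) operand = replacement;

        // erase: drop the add; erase returns the next position
        it = block.erase(it);
        ++folded;
    }
    return folded;
}
```

Because users always come after their definitions within the block, a single forward pass also folds chains: once an add has become a const, a later add that consumes it is foldable by the time the loop reaches it.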
SSA name management
Every op result gets a name like %0, %1, %2 from a counter in the
builder. The counter is per-builder, which means a fresh builder gives
fresh names — useful for nested function bodies. The names are only for
printing; the IR's identity is the Value* pointer.
Real MLIR treats names the same way and goes one step further: values
carry no stored name at all. SSA names in textual IR are assigned at
print time by an AsmState that walks the op tree handing out fresh names.
The in-memory IR relies on pointer identity alone.
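A small sketch shows why print-time naming and pointer identity fit together. This is a toy in the spirit of that design, not MLIR's AsmState API: Value, Op, and print are hypothetical, each op has one result, and the printer expects every operand to have been defined by an earlier op in the list.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

struct Value {};  // no name stored: identity is the address

struct Op {
    std::string name;
    std::vector<const Value*> operands;
    Value result;  // toy: exactly one result per op
};

// Assigns %0, %1, ... in walk order and prints one line per op.
// The name table lives only for the duration of the call.
std::string print(const std::vector<const Op*>& ops) {
    std::unordered_map<const Value*, std::string> names;
    std::string out;
    for (const Op* op : ops) {
        std::string self = "%" + std::to_string(names.size());
        names.emplace(&op->result, self);
        out += self + " = " + op->name;
        for (std::size_t i = 0; i < op->operands.size(); ++i)
            out += (i ? ", " : " ") + names.at(op->operands[i]);
        out += "\n";
    }
    return out;
}
```

Printing the same three ops in a different order yields different names for the same values, while the operand pointers never change. The names are a property of one particular printout, not of the IR.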
Op result vs value
A subtle but important distinction:
- Op* is the operation: the thing with a name, attributes, regions.
- Value* is one of its results: what an operand points at.
op->result(0) returns the first result Value. You almost always pass
Value* (not Op*) into other ops' operand lists. cp-18's API forces
this: create takes vector<Value*> for operands.
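The distinction matters most when an op has more than one result, because an Op* alone cannot say which result a consumer wants. The sketch below uses hypothetical toy types (not cp-18's definitions) in which each Value records its producer and its result index, and a static_assert shows that the compiler rejects an Op* where a Value* is expected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

struct Op;

struct Value {
    Op* def;            // the op that produces this value
    std::size_t index;  // which of that op's results it is
};

struct Op {
    std::string name;
    std::vector<Value*> operands;  // operands are values, never ops
    std::vector<Value> results;    // filled once in the constructor

    Op(std::string n, std::vector<Value*> in, std::size_t numResults)
        : name(std::move(n)), operands(std::move(in)) {
        results.reserve(numResults);
        for (std::size_t i = 0; i < numResults; ++i)
            results.push_back(Value{this, i});
    }
    Op(const Op&) = delete;  // copying would leave def pointing at the original

    Value* result(std::size_t i) { return &results.at(i); }
};

// The two types are unrelated, so an operand list of Value* cannot
// silently accept an Op*.
static_assert(!std::is_convertible<Op*, Value*>::value,
              "an Op* cannot stand in for a Value*");
```

With a two-result producer, a consumer built from producer.result(1) holds a pointer whose def is the producer and whose index is 1; there is no way to express "the producer" as an operand without picking a result.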
What we left out
Real MLIR's OpBuilder also tracks:
- A Listener for rewrites (so pattern drivers can be notified of changes).
- A Location attribute attached to every created op (for diagnostics).
- Type inference via op interfaces (SameOperandsAndResultType, etc.).
All are nice to have, but none changes the picture. The core abstraction is the insertion point.