03 — Builders, Insertion Points, and Op Creation

In LLVM you have IRBuilder<>. In MLIR you have OpBuilder. In cp-18 you have mlf::Builder. All three serve the same purpose: encapsulate the "where am I currently inserting?" cursor so op-construction calls can stay short.

mlf::Builder b;
b.setInsertionPointToEnd(funcBody);
Value* lhs = b.create("tiny.const", {}, {i64Ty}, {{"value", Attr::integer(6)}})
             ->result(0);
Value* rhs = b.create("tiny.const", {}, {i64Ty}, {{"value", Attr::integer(7)}})
             ->result(0);
b.create("tiny.mul", {lhs, rhs}, {i64Ty});

Each create allocates an Op, populates its operands/results/attributes, splices it into the current block at the insertion point, and returns a raw pointer (ownership rests with the region's opStore).
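The four steps inside create can be sketched in a few dozen lines. This is a minimal sketch and not cp-18's actual code: ToyOp, ToyValue, ToyBlock and ToyBuilder are hypothetical stand-ins, attributes and types are omitted, and the owning opStore is placed on the block rather than the region to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; not cp-18's real definitions.
struct ToyOp;
struct ToyValue { ToyOp* owner; };  // one result of an op

struct ToyOp {
    std::string name;
    std::vector<ToyValue*> operands;
    std::vector<std::unique_ptr<ToyValue>> results;
    ToyValue* result(std::size_t i) { return results.at(i).get(); }
};

struct ToyBlock {
    std::vector<std::unique_ptr<ToyOp>> opStore;  // owns every op (simplification)
    std::list<ToyOp*> order;                      // program order, non-owning
};

class ToyBuilder {
    ToyBlock* block_ = nullptr;
    std::list<ToyOp*>::iterator point_;  // the insertion cursor
public:
    void setInsertionPointToEnd(ToyBlock* b) {
        block_ = b;
        point_ = b->order.end();
    }

    ToyOp* create(const std::string& name, std::vector<ToyValue*> operands,
                  std::size_t numResults) {
        assert(block_ && "set an insertion point first");
        auto op = std::make_unique<ToyOp>();       // 1. allocate
        op->name = name;                           // 2. populate
        op->operands = std::move(operands);
        for (std::size_t i = 0; i < numResults; ++i) {
            op->results.push_back(std::make_unique<ToyValue>(ToyValue{op.get()}));
        }
        ToyOp* raw = op.get();
        block_->order.insert(point_, raw);         // 3. splice in at the cursor
        block_->opStore.push_back(std::move(op));  //    the store keeps it alive
        return raw;                                // 4. hand back a non-owning pointer
    }
};

// Builds const, const, mul and reports how many ops landed in the block.
std::size_t demoCreate() {
    ToyBlock body;
    ToyBuilder b;
    b.setInsertionPointToEnd(&body);
    ToyValue* lhs = b.create("tiny.const", {}, 1)->result(0);
    ToyValue* rhs = b.create("tiny.const", {}, 1)->result(0);
    b.create("tiny.mul", {lhs, rhs}, 1);
    return body.order.size();
}
```

Calling demoCreate() leaves three ops in the block. The caller never deletes anything: the raw pointers stay valid for as long as the owning store does.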

Three insertion modes

  • setInsertionPointToStart(block) — prepend new ops.
  • setInsertionPointToEnd(block) — append new ops (the common case).
  • setInsertionPointBefore(op) — insert immediately before a known op (the constant folder uses this to splice the folded const).
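To see how the three modes differ, here is a small sketch that reduces a block to a list of op names. Cursor and OpList are hypothetical names, and setInsertionPointBefore takes the block as an extra argument only because these toy ops carry no parent pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>

using OpList = std::list<std::string>;  // a block, reduced to op names

// Hypothetical cursor type illustrating the three insertion modes.
class Cursor {
    OpList* block_ = nullptr;
    OpList::iterator point_;
public:
    void setInsertionPointToStart(OpList& b) { block_ = &b; point_ = b.begin(); }
    void setInsertionPointToEnd(OpList& b)   { block_ = &b; point_ = b.end(); }
    void setInsertionPointBefore(OpList& b, const std::string& op) {
        block_ = &b;
        point_ = std::find(b.begin(), b.end(), op);
        assert(point_ != b.end() && "op not found in block");
    }
    void create(const std::string& name) {
        assert(block_ && "set an insertion point first");
        // std::list::insert places the new element before point_ and
        // leaves point_ valid, so consecutive creates keep their order.
        block_->insert(point_, name);
    }
};

std::string demoModes() {
    OpList block = {"a", "b"};
    Cursor c;
    c.setInsertionPointToEnd(block);
    c.create("end");
    c.setInsertionPointToStart(block);
    c.create("s1");
    c.create("s2");
    c.setInsertionPointBefore(block, "b");
    c.create("mid");
    std::string out;
    for (const auto& n : block) out += n + " ";
    return out;  // "s1 s2 a mid b end "
}
```

Note that two creates after setInsertionPointToStart come out as s1 then s2, not reversed: the cursor stays pinned in front of the op that was first when the point was set.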

Why insertion-point APIs and not "just append"?

Because rewriters need to insert in the middle. The constant folder finds a tiny.add op, computes the folded result, and emits a new tiny.const right before the old add. The new const has to be defined ahead of every op that used the add's result; appended at the end of the block, it would come after its own users.

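Builder bld;
bld.setInsertionPointBefore(op);                                        // point: cursor sits just before the old op
Op* foldedConst = bld.create("tiny.const", {}, ..., {{"value", ...}});  // create: the folded constant
replaceAllUses(moduleOp, op->result(0), foldedConst->result(0));        // replace: redirect every user
op->parent->eraseOp(op);                                                // erase: the old op is now dead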

This four-step pattern (point, create, replace, erase) is the entire shape of rewrite-based optimisation.
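A complete, compilable version of the fold helps make the pattern concrete. The sketch below is written under assumptions and is not cp-18's implementation: the types are hypothetical, the builder is collapsed into a Block::insertBefore helper, constants are stored in a plain integer field instead of an attribute, and tiny.print is an invented consumer op used only to show the use being redirected.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <list>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Op;
struct Value { Op* def; };  // a result, pointing back at its defining op

struct Op {
    std::string name;
    std::vector<Value*> operands;
    std::int64_t constant = 0;  // payload, meaningful for tiny.const only
    Value res{this};            // single result, for brevity
};

struct Block {
    std::vector<std::unique_ptr<Op>> store;  // owns the ops
    std::list<Op*> order;                    // program order

    // pos == nullptr appends; otherwise the new op lands right before pos.
    Op* insertBefore(Op* pos, const std::string& name,
                     std::vector<Value*> operands, std::int64_t c = 0) {
        auto it = pos ? std::find(order.begin(), order.end(), pos) : order.end();
        store.push_back(std::make_unique<Op>());
        Op* op = store.back().get();
        op->name = name;
        op->operands = std::move(operands);
        op->constant = c;
        order.insert(it, op);
        return op;
    }
    void replaceAllUses(Value* from, Value* to) {
        for (Op* op : order)
            for (Value*& v : op->operands)
                if (v == from) v = to;
    }
    void eraseOp(Op* op) {
        order.remove(op);
        store.erase(std::remove_if(store.begin(), store.end(),
                        [op](const std::unique_ptr<Op>& p) { return p.get() == op; }),
                    store.end());
    }
};

// Folds tiny.add(const, const) into a single tiny.const.
bool foldAdd(Block& block, Op* op) {
    if (op->name != "tiny.add" || op->operands.size() != 2) return false;
    Op* a = op->operands[0]->def;
    Op* b = op->operands[1]->def;
    if (a->name != "tiny.const" || b->name != "tiny.const") return false;
    Op* folded = block.insertBefore(op, "tiny.const", {},
                                    a->constant + b->constant);  // point + create
    block.replaceAllUses(&op->res, &folded->res);                // replace
    block.eraseOp(op);                                           // erase
    return true;
}

std::string demoFold() {
    Block blk;
    Op* c1 = blk.insertBefore(nullptr, "tiny.const", {}, 6);
    Op* c2 = blk.insertBefore(nullptr, "tiny.const", {}, 7);
    Op* add = blk.insertBefore(nullptr, "tiny.add", {&c1->res, &c2->res});
    Op* use = blk.insertBefore(nullptr, "tiny.print", {&add->res});
    foldAdd(blk, add);
    std::string out;
    for (Op* op : blk.order) {
        out += op->name;
        if (op->name == "tiny.const") out += "(" + std::to_string(op->constant) + ")";
        out += ";";
    }
    out += "print<-" + std::to_string(use->operands[0]->def->constant);
    return out;
}
```

After the fold, the block reads const(6), const(7), const(13), print, and the print op's operand is the new constant. The two original constants are left in place; removing them once they have no users would be a separate dead-code pass.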

SSA name management

Every op result gets a name like %0, %1, %2 from a counter in the builder. The counter is per-builder, so a fresh builder starts its own numbering, which is useful for nested function bodies. The names exist only for printing; the IR's identity is the Value* pointer.

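Real MLIR takes the same view and goes one step further: SSA names are not stored on values at all. The names in textual IR are assigned at print time, when the printer (through an AsmState) walks the op tree and numbers values as it meets them. The in-memory IR uses pointer identity.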
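The print-time approach can be illustrated with a tiny printer that numbers values as it walks. This is only a sketch of the idea, not MLIR's AsmState code; Val, Operation and printOps are hypothetical names, and each op has exactly one result.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Val {};  // carries no name; its address is its identity

struct Operation {
    std::string name;
    std::vector<const Val*> operands;
    Val result;
};

// Assigns %0, %1, ... in walk order. Nothing is stored back into the IR.
std::string printOps(const std::vector<const Operation*>& ops) {
    std::map<const Val*, int> names;  // pointer -> printed number
    std::string out;
    for (const Operation* op : ops) {
        int id = static_cast<int>(names.size());
        names[&op->result] = id;
        out += "%" + std::to_string(id) + " = " + op->name;
        for (const Val* v : op->operands) {
            out += " %" + std::to_string(names.at(v));  // operand must be defined earlier
        }
        out += "\n";
    }
    return out;
}

std::string demoNames() {
    Operation a{"tiny.const", {}, {}};
    Operation b{"tiny.const", {}, {}};
    Operation m{"tiny.mul", {&a.result, &b.result}, {}};
    return printOps({&a, &b, &m});
}
```

Printing the same ops after a rewrite has erased or inserted something simply yields a new, gap-free numbering, which is why nothing in the IR needs to be renamed when ops move.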

Op result vs value

A subtle but important distinction:

  • Op* is the operation — the thing with a name, attributes, regions.
  • Value* is one of its results — what an operand points at.

op->result(0) returns the first result Value. You almost always pass Value* (not Op*) into other ops' operand lists. cp-18's API forces this: create takes vector<Value*> for operands.
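The distinction matters most once an op has more than one result, because an operand then has to say which result it means. The sketch below uses hypothetical types and two invented ops (tiny.divmod with two results, tiny.neg with one) to show an operand list built from Value* and a compile-time check that an Op* cannot be passed in its place.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

struct Op;
struct Value {
    Op* definingOp;  // which op produced this value
    unsigned index;  // which of that op's results it is
};

struct Op {
    std::string name;
    std::vector<Value*> operands;  // operands are values, never ops
    std::vector<Value> results;
    Value* result(unsigned i) { return &results.at(i); }
};

// The two pointer types are unrelated, so the compiler rejects a mix-up.
static_assert(!std::is_convertible<Op*, Value*>::value,
              "an Op* cannot stand in for a Value*");

std::unique_ptr<Op> makeOp(std::string name, std::vector<Value*> operands,
                           unsigned numResults) {
    auto op = std::make_unique<Op>();
    op->name = std::move(name);
    op->operands = std::move(operands);
    op->results.reserve(numResults);  // results are never resized afterwards
    for (unsigned i = 0; i < numResults; ++i) {
        op->results.push_back(Value{op.get(), i});
    }
    return op;
}

bool demoOperands() {
    auto divmod = makeOp("tiny.divmod", {}, 2);              // hypothetical two-result op
    auto user = makeOp("tiny.neg", {divmod->result(1)}, 1);  // consumes the second result
    assert(user->operands.size() == 1);
    const Value* operand = user->operands[0];
    return operand->definingOp == divmod.get() && operand->index == 1;
}
```

Had the operand been the Op* for tiny.divmod, there would be no way to tell whether the user wanted the first or the second result.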

What we left out

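Real MLIR's OpBuilder, and the machinery around it, also provides:

  • A Listener for rewrites (so pattern drivers can be notified of changes).
  • A Location attached to every created op (for diagnostics); it is passed to each create call rather than stored in the builder.
  • Result-type inference via op traits and interfaces (SameOperandsAndResultType, InferTypeOpInterface, etc.).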

All are nice to have; none changes the picture. The core abstraction is the insertion point.
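For a feel of how the first two omissions would bolt on, here is a rough sketch of a builder that takes a location with every create and reports each new op to a callback. It is an illustration only: NotifyingBuilder, Loc and the std::function callback are hypothetical and do not mirror MLIR's Listener interface or cp-18's API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Loc { std::string file; int line; };  // stand-in for a source location
struct Op  { std::string name; Loc loc; };

// Hypothetical builder with a location per op and an insertion callback.
class NotifyingBuilder {
    std::vector<Op> ops_;
    std::function<void(const Op&)> onInsert_;  // stand-in for a listener
public:
    void setListener(std::function<void(const Op&)> cb) { onInsert_ = std::move(cb); }

    Op create(const std::string& name, Loc loc) {
        assert(!name.empty() && "op name must not be empty");
        ops_.push_back(Op{name, std::move(loc)});
        if (onInsert_) onInsert_(ops_.back());  // tell the driver what changed
        return ops_.back();                     // returned by value in this toy
    }
};

std::vector<std::string> demoListener() {
    std::vector<std::string> log;
    NotifyingBuilder b;
    b.setListener([&log](const Op& op) {
        log.push_back(op.name + "@" + op.loc.file + ":" + std::to_string(op.loc.line));
    });
    b.create("tiny.const", {"demo.tiny", 1});
    b.create("tiny.mul", {"demo.tiny", 2});
    return log;
}
```

The insertion point logic is untouched by either addition, which is the sense in which these features leave the core abstraction as it is.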