Step 02 · The shape of MLIR
module {
  llvm.func @main() -> i32 {
    %0 = llvm.mlir.constant(42 : i64) : i64
    %1 = llvm.mlir.addressof @fmt : !llvm.ptr
    %2 = llvm.call @printf(%1, %0) vararg(!llvm.func<i32 (ptr, ...)>)
        : (!llvm.ptr, i64) -> i32
    %3 = llvm.mlir.constant(0 : i32) : i32
    llvm.return %3 : i32
  }
}
Key concepts:
- Operation: each statement in the listing is an Operation. The name carries the dialect prefix (llvm., arith., func., scf., ...).
- Region: a list of blocks attached to an op and enclosed in { ... }. Some ops (scf.for, func.func) have nested regions; that's how MLIR expresses structured control flow.
- Block: a list of operations ending in a terminator. Labels are ^bb0, ^bb1, .... Blocks may take SSA arguments, which is how MLIR replaces Φ-nodes and treats them the same way as function parameters.
- Value (%name): the SSA result of an op, or a block argument.
- Type (i64, !llvm.ptr, tensor<4xf32>): defined by a dialect; the ! prefix means "non-builtin".
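The nesting of operations, regions, and blocks can be made concrete with a toy data model. This is a minimal sketch for illustration only: the Operation, Region, and Block structs and the outline and outlineOfListing helpers are hypothetical names, not MLIR's real C++ classes, and types and values are left out.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct Operation;  // defined below: blocks hold operations, operations hold regions

// A block: a label plus a list of operations (the last one is the terminator).
struct Block {
    std::string label;
    std::vector<Operation> ops;
};

// A region: a list of blocks attached to one operation.
struct Region {
    std::vector<Block> blocks;
};

// An operation: a dialect-qualified name plus zero or more nested regions.
struct Operation {
    std::string name;
    std::vector<Region> regions;
};

// Append an indented outline of `op` and everything nested inside it.
void outline(const Operation &op, int depth, std::ostringstream &out) {
    out << std::string(depth * 2, ' ') << op.name << "\n";
    for (const Region &region : op.regions) {
        for (const Block &block : region.blocks) {
            out << std::string((depth + 1) * 2, ' ') << block.label << ":\n";
            for (const Operation &inner : block.ops)
                outline(inner, depth + 2, out);
        }
    }
}

// Rebuild the nesting of the listing: module > llvm.func > five ops.
std::string outlineOfListing() {
    Block body;
    body.label = "^bb0";  // implicit entry block; the printed IR omits this label
    const std::vector<std::string> names = {
        "llvm.mlir.constant", "llvm.mlir.addressof", "llvm.call",
        "llvm.mlir.constant", "llvm.return"};
    for (const std::string &name : names) {
        Operation op;
        op.name = name;  // leaf ops: no nested regions
        body.ops.push_back(op);
    }

    Region funcRegion;
    funcRegion.blocks.push_back(body);
    Operation func;
    func.name = "llvm.func";
    func.regions.push_back(funcRegion);

    Block top;
    top.label = "^bb0";
    top.ops.push_back(func);
    Region moduleRegion;
    moduleRegion.blocks.push_back(top);
    Operation module;
    module.name = "builtin.module";  // `module` is the short spelling of this op
    module.regions.push_back(moduleRegion);

    std::ostringstream out;
    outline(module, 0, out);
    return out.str();
}
```

Calling outlineOfListing() returns an indented tree with builtin.module at the top, llvm.func one block down, and the five llvm.* ops inside the function's only block. The only point of the sketch is the recursion: an operation owns regions, a region owns blocks, and a block owns operations again.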
Implications
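- No global symbol table for SSA names: value names are scoped to the enclosing function body, so different functions can reuse %0, %1, .... Two blocks in the same region cannot define the same name.
- Every op states its operand and result types (always in the generic form; a custom assembly format may elide types that can be inferred), so the IR is self-describing and can be parsed by mlir-opt even without knowing the producing dialect's C++ class (provided the dialect is loaded).
- module itself is an op (builtin.module) whose region holds the program.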
Our emission strategy
Emitter::emitFunction produces an llvm.func whose entry block holds an
alloca for every named local and ends with an llvm.br ^bb1 into the first
TAC block. After that, each TAC block becomes a ^bbN label and
its instructions translate one-for-one to llvm.* ops.
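The layout described above can be sketched as a small text emitter. This is not the project's Emitter::emitFunction. TacBlock, TacFunction, and exampleFunction are hypothetical stand-ins, every local is assumed to be an i32, the function is assumed to take no arguments and return i32, and each TAC instruction is assumed to be already translated to its llvm.* text. The op spellings assume the opaque-pointer syntax of the LLVM dialect and may differ between MLIR versions.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for the project's TAC structures.
struct TacBlock {
    // One already-translated llvm.* op per TAC instruction; the last one
    // must be a terminator (llvm.br, llvm.cond_br, llvm.return, ...).
    std::vector<std::string> ops;
};

struct TacFunction {
    std::string name;
    std::vector<std::string> locals;  // named locals, all assumed to be i32
    std::vector<TacBlock> blocks;
};

// Entry block with one alloca per local, a branch into ^bb1, then one
// labelled block per TAC block.
std::string emitFunction(const TacFunction &fn) {
    assert(!fn.blocks.empty() && "the entry block branches to ^bb1, so it must exist");
    std::ostringstream out;
    out << "llvm.func @" << fn.name << "() -> i32 {\n";
    // Entry block: the element count shared by every alloca, then the slots.
    // A local named "c1" would clash with this constant; a real emitter
    // would generate unique names.
    out << "  %c1 = llvm.mlir.constant(1 : i64) : i64\n";
    for (const std::string &local : fn.locals)
        out << "  %" << local << " = llvm.alloca %c1 x i32 : (i64) -> !llvm.ptr\n";
    out << "  llvm.br ^bb1\n";
    // TAC block i (0-based) becomes ^bb(i+1); the entry block is unlabelled.
    for (std::size_t i = 0; i < fn.blocks.size(); ++i) {
        out << "^bb" << (i + 1) << ":\n";
        for (const std::string &op : fn.blocks[i].ops)
            out << "  " << op << "\n";
    }
    out << "}\n";
    return out.str();
}

// Example input: store 42 into local x, then load it back and return it.
TacFunction exampleFunction() {
    TacBlock first;
    first.ops.push_back("%c42 = llvm.mlir.constant(42 : i32) : i32");
    first.ops.push_back("llvm.store %c42, %x : i32, !llvm.ptr");
    first.ops.push_back("llvm.br ^bb2");
    TacBlock second;
    second.ops.push_back("%v = llvm.load %x : !llvm.ptr -> i32");
    second.ops.push_back("llvm.return %v : i32");

    TacFunction fn;
    fn.name = "main";
    fn.locals.push_back("x");
    fn.blocks.push_back(first);
    fn.blocks.push_back(second);
    return fn;
}
```

Writing emitFunction(exampleFunction()) to a stream yields an llvm.func whose unlabelled entry block allocates x and branches to ^bb1, followed by ^bb1 and ^bb2. Keeping every local in an alloca slot means the TAC translation never needs block arguments: each block reads and writes locals through llvm.load and llvm.store.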