Step 03 · Dialects worth knowing

A dialect is a namespace of operations + types + attributes. Some upstream ones you'll meet constantly:

| Dialect | Purpose |
|---|---|
| builtin | builtin.module (older MLIR also hosted functions here as builtin.func; they now live in func.func) |
| func | func.func, func.call, func.return |
| arith | Pure integer/float math: arith.addi, arith.cmpi, ... |
| cf | Unstructured control flow: cf.br, cf.cond_br |
| scf | Structured control flow: scf.for, scf.if, scf.while |
| memref | Memory references with shape/layout |
| tensor | Immutable value tensors |
| linalg | High-level array/linear-algebra ops |
| vector | Explicit SIMD vectors |
| affine | Polyhedral loops, ideal for analyses |
| gpu, nvvm, rocdl, spirv | Device backends |
| llvm | Mirror of LLVM IR; the terminal target |

The point: write your compiler as a sequence of lowerings, mydialect → linalg → memref → scf → cf → llvm, with each step removing abstraction you no longer need.
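To make that progression concrete, here is a toy model in plain C++. It is a sketch, not the MLIR API: ops are reduced to their qualified names, each stage is a hypothetical lookup table, and the op breakdown (for example linalg.matmul becoming a loop plus loads, a multiply-add, and a store) is heavily simplified. The minilang.matmul op and the helpers dialectOf, dialectsIn, runStage, and examplePipeline are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only, not the MLIR API: an op is just its qualified name,
// "<dialect>.<op>", and a module is a flat list of ops.
using Module = std::vector<std::string>;
// A lowering stage maps one op name to the ops that replace it.
using Stage = std::map<std::string, std::vector<std::string>>;

// The dialect namespace is everything before the first '.'.
std::string dialectOf(const std::string &op) {
  const auto dot = op.find('.');
  assert(dot != std::string::npos && "op names must be dialect-qualified");
  return op.substr(0, dot);
}

// Which dialects are still present in the module?
std::set<std::string> dialectsIn(const Module &m) {
  std::set<std::string> out;
  for (const auto &op : m) out.insert(dialectOf(op));
  return out;
}

// Apply one stage: rewrite ops that have an entry, keep the rest untouched.
Module runStage(const Module &m, const Stage &stage) {
  Module out;
  for (const auto &op : m) {
    const auto it = stage.find(op);
    if (it == stage.end())
      out.push_back(op);
    else
      out.insert(out.end(), it->second.begin(), it->second.end());
  }
  return out;
}

// Hypothetical, heavily simplified pipeline for a single minilang.matmul.
std::vector<Stage> examplePipeline() {
  std::vector<Stage> stages(4);
  // Stage 0: custom dialect -> linalg.
  stages[0]["minilang.matmul"] = {"linalg.matmul"};
  // Stage 1: linalg -> loops over buffers (scf + memref + arith).
  stages[1]["linalg.matmul"] = {"scf.for", "memref.load", "arith.mulf",
                                "arith.addf", "memref.store"};
  // Stage 2: structured loops -> branches.
  stages[2]["scf.for"] = {"cf.br", "cf.cond_br"};
  // Stage 3: everything that is left -> llvm.
  stages[3]["cf.br"] = {"llvm.br"};
  stages[3]["cf.cond_br"] = {"llvm.cond_br"};
  stages[3]["memref.load"] = {"llvm.load"};
  stages[3]["memref.store"] = {"llvm.store"};
  stages[3]["arith.mulf"] = {"llvm.fmul"};
  stages[3]["arith.addf"] = {"llvm.fadd"};
  return stages;
}
```

Running a module that contains only minilang.matmul through the stages in order leaves just linalg ops after the first stage and only llvm ops after the last. Calling dialectsIn between stages shows which abstractions are still present at each point.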

Defining a dialect (in C++)

You declare ops in TableGen (.td), which mlir-tblgen expands into C++ classes. A typical workflow:

  1. MinilangOps.td — declare ops, types, attributes.
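  2. Load the dialect into your MLIRContext with MLIRContext::loadDialect (or getOrLoadDialect); tools such as mlir-opt register it through DialectRegistry::insert instead.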
  3. Implement verify, canonicalize, fold per op.
  4. Write a MinilangToLLVM conversion pass (mlir::ConversionTarget + RewritePatterns).
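The idea behind step 4 can be sketched without any MLIR headers. The following is a toy stand-in in C++17, not the real API: Target plays the role of mlir::ConversionTarget (which dialects may remain), Pattern plays the role of a rewrite pattern, and convert behaves roughly like a full conversion that fails when an illegal op has no matching pattern. All names here (Target, Pattern, rewrite, legalize, convert) and the ops minilang.add and minilang.print are hypothetical.

```cpp
#include <functional>
#include <optional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins, not the MLIR API. Ops are qualified names "<dialect>.<op>".
using Op = std::string;
using Ops = std::vector<Op>;

// A pattern either declines (nullopt) or returns the replacement ops.
using Pattern = std::function<std::optional<Ops>(const Op &)>;

// Plays the role of a conversion target: which dialects may remain.
struct Target {
  std::set<std::string> legalDialects;
  bool isLegal(const Op &op) const {
    return legalDialects.count(op.substr(0, op.find('.'))) > 0;
  }
};

// Build a pattern that replaces exactly one op name with a list of ops.
Pattern rewrite(Op from, Ops to) {
  return [from = std::move(from),
          to = std::move(to)](const Op &op) -> std::optional<Ops> {
    if (op == from) return to;
    return std::nullopt;
  };
}

// Legalize one op: keep it if legal, otherwise apply the first matching
// pattern and legalize its results in turn. The toy commits to the first
// match and does not guard against patterns that rewrite in a cycle.
bool legalize(const Op &op, const Target &target,
              const std::vector<Pattern> &patterns, Ops &out) {
  if (target.isLegal(op)) {
    out.push_back(op);
    return true;
  }
  for (const auto &pattern : patterns) {
    if (auto replacement = pattern(op)) {
      for (const auto &r : *replacement)
        if (!legalize(r, target, patterns, out)) return false;
      return true;
    }
  }
  return false;  // illegal op with no pattern: the conversion fails
}

// Convert a whole op list, or report failure with nullopt.
std::optional<Ops> convert(const Ops &input, const Target &target,
                           const std::vector<Pattern> &patterns) {
  Ops out;
  for (const auto &op : input)
    if (!legalize(op, target, patterns, out)) return std::nullopt;
  return out;
}
```

With only the llvm dialect marked legal and two patterns (minilang.add to arith.addi, arith.addi to llvm.add), converting minilang.add succeeds through the intermediate step, while an input containing minilang.print fails because nothing can legalize it. In real MLIR that failure is what tells you a pattern is still missing from the pass.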

cp-18's capstone leaves the dialect implementation as a guided exercise; the heavy lifting is mostly mechanical TableGen + pattern boilerplate.