cp-18 — Capstone: An MLIR-Style Compiler Framework

A self-contained, ~700-line reimplementation of MLIR's core ideas (Operations, Regions, Blocks, Values, Types, Attributes, Builders, Passes, and conversion between dialects) with zero LLVM/MLIR dependency. It includes two demonstration dialects (tiny.* and ll.*), a constant folder, a DCE pass, and a lowering pass that rewrites tiny.* into ll.*.

The point isn't to use MLIR; it's to understand its architecture by rebuilding the skeleton. After cp-18 you should be able to read MLIR source code and recognise each of these concepts.
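To make the skeleton concrete, here is a minimal sketch of how Operations, Regions, Blocks, and Values can nest. It is not the actual mlf API: the namespace ir_sketch, the helpers addBody, append, countOps, and buildExample, and the choice to omit Types and Attributes are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace ir_sketch {  // hypothetical names, not the real mlf:: classes

struct Op;

// An SSA value: it remembers the op that produced it.
struct Value {
    Op* definingOp = nullptr;
    std::string type;  // e.g. "i64"; a real framework would use a Type object
};

// A Block is an ordered list of ops.
struct Block {
    std::vector<std::unique_ptr<Op>> ops;
};

// A Region is a list of Blocks owned by one op.
struct Region {
    std::vector<std::unique_ptr<Block>> blocks;
};

// An Op has a name, operands it uses, results it defines, and nested regions.
struct Op {
    std::string name;  // "dialect.mnemonic"
    std::vector<Value*> operands;
    std::vector<std::unique_ptr<Value>> results;
    std::vector<std::unique_ptr<Region>> regions;
};

// Gives `op` one region holding one empty block and returns that block.
inline Block& addBody(Op& op) {
    op.regions.push_back(std::make_unique<Region>());
    Region& region = *op.regions.back();
    region.blocks.push_back(std::make_unique<Block>());
    return *region.blocks.back();
}

// Appends a new op to `block`, optionally giving it a single i64 result.
inline Op& append(Block& block, std::string name,
                  std::vector<Value*> operands = {}, bool hasResult = false) {
    for (Value* v : operands) assert(v && v->definingOp);  // operands must be defined
    auto op = std::make_unique<Op>();
    op->name = std::move(name);
    op->operands = std::move(operands);
    if (hasResult) {
        auto result = std::make_unique<Value>();
        result->definingOp = op.get();
        result->type = "i64";
        op->results.push_back(std::move(result));
    }
    block.ops.push_back(std::move(op));
    return *block.ops.back();
}

// Pre-order walk: counts `op` plus every op nested inside its regions.
inline std::size_t countOps(const Op& op) {
    std::size_t n = 1;
    for (const auto& region : op.regions)
        for (const auto& block : region->blocks)
            for (const auto& child : block->ops) n += countOps(*child);
    return n;
}

// Builds module { ll.func { const; call; const; ret } } without attributes.
inline std::unique_ptr<Op> buildExample() {
    auto module = std::make_unique<Op>();
    module->name = "module";
    Op& func = append(addBody(*module), "ll.func");
    Block& body = addBody(func);
    Op& seven = append(body, "ll.const", {}, true);
    append(body, "ll.call", {seven.results[0].get()});
    Op& zero = append(body, "ll.const", {}, true);
    append(body, "ll.ret", {zero.results[0].get()});
    return module;
}

}  // namespace ir_sketch
```

Calling ir_sketch::buildExample() and passing the result to countOps visits the module, the function, and the four ops in its body. Ownership runs strictly downward (op owns regions, region owns blocks, block owns ops), while operands are non-owning pointers to values defined elsewhere.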

Build & test

cd src/cpp
cmake -S . -B build && cmake --build build
./build/tests/test_mlf      # 35/35 checks passed
./build/mlfdriver --tiny -   # parse stdin, print tiny.* IR
./build/mlfdriver --opt  -   # ... after fold+DCE
./build/mlfdriver        -   # ... lowered to ll.*

Example session:

$ echo "let x = 2 * 3 + 1; print x;" | ./build/mlfdriver
"module"() {
  "ll.func"() {sym_name = "main"} {
    %0 = "ll.const"() {value = 7} : () -> (i64)
    "ll.call"(%0) {callee = "ml_print_int"} : (i64) -> ()
    %1 = "ll.const"() {value = 0} : () -> (i64)
    "ll.ret"(%1)
  }
}
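The single ll.const of 7 in the session above suggests that 2 * 3 + 1 was folded before lowering. A minimal sketch of that folding step follows, assuming a flat block where each op's result is identified by the op's index. The op names tiny.const, tiny.add, and tiny.mul, the Op layout, and the namespace fold_sketch are assumptions for illustration; the real constantFold in passes.cpp may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

namespace fold_sketch {  // hypothetical, not the real mlf::pass code

// One op in a straight-line block. The op's index doubles as its SSA result.
struct Op {
    std::string name;           // assumed names: tiny.const, tiny.add, tiny.mul, tiny.print
    std::vector<int> operands;  // indices of the ops that define the operands
    std::int64_t value = 0;     // payload, meaningful only for tiny.const
};

// Looks up the op that defines operand `index`, checking the index is valid.
inline const Op& defOf(const std::vector<Op>& block, int index) {
    assert(index >= 0 && static_cast<std::size_t>(index) < block.size());
    return block[static_cast<std::size_t>(index)];
}

// Rewrites add/mul ops whose operands are both constants into tiny.const,
// in place. Ops are visited in order, so an earlier fold can enable a later
// one. Returns how many ops were folded. Overflow is not handled here.
inline int constantFold(std::vector<Op>& block) {
    int folded = 0;
    for (Op& op : block) {
        const bool isAdd = op.name == "tiny.add";
        const bool isMul = op.name == "tiny.mul";
        if ((!isAdd && !isMul) || op.operands.size() != 2) continue;
        const Op& lhs = defOf(block, op.operands[0]);
        const Op& rhs = defOf(block, op.operands[1]);
        if (lhs.name != "tiny.const" || rhs.name != "tiny.const") continue;
        const std::int64_t a = lhs.value;
        const std::int64_t b = rhs.value;
        op.value = isAdd ? a + b : a * b;
        op.name = "tiny.const";
        op.operands.clear();
        ++folded;
    }
    return folded;
}

// Assumed tiny.* IR for: let x = 2 * 3 + 1; print x;
inline std::vector<Op> example() {
    return {
        {"tiny.const", {}, 2},    // %0
        {"tiny.const", {}, 3},    // %1
        {"tiny.mul", {0, 1}, 0},  // %2 = %0 * %1
        {"tiny.const", {}, 1},    // %3
        {"tiny.add", {2, 3}, 0},  // %4 = %2 + %3
        {"tiny.print", {4}, 0},   // print %4
    };
}

}  // namespace fold_sketch
```

Running constantFold on example() turns %2 into a constant 6 and then %4 into a constant 7. The original literals are left in place with no remaining users, which is why a DCE pass normally runs afterwards.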

Layout

  • src/cpp/src/mlf.{hpp,cpp} — the framework: Op, Region, Block, Value, Builder, walks.
  • src/cpp/src/dialects.{hpp,cpp} — tiny.* and ll.* op constructors.
  • src/cpp/src/passes.{hpp,cpp} — Pipeline + constantFold + DCE.
  • src/cpp/src/lowering.{hpp,cpp} — tiny → ll dialect conversion.
  • src/cpp/src/printer.{hpp,cpp} — MLIR-flavoured IR printer.
  • src/cpp/src/parser.{hpp,cpp} — tiny surface language → tiny.* IR.
  • src/cpp/src/main.cpp — mlfdriver CLI.
  • src/cpp/tests/test_mlf.cpp — 7 tests, 35 checks.
  • steps/01..07.md — narrative.

Mapping to real MLIR

cp-18                           MLIR equivalent
mlf::Op                         mlir::Operation
mlf::Region                     mlir::Region
mlf::Block                      mlir::Block
mlf::Value                      mlir::Value
mlf::Type                       mlir::Type
mlf::Attribute                  mlir::Attribute
mlf::Builder                    mlir::OpBuilder
mlf::pass::Pipeline             mlir::PassManager
mlf::convert::lowerTinyToLL     dialect-conversion pass with rewrite patterns
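As a rough illustration of the Pipeline row, a pass pipeline can be little more than an ordered list of named callables over the IR. The sketch below is an assumption about the shape, not the actual mlf::pass::Pipeline interface: Module is a placeholder holding only op names, and renameTinyToLL is a toy stand-in for lowering that merely swaps the dialect prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace pipeline_sketch {  // hypothetical, not the real mlf::pass API

// Placeholder IR: just the op names, in order.
struct Module {
    std::vector<std::string> opNames;
};

// A pass mutates the module and reports whether it changed anything.
using Pass = std::function<bool(Module&)>;

class Pipeline {
public:
    // Registers a pass; returns *this so calls can be chained.
    Pipeline& add(std::string name, Pass pass) {
        assert(pass && "a pass must be callable");
        passes_.emplace_back(std::move(name), std::move(pass));
        return *this;
    }

    // Runs every pass once, in registration order, and returns the names
    // of the passes that reported a change.
    std::vector<std::string> run(Module& module) const {
        std::vector<std::string> changed;
        for (const auto& entry : passes_) {
            if (entry.second(module)) changed.push_back(entry.first);
        }
        return changed;
    }

private:
    std::vector<std::pair<std::string, Pass>> passes_;
};

// Toy "lowering": replaces the tiny. prefix with ll. on every op name.
inline bool renameTinyToLL(Module& module) {
    bool changed = false;
    for (std::string& name : module.opNames) {
        if (name.rfind("tiny.", 0) == 0) {  // name starts with "tiny."
            name = "ll." + name.substr(5);
            changed = true;
        }
    }
    return changed;
}

// Counts ops still in the tiny dialect; zero means lowering is complete.
inline std::size_t countTiny(const Module& module) {
    std::size_t n = 0;
    for (const std::string& name : module.opNames) {
        if (name.rfind("tiny.", 0) == 0) ++n;
    }
    return n;
}

}  // namespace pipeline_sketch
```

A caller would chain registrations, for example p.add("lower", renameTinyToLL), then call p.run(module) and check countTiny(module) afterwards. In real MLIR, passes are classes registered with the pass manager rather than bare callables; the callable form here is only chosen to keep the sketch short.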

Tests

  1. Hand-build a module via Builder.
  2. constantFold shrinks 1 + 2 + 3 to 6.
  3. DCE after folding deletes the now-dead literals.
  4. DCE preserves a const used by tiny.print.
  5. Lowering: zero tiny.* ops remain after lowerTinyToLL.
  6. End-to-end: parse → fold → DCE → lower → check the lowered IR.
  7. Parser surfaces a clear error for malformed input.
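
The behaviour that tests 3 and 4 describe can be sketched as a single backward liveness sweep. This is an illustration under assumptions, not the DCE in passes.cpp: the block is flat, operands are indices of earlier ops, tiny.print is taken to be the only op with side effects, and the names dce_sketch, hasSideEffects, and afterFolding are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

namespace dce_sketch {  // hypothetical, not the real mlf::pass code

struct Op {
    std::string name;
    std::vector<int> operands;  // indices of earlier ops in the same block
    std::int64_t value = 0;     // payload for tiny.const
};

// Assumption: printing is the only observable effect in this sketch.
inline bool hasSideEffects(const Op& op) { return op.name == "tiny.print"; }

// Removes ops that have no side effects and whose result is never used.
// Returns the number of ops removed.
inline std::size_t dce(std::vector<Op>& block) {
    const std::size_t n = block.size();
    std::vector<bool> live(n, false);
    // Walk backwards: every user of op i sits after i, so live[i] is final
    // by the time op i is visited.
    for (std::size_t i = n; i-- > 0;) {
        if (hasSideEffects(block[i])) live[i] = true;
        if (!live[i]) continue;
        for (int operand : block[i].operands) {
            assert(operand >= 0 && static_cast<std::size_t>(operand) < i);
            live[static_cast<std::size_t>(operand)] = true;
        }
    }
    // Compact the block and renumber operands to the new positions.
    std::vector<int> newIndex(n, -1);
    std::vector<Op> kept;
    for (std::size_t i = 0; i < n; ++i) {
        if (!live[i]) continue;
        newIndex[i] = static_cast<int>(kept.size());
        Op op = block[i];
        for (int& operand : op.operands) {
            operand = newIndex[static_cast<std::size_t>(operand)];
        }
        kept.push_back(std::move(op));
    }
    const std::size_t removed = n - kept.size();
    block = std::move(kept);
    return removed;
}

// A block as it might look after folding 2 * 3 + 1: only the constant 7
// still has a user.
inline std::vector<Op> afterFolding() {
    return {
        {"tiny.const", {}, 2}, {"tiny.const", {}, 3}, {"tiny.const", {}, 6},
        {"tiny.const", {}, 1}, {"tiny.const", {}, 7}, {"tiny.print", {4}, 0},
    };
}

}  // namespace dce_sketch
```

Applied to afterFolding(), the sweep marks tiny.print live, then the constant 7 it reads, and drops the other four literals. A single pass is enough here because the sweep is ordered; an implementation that deletes ops one at a time would typically iterate to a fixed point instead.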