Step 07 · From here to production
minilangc is a complete compiler — small, but every stage that a
real production compiler has is present. What separates it from
something you'd ship?
Language features
- Types. Booleans, strings, structs, arrays. Each one threads through lex → parse → typecheck → IR. Strings need a runtime (cp-14).
- Closures. Capture-by-reference vs. by-value, free variable analysis, environment lowering.
- Modules / namespaces. Multi-file compilation, separate compilation units, an `import` statement, a build manifest.
- Generics. Monomorphisation (Rust) or boxing (Java). Either way, the typechecker grows substantially.
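As a rough illustration of the free variable analysis that closures require, the sketch below walks a tiny expression AST and collects the names a function body uses but does not bind. The node classes (`Var`, `Num`, `BinOp`, `Let`, `Lambda`) and the surface syntax in the comment are hypothetical stand-ins, not minilangc's actual AST.

```python
from dataclasses import dataclass


# Hypothetical AST nodes, for illustration only.
@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class Let:
    name: str
    value: object
    body: object


@dataclass(frozen=True)
class Lambda:
    params: tuple
    body: object


def free_vars(node):
    """Return the set of names used in `node` but not bound inside it."""
    if isinstance(node, Num):
        return set()
    if isinstance(node, Var):
        return {node.name}
    if isinstance(node, BinOp):
        return free_vars(node.left) | free_vars(node.right)
    if isinstance(node, Let):
        # The bound name is visible in the body, not in its own initialiser.
        return free_vars(node.value) | (free_vars(node.body) - {node.name})
    if isinstance(node, Lambda):
        # Parameters are bound; everything else must be captured.
        return free_vars(node.body) - set(node.params)
    raise TypeError(f"unknown node: {node!r}")


# Assumed surface syntax: fn(x) { x + y * scale }
closure = Lambda(("x",), BinOp("+", Var("x"), BinOp("*", Var("y"), Var("scale"))))
captured = sorted(free_vars(closure))  # names the environment must hold
```

The resulting set is what environment lowering would turn into fields of a closure record; whether each field holds a copy or a pointer is the by-value vs. by-reference decision.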
Performance
- Link with LLVM-as-a-library to avoid process overhead. We'd replace `buildExecutable` with code that builds an `llvm::Module` directly (cp-10/11/12 already do this).
- Incremental compilation. Hash AST nodes, cache IR per function, re-emit only changed functions. Prior art: Rust's query system, Swift's modular header maps.
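- Parallel compilation. One thread per function, sharing an immutable AST. An `LLVMContext` is not thread-safe and every `Module` belongs to exactly one context, so each worker thread needs its own `LLVMContext` and `Module`.
- Optimisation passes. Run a custom pipeline (`mem2reg`, `instcombine`, `gvn`, `licm`, `loop-vectorize`) before `llc`.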
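The incremental-compilation idea above (hash AST nodes, cache IR per function, re-emit only what changed) can be sketched in a few lines. This is a minimal illustration, assuming each function's AST is a JSON-serialisable dict; `fingerprint`, `IRCache`, and `fake_emit` are hypothetical names, and the emitter is a stand-in for a real IR generator.

```python
import hashlib
import json


def fingerprint(fn_ast):
    """Stable content hash of one function's AST (nested dicts/lists)."""
    canonical = json.dumps(fn_ast, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IRCache:
    """Maps AST fingerprint -> emitted IR text for a single function."""

    def __init__(self, emit):
        self._emit = emit      # callback: fn_ast -> IR string
        self._by_hash = {}
        self.emitted = []      # names re-emitted by the latest compile()

    def compile(self, functions):
        """functions: dict of name -> AST. Returns dict of name -> IR."""
        self.emitted = []
        out = {}
        for name, ast in functions.items():
            key = fingerprint(ast)
            if key not in self._by_hash:
                self._by_hash[key] = self._emit(ast)
                self.emitted.append(name)
            out[name] = self._by_hash[key]
        return out


def fake_emit(ast):
    # Stand-in for the real IR emitter.
    return f"define i64 @{ast['name']}() ; {len(ast['body'])} stmts"


cache = IRCache(fake_emit)
program = {
    "add": {"name": "add", "body": ["return a + b"]},
    "main": {"name": "main", "body": ["print(add(1, 2))"]},
}
cache.compile(program)
first = list(cache.emitted)    # cold cache: both functions are emitted

program["main"] = {"name": "main", "body": ["print(add(1, 3))"]}
cache.compile(program)
second = list(cache.emitted)   # only the edited function is re-emitted
```

A real cache also has to fold in everything the emitted IR depends on beyond the function's own AST, such as callee signatures and compiler flags; otherwise a change elsewhere can leave stale IR behind.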
Tooling ecosystem
- `minilangc fmt` (re-emit canonical source): port cp-15's formatter.
- `minilangc test` (discover `fn test_*` and run them).
- `minilangc doc` (extract doc comments).
- `minilangc lsp`: full LSP server using the spans + diagnostics we built.
- Debugger support: emit DWARF line tables (`!dbg` in IR, `DICompileUnit` metadata, `-g`).
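To make the `minilangc test` discovery step concrete, here is a small sketch that finds `fn test_*` definitions in a source string. It uses a regular expression purely for brevity; a real implementation would more likely walk the parsed AST. The `fn name(` syntax and `//` comments in the sample are assumptions about the language.

```python
import re

# Assumed surface syntax: `fn test_name(` at the start of a line.
TEST_FN = re.compile(r"^[ \t]*fn[ \t]+(test_[A-Za-z0-9_]*)[ \t]*\(", re.MULTILINE)


def discover_tests(source):
    """Return the names of all test functions, in source order."""
    return TEST_FN.findall(source)


sample = """
fn add(a, b) { return a + b }
fn test_add() { assert(add(1, 2) == 3) }
// fn test_commented_out() {}
fn test_add_negative() { assert(add(-1, 1) == 0) }
"""

tests = discover_tests(sample)
```

From there the runner could compile each discovered function behind a generated `main` and report pass or fail by exit code.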
Distribution
- Pre-compiled standard library distributed as object files (or LLVM bitcode for cross-target).
- Package manager. Cargo, npm, go modules — all evolved alongside their compilers.
- Cross-compilation. Parameterise the triple, ship multiple `llc` backends.
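Parameterising the triple mostly means threading one extra string through the backend invocation. The sketch below builds an `llc` command line per target; `llc_command` and the file names are hypothetical, while `-mtriple`, `-filetype=obj`, `-O<n>`, and `-o` are standard `llc` options.

```python
def llc_command(ir_path, triple, out_path, opt_level=2):
    """Build an argv for llc that emits an object file for `triple`."""
    if not 0 <= opt_level <= 3:
        raise ValueError("llc accepts -O0 through -O3")
    return [
        "llc",
        f"-mtriple={triple}",
        "-filetype=obj",
        f"-O{opt_level}",
        ir_path,
        "-o",
        out_path,
    ]


# Example targets; any triple the installed llc supports would work.
TARGETS = [
    "x86_64-unknown-linux-gnu",
    "aarch64-apple-darwin",
    "wasm32-unknown-unknown",
]

commands = [llc_command("prog.ll", t, f"prog.{t}.o") for t in TARGETS]
```

Each argv can be passed to `subprocess.run(cmd, check=True)`. The installed `llc` must have been built with the relevant backend (`llc --version` lists the registered targets), and the link step still needs a linker and libraries for that target.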
Where the curriculum goes next
- cp-17 (capstone JIT) demonstrates the dynamic-language path: parse → IR → ORC JIT → call into a runtime (cp-14) at runtime. No object files, no `clang`. Same frontend.
- cp-18 (capstone MLIR) demonstrates the high-level-IR path: parse → custom MLIR dialect → progressive lowering → LLVM dialect → object. More machinery for more optimisation headroom.
All three capstones share the cp-15 frontend skeleton. That's the deepest lesson of the curriculum: the compiler is a frontend + a backend choice, and the backend choice depends on the deployment story you want.