Step 01 · LLVM C++ API tour

LLVM's C++ API is huge. The codegen path we touch is a thin slice:

LLVMContext   ── owns types, constants, metadata. One per thread of work.
   │
   ▼
Module        ── translation unit. Holds globals + functions + metadata.
   │
   ▼
Function      ── signature + list of BasicBlocks.
   │
   ▼
BasicBlock    ── straight-line list of Instructions; ends in a terminator.
   │
   ▼
Instruction   ── created via IRBuilder, never `new`.
Value         ── base of everything (Constant, Argument, Instruction).
Type          ── obtained from the Context (Int64Ty, etc.).

Linking

llvm_map_components_to_libnames(LLVM_LIBS core support irreader passes analysis transformutils scalaropts instcombine) maps the listed components to the libraries we need and stores the result in LLVM_LIBS. With Homebrew LLVM 20 you can also link the umbrella LLVM library, but listing components keeps the binary small.

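find_package(LLVM REQUIRED CONFIG)
# LLVM_DEFINITIONS is a space-separated string; split it into a list first.
separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS})
# Populate LLVM_LIBS before it is used in target_link_libraries.
llvm_map_components_to_libnames(LLVM_LIBS core support irreader passes analysis transformutils scalaropts instcombine)
target_include_directories(mllib PUBLIC ${LLVM_INCLUDE_DIRS})
target_compile_definitions(mllib PUBLIC ${LLVM_DEFINITIONS_LIST})
target_link_libraries(mllib PUBLIC ${LLVM_LIBS})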

Ownership

unique_ptr<LLVMContext> must outlive unique_ptr<Module>: destroying a Module touches its types, which live in the context. Keep both in the same CodegenResult and let RAII handle the order. Members are destroyed in reverse declaration order, so llvm_codegen.hpp declares the context before the module.
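
The ordering rule can be checked without LLVM at all. The sketch below uses stand-in types (FakeContext, FakeModule, and CodegenResultSketch are hypothetical names, not the real CodegenResult) that log their own destruction, so the order is observable.

```cpp
#include <memory>
#include <string>
#include <vector>

// Stand-in for llvm::LLVMContext: records when it is destroyed.
struct FakeContext {
  explicit FakeContext(std::vector<std::string> *l) : log(l) {}
  ~FakeContext() { log->push_back("context"); }
  std::vector<std::string> *log;
};

// Stand-in for llvm::Module: records when it is destroyed.
struct FakeModule {
  explicit FakeModule(std::vector<std::string> *l) : log(l) {}
  ~FakeModule() { log->push_back("module"); }
  std::vector<std::string> *log;
};

// Members are destroyed in reverse declaration order,
// so the context goes first in the struct and dies last.
struct CodegenResultSketch {
  std::unique_ptr<FakeContext> ctx;  // declared first, destroyed last
  std::unique_ptr<FakeModule> mod;   // declared last, destroyed first
};

// Creates and destroys one result, returning the observed destruction order.
std::vector<std::string> destructionOrder() {
  std::vector<std::string> log;
  {
    CodegenResultSketch r;
    r.ctx = std::make_unique<FakeContext>(&log);
    r.mod = std::make_unique<FakeModule>(&log);
  }  // r leaves scope here: mod is destroyed, then ctx
  return log;
}
```

Calling destructionOrder() yields "module" followed by "context". Swapping the two member declarations would reverse that, which with the real types would leave the Module destructor reading from an already destroyed context.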

A common pitfall

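Forward-declaring llvm::Module in a header and putting unique_ptr<Module> in a struct breaks wherever that struct is destroyed, because the implicitly defined destructor needs the complete type to delete the Module. Either include <llvm/IR/Module.h> in the header, or declare the destructor in the header and define it out of line in a .cpp file that includes Module.h. We chose the include.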
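
For comparison, here is a sketch of the alternative we did not choose, the out-of-line destructor. It is written as a single file with the header part and the source part marked by comments; Payload stands in for llvm::Module, and Holder is a hypothetical struct, not the project's CodegenResult.

```cpp
#include <memory>

// ---- header part: only a forward declaration is visible ----
struct Payload;  // stand-in for llvm::Module, incomplete at this point

struct Holder {
  Holder();
  ~Holder();  // declared only; an implicit destructor would need Payload here
  int value() const;
  std::unique_ptr<Payload> payload;
};

// ---- source part: the complete type is now available ----
struct Payload {
  int value = 42;
};

// The constructor is also out of line: it must be able to destroy
// already-built members if construction throws.
Holder::Holder() : payload(std::make_unique<Payload>()) {}

// Defaulted here, where unique_ptr can see the full Payload definition.
Holder::~Holder() = default;

int Holder::value() const { return payload->value; }
```

The cost of this pattern is an extra .cpp definition per owning struct; the benefit is that headers including it do not pull in Module.h transitively.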