Step 01 · LLVM C++ API tour
LLVM's C++ API is huge. The codegen path we touch is a thin slice:
LLVMContext ── owns types, constants, metadata. One per thread of work.
│
▼
Module ── translation unit. Holds globals + functions + metadata.
│
▼
Function ── signature + list of BasicBlocks.
│
▼
BasicBlock ── straight-line list of Instructions; ends in a terminator.
│
▼
Instruction ── created via IRBuilder, never `new`.
Value ── base of everything (Constant, Argument, Instruction).
Type ── obtained from the Context (Type::getInt64Ty(ctx), etc.).
Linking
llvm_map_components_to_libnames(LLVM_LIBS core support irreader passes analysis transformutils scalaropts instcombine)
fills LLVM_LIBS with the static archives we need. The function is only
available after find_package has loaded LLVM's CMake config. With
Homebrew LLVM 20 you can also link the umbrella LLVM library, but
listing components keeps the binary small.
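find_package(LLVM REQUIRED CONFIG)
# Map component names to libraries; must run after find_package.
llvm_map_components_to_libnames(LLVM_LIBS core support irreader passes analysis transformutils scalaropts instcombine)
# LLVM_DEFINITIONS is a space-separated string of -D flags; split it into a list.
separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS})
target_include_directories(mllib PUBLIC ${LLVM_INCLUDE_DIRS})
target_compile_definitions(mllib PUBLIC ${LLVM_DEFINITIONS_LIST})
target_link_libraries(mllib PUBLIC ${LLVM_LIBS})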
Ownership
The LLVMContext must outlive the Module: Module destruction touches
its types, which live in the context. Keep both unique_ptrs in the
same CodegenResult and let RAII handle the order. Members are destroyed
in reverse declaration order, so llvm_codegen.hpp declares the context
first and the module second.
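The destruction order can be checked without LLVM. The sketch below uses stand-in types (FakeContext, FakeModule, FakeCodegenResult, and destructionOrder are all hypothetical names, not the project's) that log their own destruction, assuming the real CodegenResult follows the same context-then-module layout.

```cpp
#include <memory>
#include <string>
#include <vector>

// Stand-ins for llvm::LLVMContext and llvm::Module (illustration only).
// Each records its destruction in a shared log.
struct FakeContext {
    explicit FakeContext(std::vector<std::string> &l) : log(&l) {}
    ~FakeContext() { log->push_back("context"); }
    std::vector<std::string> *log;
};

struct FakeModule {
    explicit FakeModule(std::vector<std::string> &l) : log(&l) {}
    ~FakeModule() { log->push_back("module"); }
    std::vector<std::string> *log;
};

// Members are destroyed in reverse declaration order: `module` goes
// first and `context` last, so the context is still alive while the
// module's destructor runs.
struct FakeCodegenResult {
    std::unique_ptr<FakeContext> context;  // declared first, destroyed last
    std::unique_ptr<FakeModule> module;    // declared second, destroyed first
};

// Builds a result, lets it go out of scope, and returns the destruction order.
std::vector<std::string> destructionOrder() {
    std::vector<std::string> log;
    {
        FakeCodegenResult result;
        result.context = std::make_unique<FakeContext>(log);
        result.module = std::make_unique<FakeModule>(log);
    }  // result is destroyed here
    return log;
}
```

Swapping the two member declarations would reverse the log, which is the ordering that leaves a Module pointing into a context that is already gone.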
A common pitfall
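Forward-declaring llvm::Module in a header and putting
unique_ptr<Module> in a struct breaks as soon as the struct's implicit
destructor is instantiated, because unique_ptr's deleter needs the
complete type at that point. Either include <llvm/IR/Module.h> in the
header, or declare the destructor in the header and define it in a .cpp
file that includes <llvm/IR/Module.h>. We chose the include.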
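For comparison, the alternative we did not choose looks roughly like this. It is a single-file sketch with a stand-in Module type instead of llvm::Module; Holder and moduleId are hypothetical names, and the comments mark which part would sit in the header and which in the .cpp file.

```cpp
#include <memory>

// --- what would live in the header ---
struct Module;  // forward declaration only; stand-in for llvm::Module

struct Holder {
    Holder();   // also out of line: a constructor may need to destroy members
    ~Holder();  // declared here, defined where Module is complete
    std::unique_ptr<Module> module;
};

// --- what would live in the .cpp file ---
struct Module {
    int id = 7;
};

Holder::Holder() : module(std::make_unique<Module>()) {}
Holder::~Holder() = default;  // unique_ptr's deleter is instantiated here

// Small accessor so the example has something observable.
int moduleId(const Holder &h) { return h.module->id; }
```

The out-of-line destructor keeps the heavy include out of the header, at the cost of a user-declared destructor, which suppresses the implicit move operations unless they are declared as well.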