Step 7 — Runtime ABI: printf and main
Why printf?
MiniLang has a built-in print. To execute the emitted module, some
piece of code outside it has to do the formatting and the actual
system call. The cheapest option is to delegate to libc's printf,
which lli, llc + ld, and any host linker make trivially
available.
@.fmt = private constant [6 x i8] c"%lld\0A\00"
declare i32 @printf(i8*, ...)
- @.fmt is a private (module-local) constant holding the bytes %lld\n\0. The [6 x i8] array type makes the length explicit; the null terminator at the end matches C string convention.
- declare i32 @printf(i8*, ...) introduces an external function declaration. The variadic ... is part of the type signature.
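The array length and the escaped bytes have to agree, so it is safer to compute both from the same byte vector than to hard-code them. Below is a minimal sketch, assuming the compiler is written in Rust (suggested by Op::Return later in this step) and emits textual IR; the helper names escape_llvm_bytes and emit_cstring_global are hypothetical and not part of the existing code.

```rust
/// Escape raw bytes for an LLVM `c"..."` string constant: printable ASCII
/// is kept as-is, everything else (plus `"` and `\`) becomes a backslash
/// followed by two uppercase hex digits.
fn escape_llvm_bytes(bytes: &[u8]) -> String {
    let mut out = String::new();
    for &b in bytes {
        if (0x20..0x7f).contains(&b) && b != b'"' && b != b'\\' {
            out.push(b as char);
        } else {
            out.push_str(&format!("\\{:02X}", b));
        }
    }
    out
}

/// Emit a private constant for a C string (hypothetical helper).
/// The NUL terminator is appended here, and the array length is taken
/// from the final byte count so the two can never drift apart.
fn emit_cstring_global(name: &str, text: &str) -> String {
    let mut bytes = text.as_bytes().to_vec();
    bytes.push(0); // C string terminator
    format!(
        "@{} = private constant [{} x i8] c\"{}\"",
        name,
        bytes.len(),
        escape_llvm_bytes(&bytes)
    )
}
```

Calling emit_cstring_global(".fmt", "%lld\n") reproduces the declaration shown above: four format characters, one newline, and one terminator give the length 6.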
The call site
%v_ = call i32 (i8*, ...) @printf(
    i8* getelementptr ([6 x i8], [6 x i8]* @.fmt, i64 0, i64 0),
    i64 %v)
Three things here are non-obvious:
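- call i32 (i8*, ...) @printf: the callee's full function type, i32 (i8*, ...), is spelled out where a non-variadic call would give only the return type. This is required only for variadic callees. Non-variadic calls can omit it: call i32 @add(...).
- The GEP: @.fmt is a [6 x i8]*, but printf wants an i8*. getelementptr with two zero indices computes "address of the first byte": the first zero steps through the pointer, the second selects element 0 of the array. This is the canonical "decay an array to a pointer" pattern in LLVM.
- The discarded return value: we assign it to %v_ even though we never use it. LLVM does not require this. An unassigned non-void call is valid IR, but it implicitly takes the next numbered temporary (%0, %1, ...), and any explicitly numbered values that follow must continue that sequence. Giving the result a throwaway name keeps the emitter's naming independent of that numbering. (Calls to void-returning functions are the opposite case: naming their result is forbidden.)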
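The three points above can be folded into one emitter method. This is a sketch under assumptions rather than the project's actual lowering code: the Emitter struct, its fields, and the %tN naming scheme are hypothetical (the text above uses %v_ for the same purpose), and it assumes the [6 x i8] @.fmt global has already been emitted.

```rust
/// Hypothetical emitter state: a counter for throwaway names plus the
/// instructions emitted so far for the current function body.
struct Emitter {
    next_tmp: usize,
    lines: Vec<String>,
}

impl Emitter {
    fn new() -> Self {
        Emitter { next_tmp: 0, lines: Vec::new() }
    }

    /// Return a fresh named temporary: `%t0`, `%t1`, ...
    fn fresh(&mut self) -> String {
        let name = format!("%t{}", self.next_tmp);
        self.next_tmp += 1;
        name
    }

    /// Emit `printf(@.fmt, value)` for an i64 operand, which may be a
    /// register such as `%v` or an integer literal such as `7`.
    fn print_i64(&mut self, value: &str) {
        // Receives printf's i32 result, which is never read again.
        let tmp = self.fresh();
        self.lines.push(format!(
            "{} = call i32 (i8*, ...) @printf(\
             i8* getelementptr ([6 x i8], [6 x i8]* @.fmt, i64 0, i64 0), \
             i64 {})",
            tmp, value
        ));
    }
}
```

Because every call draws its result name from fresh(), two consecutive prints never collide on the same name, and the emitter does not have to track LLVM's implicit numbering.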
main
LLVM doesn't define what main is — that's a libc/runtime
convention. We adopt the C convention:
define i32 @main() {
...
ret i32 0
}
@main returns i32 (the process exit status). When we lower our
__script__ function, we use define i32 @main() and the final
Op::Return becomes ret i32 0.
lli looks for @main to start execution. With llc plus the system
linker, we get an executable whose entry point _start runs libc
initialisation and then calls main.
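Wrapping the lowered script body in this entry point is mostly string assembly. The sketch below assumes the body has already been lowered to one IR instruction per string; emit_main is a hypothetical name, and the two-space indentation is only a readability choice.

```rust
/// Wrap already-lowered body instructions in the C-style entry point.
/// The trailing `ret i32 0` corresponds to the final Op::Return of the
/// script and reports exit status 0 to the host.
fn emit_main(body: &[String]) -> String {
    let mut out = String::from("define i32 @main() {\n");
    for line in body {
        out.push_str("  ");
        out.push_str(line);
        out.push('\n');
    }
    out.push_str("  ret i32 0\n}\n");
    out
}
```

A complete module is then the @.fmt constant, the printf declaration, and the output of emit_main concatenated in that order, which is the form lli or llc expects as input.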
What's missing
We have no:
- String literals at runtime: no allocation, no managed string. cp-14 introduces a runtime with ml_string_new, ml_print_value, ml_value_t.
- Closures: function values that capture environment. cp-12 introduces them as part of the JIT capstone.
- GC: every allocation in cp-10/11 is leaked or stack-only. cp-14 sketches a mark-sweep collector with a shadow stack.
- Exception model: no invoke, no landingpad.
- TLS, threading primitives, atomics: out of scope.
ABI considerations for cp-11
When cp-11 actually links against LLVM's C++ API, the ABI surface expands:
- Calling conventions: ccc (default), fastcc, coldcc, swiftcc, tailcc, custom numbered ccs. We use ccc (C calling convention) because we link with libc.
- Attributes: noinline, readonly, nounwind, cold, optsize, ... These affect optimisation decisions.
- Target attributes: target-cpu, target-features. The difference between scalar codegen and AVX-512 vectorised codegen.
All of these become accessible through the C++ API as we move to real LLVM integration in cp-11.