Step 7 — Runtime ABI: printf and main

Why printf?

MiniLang has a built-in print. To execute the emitted module, some piece of code outside it has to do the formatting and the actual system call. The cheapest option is to delegate to libc's printf, which lli, llc + ld, and any host linker make trivially available.

@.fmt = private constant [6 x i8] c"%lld\0A\00"
declare i32 @printf(i8*, ...)
  • @.fmt is a private (module-local) constant holding the bytes %lld\n\0. The [6 x i8] array type makes the length explicit; the null terminator at the end matches C string convention.
  • declare i32 @printf(i8*, ...) introduces an external function declaration. The variadic ... is part of the type signature.
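The length in [6 x i8] has to match the byte count of the literal exactly, terminator included, so it is easier to compute than to hand-count. A minimal sketch of how an emitter might derive the global from a host-language string; the helper names llvm_c_string and emit_format_global are hypothetical and not part of MiniLang:

```python
def llvm_c_string(text: str) -> tuple[int, str]:
    """Return (byte_count, literal) for a NUL-terminated LLVM c"..." constant."""
    data = text.encode("utf-8") + b"\x00"  # C strings end with a NUL byte
    parts = []
    for byte in data:
        ch = chr(byte)
        # Printable ASCII passes through, except the quote and the backslash,
        # which the c"..." syntax would misread; everything else becomes \XX hex.
        if 0x20 <= byte < 0x7F and ch not in '"\\':
            parts.append(ch)
        else:
            parts.append(f"\\{byte:02X}")
    return len(data), 'c"' + "".join(parts) + '"'


def emit_format_global(name: str, text: str) -> str:
    """Emit a private constant global holding `text` as a C string."""
    n, literal = llvm_c_string(text)
    return f"@{name} = private constant [{n} x i8] {literal}"


print(emit_format_global(".fmt", "%lld\n"))
# prints: @.fmt = private constant [6 x i8] c"%lld\0A\00"
```

The six bytes are %, l, l, d, the newline (0A) and the terminator (00), which is where the 6 in the array type comes from.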

The call site

%v_ = call i32 (i8*, ...) @printf
       (i8* getelementptr ([6 x i8], [6 x i8]* @.fmt, i64 0, i64 0),
        i64 %v)

Three things here are non-obvious:

  1. call i32 (i8*, ...) @printf: the full function type i32 (i8*, ...) is written where a plain return type would otherwise go. Spelling out the function type is required only for variadic callees. Non-variadic calls can give just the return type: call i32 @add(...).

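  2. The GEP: @.fmt is a [6 x i8]*, but printf wants an i8*. getelementptr with two zero indices computes the address of the first byte: the first index steps through the pointer itself, the second selects element 0 of the array. This is the canonical "decay an array to a pointer" pattern in typed-pointer LLVM IR. (With opaque pointers, the default from LLVM 15, the same argument is written ptr @.fmt and no GEP is needed.)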

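  3. The discarded return value: we assign it to %v_ even though we never use it. LLVM does not require this; a bare call i32 (i8*, ...) @printf(...) is legal, and its result silently takes the next implicit temporary number (%0, %1, ...). Giving it an explicit name keeps that implicit numbering from shifting under any numbered temporaries the emitter writes later. (Calls to void-returning functions are the opposite case: they produce no value, so naming them is forbidden.)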
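The three points above can be folded into one small emitter routine. This is a sketch only, assuming the typed-pointer syntax used in this step and a format global named @.fmt; emit_print_call is a hypothetical helper name:

```python
def emit_print_call(value_reg: str, result_reg: str, fmt_len: int = 6) -> str:
    """Emit the printf call that prints one i64 held in `value_reg`.

    `result_reg` names the discarded i32 result; the caller is assumed to
    pass a name that is unique within the enclosing function.
    """
    fmt_ty = f"[{fmt_len} x i8]"
    # GEP with indices (0, 0): step through the pointer, then pick element 0.
    fmt_ptr = f"i8* getelementptr ({fmt_ty}, {fmt_ty}* @.fmt, i64 0, i64 0)"
    # The full function type is spelled out because printf is variadic.
    return f"{result_reg} = call i32 (i8*, ...) @printf({fmt_ptr}, i64 {value_reg})"


print(emit_print_call("%v", "%v_"))
```

Taking the result name as a parameter leaves the uniqueness policy to the caller, for example a per-function counter.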

main

LLVM doesn't define what main is — that's a libc/runtime convention. We adopt the C convention:

define i32 @main() {
  ...
  ret i32 0
}

@main returns i32 (the process exit status). When we lower our __script__ function, we use define i32 @main() and the final Op::Return becomes ret i32 0.

lli looks up @main and starts execution there. With llc plus the system linker, the result is a native executable: the OS loader jumps to _start, which runs libc initialisation and then calls main.
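Putting the format global, the printf declaration and the main wrapper together gives a complete module. A minimal sketch of that assembly step, assuming the lowered body arrives as a list of instruction strings; emit_module and PREAMBLE are hypothetical names:

```python
PREAMBLE = [
    '@.fmt = private constant [6 x i8] c"%lld\\0A\\00"',
    "declare i32 @printf(i8*, ...)",
]


def emit_module(body: list[str]) -> str:
    """Wrap already-lowered instructions in the C-style main convention."""
    lines = list(PREAMBLE)
    lines.append("")
    lines.append("define i32 @main() {")
    lines.extend("  " + line for line in body)
    lines.append("  ret i32 0")  # exit status 0, as for the final Op::Return
    lines.append("}")
    return "\n".join(lines) + "\n"


module = emit_module([
    "%v = add i64 40, 2",
    "%v_ = call i32 (i8*, ...) @printf("
    "i8* getelementptr ([6 x i8], [6 x i8]* @.fmt, i64 0, i64 0), i64 %v)",
])
print(module)
```

Assuming the text is saved as out.ll (a hypothetical file name) and the installed LLVM still accepts typed pointers such as i8*, running lli out.ll should print 42.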

What's missing

We have no:

  • String literals at runtime — no allocation, no managed string. cp-14 introduces a runtime with ml_string_new, ml_print_value, ml_value_t.
  • Closures — function values that capture environment. cp-12 introduces them as part of the JIT capstone.
  • GC — every allocation in cp-10/11 is leaked or stack-only. cp-14 sketches a mark-sweep collector with a shadow stack.
  • Exception model — no invoke, no landingpad.
  • TLS, threading primitives, atomics — out of scope.

ABI considerations for cp-11

When cp-11 actually links against LLVM's C++ API, the ABI surface expands:

  • Calling conventions: ccc (default), fastcc, coldcc, swiftcc, tailcc, and custom numbered ccs. We use ccc (the C calling convention) because we link with libc.
  • Attributes: noinline, readonly, nounwind, cold, optsize, ... These affect optimisation decisions.
  • Target attributes: target-cpu, target-features. These decide, for example, whether the backend may emit AVX-512 vectorised code or has to stay scalar.
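In textual IR these settings appear on the define line and in attribute groups, which gives a feel for what the C++ API will later set programmatically. The sketch below is illustrative only; emit_define and emit_attribute_group are hypothetical helpers, and the attribute values are example choices rather than anything cp-11 prescribes:

```python
from typing import Optional


def emit_define(name: str, params: str, ret: str = "i64", cc: str = "ccc",
                fn_attrs: tuple = (), group: Optional[int] = None) -> str:
    """Emit the opening line of a function definition."""
    # ccc is the default and is normally left implicit in textual IR.
    cc_part = "" if cc == "ccc" else cc + " "
    tail = list(fn_attrs)
    if group is not None:
        tail.append(f"#{group}")  # reference to an attribute group
    suffix = (" " + " ".join(tail)) if tail else ""
    return f"define {cc_part}{ret} @{name}({params}){suffix} {{"


def emit_attribute_group(group: int, target_attrs: dict) -> str:
    """Emit an attribute group holding string attributes such as target-cpu."""
    pairs = " ".join(f'"{k}"="{v}"' for k, v in target_attrs.items())
    return f"attributes #{group} = {{ {pairs} }}"


print(emit_define("add", "i64 %a, i64 %b", cc="fastcc",
                  fn_attrs=("nounwind",), group=0))
print(emit_attribute_group(0, {"target-cpu": "x86-64",
                               "target-features": "+avx2"}))
# prints:
# define fastcc i64 @add(i64 %a, i64 %b) nounwind #0 {
# attributes #0 = { "target-cpu"="x86-64" "target-features"="+avx2" }
```

A function defined with a non-default convention must be called with the same one (call fastcc i64 @add(...)); a mismatch between caller and callee is undefined behaviour.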

All of these become accessible through the C++ API as we move to real LLVM integration in cp-11.