Step 02 · Value representation
Our encoding (defined in value.hpp):
bit 63 ........................... 3 2 1 0
[ 63-bit signed int ]                  1   → fixnum
[ 0 ]                          [ 0 1 0 ]   → nil
[ 0 ]                          [ 1 1 0 ]   → true
[ 0 ]                        [ 1 0 1 0 ]   → false
[ pointer to Object, aligned to 8 ] 0 0 0  → heap object
How tests look
bool isFixnum() const { return raw & 1; }
bool isObject() const { return (raw & 0b111) == 0 && raw != 0; }
Each is a single AND plus a compare (isObject adds one null check), so both stay cheap and branch-prediction friendly.
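The same pattern extends to the three singletons. The sketch below is one way the tests could look, assuming nil, true and false are stored as whole-word constants with all upper bits zero, as the diagram suggests. The names kNil, kTrue, kFalse, isNil, isTrue, isFalse and assertSingleKind are hypothetical, not taken from value.hpp.

```cpp
#include <cassert>
#include <cstdint>

// Sketch only: constant and method names other than isFixnum/isObject
// are assumptions for illustration.
struct Value {
    std::uint64_t raw;

    // Whole-word constants: upper bits zero, tag in the low bits.
    static constexpr std::uint64_t kNil   = 0b0010;
    static constexpr std::uint64_t kTrue  = 0b0110;
    static constexpr std::uint64_t kFalse = 0b1010;

    bool isFixnum() const { return raw & 1; }
    bool isObject() const { return (raw & 0b111) == 0 && raw != 0; }

    // Singletons compare the whole word, so false (0b1010) is never
    // mistaken for nil (0b010) even though their low three bits match.
    bool isNil()   const { return raw == kNil; }
    bool isTrue()  const { return raw == kTrue; }
    bool isFalse() const { return raw == kFalse; }
};

// Debug helper: a word should never satisfy two predicates at once.
inline void assertSingleKind(Value v) {
    int kinds = v.isFixnum() + v.isObject() + v.isNil() + v.isTrue() + v.isFalse();
    assert(kinds <= 1);
}
```

Comparing the full word rather than masking three bits is what keeps the 4-bit false tag distinct from nil.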
Encoding fixnums
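static Value Fixnum(int64_t v) {
    // Shift as unsigned: left-shifting a negative int64_t is undefined before C++20.
    return {(static_cast<uint64_t>(v) << 1) | 1};
}
int64_t asFixnum() const { return static_cast<int64_t>(raw) >> 1; }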
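We lose one bit of range. For MiniLang's scripting niche, the
remaining 63-bit range (−2⁶² to 2⁶²−1) is plenty. Languages that need
full 64-bit integers typically box them (OCaml's Int64.t; CPython
boxes every integer). NaN-boxing (described below) makes a different
trade: it keeps doubles unboxed, but integers beyond 2⁵³ still need a
box.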
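To make the lost bit concrete, here is a small standalone sketch of the round trip with an explicit range check. The free functions encodeFixnum, decodeFixnum and fitsFixnum and the constants kFixnumMin and kFixnumMax are hypothetical names for illustration; only the shift-and-tag scheme comes from the text above.

```cpp
#include <cassert>
#include <cstdint>

// 63-bit signed range: -2^62 .. 2^62 - 1.
constexpr std::int64_t kFixnumMax = (std::int64_t{1} << 62) - 1;
constexpr std::int64_t kFixnumMin = -(std::int64_t{1} << 62);

constexpr bool fitsFixnum(std::int64_t v) {
    return v >= kFixnumMin && v <= kFixnumMax;
}

inline std::uint64_t encodeFixnum(std::int64_t v) {
    assert(fitsFixnum(v));  // out-of-range values must be boxed or rejected
    // Shift in unsigned arithmetic so negative inputs are well defined.
    return (static_cast<std::uint64_t>(v) << 1) | 1;
}

inline std::int64_t decodeFixnum(std::uint64_t raw) {
    // Arithmetic right shift drops the tag and restores the sign
    // (guaranteed since C++20; mainstream compilers did this earlier too).
    return static_cast<std::int64_t>(raw) >> 1;
}
```

A caller would check fitsFixnum before encoding, for example after an addition that might overflow the 63-bit range.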
NaN-boxing — the alternative
A quiet NaN in an IEEE-754 double leaves 51 mantissa bits free as payload (52 if you also use the sign bit), enough to encode a pointer plus a small tag. SpiderMonkey, JavaScriptCore (JSC), and LuaJIT use variants of this trick:
double: [sign 1][exponent 11][mantissa 52]
A double is a quiet NaN iff exponent = all 1s AND mantissa MSB = 1.
The 51 bits below the quiet bit are free for a payload, for example a 48-bit pointer plus 3 tag bits.
Pros: full IEEE doubles fly without boxing, which is huge for numeric code.
Cons: the bit-twiddling is finicky, hostile to debuggers, and doesn't play nicely with sanitisers.
Our scheme (low-bit tagging) is simpler and integer-friendly; we'd swap to NaN-boxing only if floats became a major workload.
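For comparison, a minimal NaN-boxing sketch might look like the following. The layout (3 tag bits in bits 50..48, a 48-bit payload below them, tag 0 reserved for a genuine NaN) and all names are assumptions chosen for illustration; real engines each use their own variant.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr std::uint64_t kQNaN        = 0x7FF8000000000000ULL; // exponent all 1s + quiet bit
constexpr std::uint64_t kBoxMask     = 0xFFF8000000000000ULL; // sign + exponent + quiet bit
constexpr unsigned      kTagShift    = 48;
constexpr std::uint64_t kPayloadMask = (std::uint64_t{1} << 48) - 1;

struct Boxed { std::uint64_t bits; };

inline Boxed boxDouble(double d) {
    if (d != d) return {kQNaN};            // canonicalise every NaN to tag 0
    std::uint64_t b;
    std::memcpy(&b, &d, sizeof b);         // bit copy without aliasing issues
    return {b};
}

inline Boxed boxTagged(unsigned tag, std::uint64_t payload) {
    assert(tag >= 1 && tag <= 7);          // tag 0 is reserved for a real NaN
    assert((payload & ~kPayloadMask) == 0);
    return {kQNaN | (static_cast<std::uint64_t>(tag) << kTagShift) | payload};
}

inline unsigned tagOf(Boxed v) {
    return static_cast<unsigned>((v.bits >> kTagShift) & 7);
}

// Anything that is not a positive quiet NaN with a non-zero tag is a double.
inline bool isDouble(Boxed v) {
    return (v.bits & kBoxMask) != kQNaN || tagOf(v) == 0;
}

inline std::uint64_t payloadOf(Boxed v) { return v.bits & kPayloadMask; }

inline double asDouble(Boxed v) {
    double d;
    std::memcpy(&d, &v.bits, sizeof d);
    return d;
}
```

Even this small version needs NaN canonicalisation and a reserved tag, which hints at why the text calls the approach finicky.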
Why aligned-to-8 pointers
std::malloc returns blocks aligned to at least 8 bytes on every
mainstream platform (typically 16 on 64-bit targets). We additionally
round our allocation sizes up to a multiple of 8 so the next object
also lands aligned. The low 3 bits of any Object* we hand out are
therefore guaranteed zero, so it is safe to overlay tags on them.
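The rounding and the alignment guarantee can be sketched as below. roundUp8 and allocObject are hypothetical helpers for illustration, and the sketch assumes a plain std::malloc backend; MiniLang's actual allocator may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Round n up to the next multiple of 8: add 7, then clear the low three bits.
constexpr std::size_t roundUp8(std::size_t n) {
    return (n + 7) & ~static_cast<std::size_t>(7);
}

// Hypothetical allocation wrapper that checks the tagging precondition.
inline void* allocObject(std::size_t bytes) {
    void* p = std::malloc(roundUp8(bytes));
    // The tagging scheme relies on the low 3 bits of the address being zero.
    assert((reinterpret_cast<std::uintptr_t>(p) & 0b111) == 0);
    return p;
}
```

Keeping the assert in debug builds turns a silent tag corruption into an immediate failure if the code is ever ported to an allocator with weaker alignment.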