Chapter 20 — Crucible: The Memory Fuzzer
Chapter 18 showed you a machine that hunts for a logic
bug: --emit=check invents inputs to a function until one of its contracts turns false. Crucible
is the twin of that idea, pointed at memory. Instead of inputs to one function it invents whole
programs — entire .ig files — and runs each one through a battery of detectors that watch
what the runtime does with memory: whether a value is freed twice, leaks, is read after it has
been freed, or quietly comes back wrong. Logic correctness and memory correctness are different
problems, so Ingle hunts them with two different tools.
Crucible is a build-time developer tool. It lives in tools/ (a generator, crucible.c, and a
driver, crucible.sh), it is never shipped inside inglec, and you run the whole thing with one
command:
make crucible
That builds everything it needs and sweeps a default of 150 seeds; when you want a shorter or
longer run, call the driver directly with a count — tools/crucible.sh 120. Either way, a green
run ends like this and exits 0:
crucible: 120 seeds → 120 clean, 0 distinct (0 NEW).
crucible: ✓ no new memory faults.
A run that turns up something new exits 1 and tells you exactly where the evidence is. That is
the entire contract: a clean exit means the language could not be made to mishandle memory
across the combinations Crucible knows how to build; a non-zero exit hands you a minimal program
that proves otherwise.
Why a fuzzer, and not just more tests
Every memory bug Ingle has ever had lived at a combination of features that, taken one at a
time, all worked: a value-struct double-freed when it was borrowed into a multi-slot parameter; a
string interpolation that leaked its intermediate pieces; a field assignment through an array
index; a value-struct double-freed when shared through an erased generic. Nobody sits down and
writes a test for “a struct with a string field, stored as the value of a Map, read back
inside a loop, and then interpolated” — because nobody thinks of it. Each of those bugs was
found the hard way, reactively, when a real program happened to walk into the combination and
crashed on the way out.
Crucible’s job is to walk into those combinations first, by the hundred, automatically. The
principle the source states for itself is “no knowledge lost”: every shape that has ever
bitten the language is inside the space the generator samples, so the same class of bug cannot
come back unseen. On its first run it surfaced a bug where a Map whose value is an array handed
back a corrupted, empty view — a combination no hand-written test had thought to try.
Why this needs its own tool. Property-based testing targets your logic, the way
--emit=check does. Crucible targets the runtime memory model itself: it pits two backends against each other and watches the allocator for double-frees and leaks. Ingle frees memory deterministically from tracked ownership (Chapter 13), and that deterministic freedom is exactly the thing that needs a fuzzer pointed at it.
The generator: one seed, one program
The generator takes a seed and prints one valid Ingle program to standard output. Same seed, same program, every time — so every finding is perfectly reproducible. You can watch it work:
build/crucible 1 30
The 1 is the seed; the 30 is a loop trip-count the driver later scales up and down for the
leak check. Each program is built from self-contained operations — one function per danger
pattern — and every one of them folds every value it touches into a running acc:
fn op0() -> int {
var acc = 0
var m = map.Map<string, S>{ buckets: [], count: 0 }
m.set("k0", S { a: 98, s: "z" })
m.set("k1", S { a: 53, s: "q" })
m.set("k0", S { a: 31, s: "ab" })
match m.get("k0") { case Some(v) {
acc = acc + v.a + v.s.len()
} case None {} }
match m.get("k1") { case Some(v) {
acc = acc + v.a + v.s.len()
} case None {} }
return acc
}
That one stores a struct as a Map value, overwrites a key, reads two keys back, and folds the
recovered a field and the heap string’s length into acc — so the heap leaf is exercised, not
just the scalar. main then sums every opN() and prints the total. That total is the keystone
of the whole design: it is a checksum of every value the program touched. If a value is
silently dropped, duplicated, or read back wrong, the number changes — which is how Crucible
catches wrong answers, not merely crashes. Run the seed-1 program straight through the VM:
inglec --emit=run seed1.ig
3722=> 0
The 3722 is the checksum main printed; => 0 is the value it returned. (print adds no
newline, so the two land on the same line.)
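To see why a running total catches wrong answers and not just crashes, here is a small Python model of the fold that op0 performs. It is only an illustration: the function name fold and the tuple layout are invented for this sketch, and nothing here is Crucible's or Ingle's actual code. The numbers are the two values op0 reads back after the overwrite.

```python
# Illustrative model of the acc fold; not Ingle's runtime or Crucible's code.

def fold(records):
    """Fold each record's scalar field and heap-string length into one total."""
    acc = 0
    for a, s in records:
        acc += a + len(s)
    return acc

# What op0 reads back: k0 was overwritten with (31, "ab"), k1 is (53, "q").
good = [(31, "ab"), (53, "q")]
dropped = [(31, "ab")]              # one value silently lost
corrupted = [(31, ""), (53, "q")]   # heap string came back empty

print(fold(good), fold(dropped), fold(corrupted))  # 87 33 85
```

Every kind of damage moves the total, including the one that only touches the heap string. That is why each operation folds the string length in alongside the scalar.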
What does the generator deliberately reach for? The places memory bugs live:
- Struct shapes: all-scalar, with a heap string field, with a nested struct, or with both.
- Erased generic containers: those structs placed into [T], Map<string, T>, Map<string, [T]>, and Option<T>, including nested combinations, built and read back through generic helpers like fn c_pair<T>(a: T, b: T) -> [T] and fn c_keep<T>(move x: T) -> T.
- Movement: passed by move and by borrow, returned, read back, mutated through an array index (arr[i].field = …), and interpolated ("row{i}-x{i}"), inside loops and across reassignment.
That list is not arbitrary. It is the union of every feature-combination that has ever produced a memory bug in Ingle.
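The "same seed, same program" property is easy to model. The sketch below is a Python stand-in, not the real generator (which is C and prints Ingle source). The three tables are taken from the list above, and the plan function and its tuple output are assumptions made for illustration.

```python
import random

# Feature axes, copied from the list above. The tuple "plan" is a hypothetical
# stand-in for the Ingle source the real generator prints.
SHAPES = ["scalar", "string-field", "nested", "string+nested"]
CONTAINERS = ["[T]", "Map<string, T>", "Map<string, [T]>", "Option<T>"]
MOVES = ["move", "borrow", "return", "index-mutate", "interpolate"]

def plan(seed, n_ops=3):
    """Pick one (shape, container, movement) combination per operation."""
    rng = random.Random(seed)  # private RNG: the seed is the only input
    return [
        (rng.choice(SHAPES), rng.choice(CONTAINERS), rng.choice(MOVES))
        for _ in range(n_ops)
    ]

# Same seed, same plan, every time.
assert plan(1) == plan(1)
```

The design point is the private random.Random(seed) instance. Because no global state or clock feeds the choices, a finding can be reproduced from its seed alone.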
The five oracles
The driver runs each generated program through five independent detectors. In fuzzing these are called oracles, because each one knows how to recognise a particular kind of wrong. The driver builds the variant compilers it needs (a drop-trace build, an AddressSanitizer build) on demand the first time you run it.
| Oracle | What it catches | How |
|---|---|---|
| Double-drop detector | A value freed twice | A compiler built with -DEMBER_DROP_TRACE stamps a sentinel after each reclaim and aborts if it ever frees that object again |
| VM fault | A runtime fault (bad index, etc.) | Runs under the VM and looks for runtime error: … |
| ASan | Use-after-free, buffer overflow, double-free | A compiler built with AddressSanitizer (make asan) |
| RSS leak | Memory that grows super-linearly | Runs the same seed at 50 and 6000 loops and compares peak resident memory |
| VM↔native differential | A wrong answer, or a native-only crash | Compiles the program to a native binary, runs it, and compares output and exit code against the VM |
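The RSS leak row comes down to comparing two measurements. Here is a hedged Python sketch of that comparison. The function name, the kilobyte units, and the 1.5 tolerance are all invented for illustration; the chapter does not give the driver's actual threshold.

```python
def leak_suspected(peak_kb_small, peak_kb_large, max_ratio=1.5):
    """Flag a run whose peak memory grew by more than max_ratio.

    peak_kb_small: peak resident memory of the short run (e.g. 50 loops)
    peak_kb_large: peak resident memory of the long run (e.g. 6000 loops)
    max_ratio:     hypothetical tolerance; not the driver's real rule
    """
    if peak_kb_small <= 0:
        raise ValueError("peak_kb_small must be positive")
    return peak_kb_large / peak_kb_small > max_ratio

print(leak_suspected(10_000, 10_400))   # False: flat despite 120x more loops
print(leak_suspected(10_000, 480_000))  # True: memory followed the loop count
```

If every iteration frees what it allocates, the peak should barely move when the loop count rises, so a large ratio points at memory that is never reclaimed.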
There is a sixth thing it watches for, quietly: if the generator ever emits a program that doesn’t
compile, that is a bug in the generator — an over-reach past what the language allows — and
it’s reported as gen-compile-error so it can never be mistaken for a real finding.
Why five, and why these? Because each one sees something the others can’t. The runtime’s pool
allocator recycles memory instead of calling free, so a plain ASan build can read a
use-after-free as perfectly valid (recycled) memory and miss it entirely — which is precisely why
the double-drop detector exists, stamping a sentinel that survives recycling. The differential is
the only oracle that catches a silently wrong answer: a program that runs cleanly, leaks
nothing, double-frees nothing, and still prints a different checksum on the two backends. The leak
oracle is the only one that catches unbounded growth in a program that is otherwise correct.
Memory safety is not a single property, so it does not get a single detector.
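The differential is the simplest oracle to picture. The Python sketch below compares the two things the table says it compares, output and exit code. RunResult and differential are hypothetical names; only the signature string diff:VM-ne-native comes from this chapter.

```python
from typing import NamedTuple, Optional

class RunResult(NamedTuple):
    stdout: str      # everything the program printed
    exit_code: int   # the process exit status

def differential(vm: RunResult, native: RunResult) -> Optional[str]:
    """Return a finding signature if the two backends disagree, else None."""
    if vm != native:  # tuple comparison covers both output and exit code
        return "diff:VM-ne-native"
    return None

same = differential(RunResult("3722", 0), RunResult("3722", 0))
wrong = differential(RunResult("3722", 0), RunResult("3721", 0))
print(same, wrong)  # None diff:VM-ne-native
```

Neither run has to crash for this to fire. A checksum that is off by one is enough, which is the case no other oracle can see.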
Findings: signatures, shrinking, and minimal repros
When an oracle fires, the driver does three useful things rather than merely shouting.
First, it reduces the failure to a signature — a short string such as vm-fault:runtime error:
array index out of bounds, or double-drop:type_id=3, or diff:VM-ne-native. Distinct
signatures are distinct findings; the same signature seen again is the same bug, reported once.
Second, for each new signature it shrinks the program to a minimal reproducer: it greedily
deletes operations — the + opK() terms in main make this trivial — for as long as the
signature still holds, and saves the result under tools/crucible-finds/. So you are never handed
a hundred-line generated program; you get the smallest one that still fails. Here is a real one, a
differential finding shrunk from its original two operations down to a single guilty loop:
// crucible seed=3 shape=3 ops=2 loops=30 — generated; do not edit.
import "std/map" as map
struct Inner { x: int }
struct S {
a: int
s: string
inner: Inner
}
fn c_pair<T>(a: T, b: T) -> [T] { return [a, b] }
fn c_keep<T>(move x: T) -> T { return x }
fn op1() -> int {
var acc = 0
var i = 0
loop {
if i == 30 { break }
let xs = c_pair(S { a: 65, s: "hello", inner: Inner { x: 31 } }, S { a: 5, s: "longerstring", inner: Inner { x: 18 } })
acc = acc + xs[0].a + xs[1].a
i = i + 1
}
return acc
}
fn main() -> int {
var total = 0
total = total + op1()
print("{total}")
return 0
}
Third, it prints one line per distinct finding — the signature, the seed, and the repro path:
── [NEW] [diff:VM-ne-native] seed=3 → tools/crucible-finds/find2_diff_VM_ne_native.ig (minimal: 1 op)
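The shrinking step is a greedy deletion loop, and a few lines of Python are enough to model it. This is a sketch under assumptions: shrink and signature_of are hypothetical names, and the lambda stands in for running a real program through the oracles.

```python
def shrink(ops, signature_of, target):
    """Greedily delete operations for as long as the signature still holds."""
    kept = list(ops)
    changed = True
    while changed:
        changed = False
        for i in range(len(kept)):
            candidate = kept[:i] + kept[i + 1:]
            if signature_of(candidate) == target:
                kept = candidate   # the failure survives without this op
                changed = True
                break              # restart the scan on the smaller program
    return kept

# Fake oracle for the demo: the failure lives entirely in op1.
def fake_oracle(ops):
    return "diff:VM-ne-native" if "op1" in ops else None

print(shrink(["op0", "op1"], fake_oracle, "diff:VM-ne-native"))  # ['op1']
```

Comparing against the target signature, rather than asking "does it still fail?", matters. Without it, a deletion could swap the original bug for a different one and the shrinker would minimise the wrong failure.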
When the oracle that fired was the double-drop detector, the underlying evidence — written by the
-DEMBER_DROP_TRACE runtime — has this shape, and the sentinel 0x5EAD5EAD is the giveaway:
*** EMBER DOUBLE-DROP obj=<addr> type=<n> ***
STRUCT type_id=<n> field_count=<n>
first drop site: <addr>
It prints both drop sites — this one and the one it stashed the first time — because for a double-free the question is never “where was it freed?” but “where were the two places that each thought they owned it?”
The baseline: failing only on what’s new
A bug you have already filed shouldn’t paint every future run red. tools/crucible-known.txt is
the baseline: one signature per line, with # comments allowed. A signature listed there is
reported as [known] and does not fail the run; only a signature that is not listed counts as
NEW and flips the exit code to 1. The discipline, stated in the file’s own header, is to
remove a line only when the bug is actually fixed and a clean Crucible run confirms it. As of
this writing the file is empty: the two erased-generic double-free bugs that once lived there are
fixed on both backends, and Crucible is green.
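The baseline rule fits in a few lines. The Python sketch below models it with hypothetical helpers load_baseline and exit_code. It assumes a comment is a whole line starting with #, which the chapter does not spell out.

```python
def load_baseline(text):
    """Parse a baseline file: one signature per line, '#' lines are comments."""
    known = set()
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # assumption: comments occupy a whole line
        known.add(stripped)
    return known

def exit_code(found, known):
    """Return 1 only if some signature is not in the baseline, else 0."""
    new = [sig for sig in found if sig not in known]
    return 1 if new else 0

baseline = load_baseline("# filed, not yet fixed\ndouble-drop:type_id=3\n")
print(exit_code(["double-drop:type_id=3"], baseline))                       # 0
print(exit_code(["double-drop:type_id=3", "diff:VM-ne-native"], baseline))  # 1
```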
Working a finding
When make crucible hands you a NEW finding, the loop is short:
- Open the minimal repro under tools/crucible-finds/. It is a real .ig file, already as small as the tool could make it.
- Reproduce it by hand under the oracle that fired, so you can watch it happen. For a memory fault, reach for the tape: build the ASan + drop-trace compiler and run the repro under it:
  make asan-trace
  ASAN_OPTIONS=detect_leaks=0 build/inglec-trace --emit=run tools/crucible-finds/find2_diff_VM_ne_native.ig
  For a diff:VM-ne-native, run both backends and compare the answers directly:
  inglec --emit=run repro.ig # the reference answer (the VM)
  inglec -o /tmp/repro repro.ig && /tmp/repro # the native answer
- Fix the bug, then add a regression test under tests/ so it stays fixed; a fix without a test that exercises it isn’t finished.
- Confirm green. Remove the signature from crucible-known.txt if it was baselined, and run make crucible again until it reports 0 NEW.
That is the same discipline the rest of Part IV asks for — reproduce on the smallest possible evidence, fix, lock it in with a test — applied to the one category of correctness a contract cannot describe: what the runtime does with your memory.
Fireside trivia. The double-drop detector’s sentinel is the 32-bit value
0x5EAD5EAD. Squint at it: 5EAD5EAD reads as “dead dead”: a value freed once is dead, and a value freed twice is dead twice. The pool never touches an object’s refcount field after reclaiming it, and a genuine re-allocation clears it, so if the runtime is about to free an object and finds 0x5EAD5EAD still sitting in that field, it knows for certain it’s staring at the same corpse a second time. A good sentinel is a number that could never arise by accident and that tells you what it means the moment you read it in a debugger at two in the morning.
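The sentinel trick is small enough to model directly. The Python sketch below imitates the rules just described: stamp on reclaim, clear on re-allocation, complain if the stamp is still there at the next free. The Pool and Obj classes are invented for illustration, and a RuntimeError stands in for the abort; the real detector lives in the runtime built with -DEMBER_DROP_TRACE.

```python
SENTINEL = 0x5EAD5EAD  # the value named in the text

class Obj:
    def __init__(self):
        self.refcount = 1

class Pool:
    """Toy recycling pool: freed objects are reused, never handed to free."""
    def __init__(self):
        self.free_list = []

    def alloc(self):
        obj = self.free_list.pop() if self.free_list else Obj()
        obj.refcount = 1          # a genuine re-allocation clears the stamp
        return obj

    def drop(self, obj):
        if obj.refcount == SENTINEL:
            raise RuntimeError("EMBER DOUBLE-DROP")  # stand-in for the abort
        obj.refcount = SENTINEL   # stamp on reclaim; the pool leaves it alone
        self.free_list.append(obj)

pool = Pool()
a = pool.alloc()
pool.drop(a)          # first drop: fine, object is stamped
try:
    pool.drop(a)      # second drop: the stamp is still there
except RuntimeError as err:
    print(err)        # EMBER DOUBLE-DROP
```

Because the stamp sits in the object itself rather than in the allocator's bookkeeping, it survives the recycling that hides the same bug from a plain ASan build.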