Getting started

Installation

Runtime requirements: qemu-system-x86_64, bsdtar, curl. Docker is optional — used for Postgres and Redis service layers.

Install

The installer auto-detects your platform, downloads the pre-built binary, verifies the checksum, and falls back to a source build if no binary is available for your platform.

curl -fsSL kepr.uk/kiln/install.sh | sh

Manual build (Zig 0.16.0 required)

curl -fsSL kepr.uk/kiln/releases/latest/source.tar.gz | tar xz
cd kiln
zig build -Doptimize=ReleaseSafe
cp zig-out/bin/kiln /usr/local/bin/kiln

Verify

kiln --version   # → 5.0.0
kiln env         # checks qemu, docker, bsdtar, curl and image cache

Basics

Quick start

Run once

cd ~/projects/myapp
kiln run

Detects project type, runs the appropriate test command, prints pass/fail. Exits 0 if all pass.

Watch until clean

kiln watch

Loops continuously. Fingerprints each failure. Routes environment-dependent ones to a chamber. Writes the rest to kiln/pending/. Exits when clean or after 100 loops.

Full pressure campaign

kiln break

Maps the attack surface, generates 83 deterministic scenarios across 6 layers, runs them inside a QEMU chamber, and writes a findings report.

NixOS project from macOS

kiln chamber fetch nixos-24.11
kiln chamber run

Downloads the NixOS image once (~800 MB) and caches it for later runs, then runs your test suite inside a real VM.

Auto-configuration

Project detection

Kiln detects the project type from marker files in priority order.

Signal	Detected type	Test command
`Cargo.toml` + `src/`	Rust	`cargo nextest run --no-fail-fast --message-format libtest-json-plus`
`go.mod`	Go	`go test ./... -json`
`mix.exs`	Elixir	`mix test`
`build.zig` or any `.zig` in root	Zig	`zig build test`
`pyproject.toml` or `setup.py`	Python	`pytest --tb=short`
`package.json`	Node	from `scripts.test`; prefers jest/vitest `--json`
`build.gradle` / `pom.xml`	JVM	`./gradlew test` or `mvn test -q`
`Makefile` with `test` target	Make	`make test`
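
The priority-ordered lookup above can be sketched in Python. This is an illustration under stated assumptions, not Kiln's implementation: `detect_project` and `RULES` are hypothetical names, the table's row order is assumed to be the priority order, the Node command is left as a placeholder because it comes from `scripts.test`, and the Make check is reduced to looking for a line that starts with `test:`.

```python
from pathlib import Path

def exists(*names):
    """Predicate: every named entry exists under the project root."""
    return lambda root: all((root / n).exists() for n in names)

def zig_project(root):
    return (root / "build.zig").exists() or any(root.glob("*.zig"))

def python_project(root):
    return (root / "pyproject.toml").exists() or (root / "setup.py").exists()

def make_test_target(root):
    makefile = root / "Makefile"
    return makefile.is_file() and any(
        line.startswith("test:") for line in makefile.read_text().splitlines()
    )

# (predicate, detected type, test command), in the table's row order.
# Assumption: row order is the priority order.
RULES = [
    (exists("Cargo.toml", "src"), "Rust",
     "cargo nextest run --no-fail-fast --message-format libtest-json-plus"),
    (exists("go.mod"), "Go", "go test ./... -json"),
    (exists("mix.exs"), "Elixir", "mix test"),
    (zig_project, "Zig", "zig build test"),
    (python_project, "Python", "pytest --tb=short"),
    (exists("package.json"), "Node", "<from scripts.test>"),  # placeholder
    (exists("build.gradle"), "JVM", "./gradlew test"),
    (exists("pom.xml"), "JVM", "mvn test -q"),
    (make_test_target, "Make", "make test"),
]

def detect_project(root):
    """Return (type, command) for the first matching rule, or None."""
    root = Path(root)
    for matches, kind, command in RULES:
        if matches(root):
            return kind, command
    return None
```

Under these assumptions, a directory that contains both `go.mod` and `package.json` resolves to Go, because the Go rule is checked first.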

Environment hint scanning

Kiln walks up to 256 test files to detect service dependencies:

Keyword pattern	Hint	Chamber effect
`psycopg2`, `sqlalchemy`, `lib/pq`	RequiresPostgres	Docker: postgres:16-alpine
`redis`, `Redix`, `go-redis`	RequiresRedis	Docker: redis:7-alpine
`inotify`, `epoll`, `/proc/`	RequiresLinuxKernel	QEMU chamber
`systemctl`, `systemd`	RequiresSystemd	QEMU chamber
`playwright`, `puppeteer`	RequiresBrowser	QEMU chamber
`http.Get`, `requests.get`	RequiresNetwork	Note in `kiln/pending/`
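
A rough Python sketch of the scan described above. The keyword table is copied from the document; everything else is an assumption made for illustration: `scan_hints` is a hypothetical name, matching is plain case-sensitive substring search, and the caller decides which files count as test files.

```python
from pathlib import Path

# Keyword -> hint, copied from the table above.
HINT_KEYWORDS = {
    "psycopg2": "RequiresPostgres",
    "sqlalchemy": "RequiresPostgres",
    "lib/pq": "RequiresPostgres",
    "redis": "RequiresRedis",
    "Redix": "RequiresRedis",
    "go-redis": "RequiresRedis",
    "inotify": "RequiresLinuxKernel",
    "epoll": "RequiresLinuxKernel",
    "/proc/": "RequiresLinuxKernel",
    "systemctl": "RequiresSystemd",
    "systemd": "RequiresSystemd",
    "playwright": "RequiresBrowser",
    "puppeteer": "RequiresBrowser",
    "http.Get": "RequiresNetwork",
    "requests.get": "RequiresNetwork",
}

MAX_FILES = 256  # the cap stated above

def scan_hints(test_files):
    """Return the set of hints found in at most MAX_FILES files."""
    hints = set()
    for path in list(test_files)[:MAX_FILES]:
        text = Path(path).read_text(errors="ignore")
        for keyword, hint in HINT_KEYWORDS.items():
            if keyword in text:
                hints.add(hint)
    return hints
```

The result is a set, so a project whose tests mention both `psycopg2` and `requests.get` yields two hints regardless of how many files mention them.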

Commands

kiln run

Run the test suite once. Detects project type, runs the test command, parses output, prints summary.

Basic usage

kiln run                         # current working directory
kiln run ~/projects/nina         # explicit path
kiln run --no-chamber            # skip chamber routing
kiln run --json                  # machine-readable NDJSON output

With harness

Run the Rust kiln-harness adjacent to the project root. Requires KILN_TARGET_BIN or KOH_BIN to be set.

KILN_TARGET_BIN=~/projects/koh/zig-out/bin/koh kiln run --harness

Exit codes

0 — all tests passed. 1 — one or more failures. 2 — safety violation (write outside kiln/).

kiln watch

Run the suite in a continuous loop until clean or max_loops (default 100) is reached.

kiln watch
kiln watch ~/projects/nina
kiln watch --no-chamber          # skip all chamber routing

What the loop does

1. Run suite: calls the detected test command.

2. Fingerprint: BLAKE3 over test name + first non-blank stderr line + source location + binary. The same failure produces the same fingerprint on every run.

3. Classify: sorts each new failure into a failure class.

4. Route: EnvDependent failures go to a chamber. If the chamber run resolves a failure, chamber_resolved is incremented; otherwise the failure is written to kiln/pending/, as are failures of every other class.

5. Loop: if a pass produces no new failures, write the smelt report and exit; otherwise run again.
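
The fingerprint step can be illustrated with a short Python sketch. Several things here are assumptions rather than Kiln's behaviour: the standard library has no BLAKE3, so `hashlib.blake2b` stands in for it; the binary is represented by its path string; the fields are length-prefixed before hashing; and `fingerprint` is a hypothetical name.

```python
import hashlib

def fingerprint(test_name, stderr, source_location, binary):
    """Hash the four inputs named above into a stable hex digest."""
    # First non-blank stderr line, so trailing noise does not change the result.
    first_line = next(
        (ln.strip() for ln in stderr.splitlines() if ln.strip()), ""
    )
    h = hashlib.blake2b(digest_size=32)  # stand-in: BLAKE3 is not in the stdlib
    for field in (test_name, first_line, source_location, binary):
        data = field.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()
```

If `<fp8>` in the pending file name is the first eight hex characters of the digest (an assumption; the document does not define it), it would be `fingerprint(...)[:8]`.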

Failure classes

Class	Triggers on
`EnvDependent`	`inotify`, `systemd`, `/proc/`, `ENOENT /dev/`, `socket permission denied`
`Timeout`	`SIGALRM`, `test timed out`
`CompileError`	`error[E`, `syntax error`, `undefined reference`
`PanicAssertion`	`thread panic`, `unreachable`, `index out of bounds`, `TestUnexpectedResult`
`ResourceExhaust`	`OOM`, `too many open files`, `ENOMEM`, `EMFILE`
`Unknown`	Anything else
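
As a sketch, classification by trigger string could look like the Python below. The trigger lists come from the table; the first-match-wins order, the case-sensitive substring matching, and the name `classify` are assumptions, since the document does not say how overlapping triggers are resolved.

```python
# (class, triggers) in the table's row order.
# Assumption: the first class with a matching trigger wins.
CLASS_TRIGGERS = [
    ("EnvDependent", ["inotify", "systemd", "/proc/", "ENOENT /dev/",
                      "socket permission denied"]),
    ("Timeout", ["SIGALRM", "test timed out"]),
    ("CompileError", ["error[E", "syntax error", "undefined reference"]),
    ("PanicAssertion", ["thread panic", "unreachable", "index out of bounds",
                        "TestUnexpectedResult"]),
    ("ResourceExhaust", ["OOM", "too many open files", "ENOMEM", "EMFILE"]),
]

def classify(output):
    """Return the failure class for a block of test output."""
    for name, triggers in CLASS_TRIGGERS:
        if any(trigger in output for trigger in triggers):
            return name
    return "Unknown"
```

With this ordering, output that mentions both `inotify` and `ENOMEM` is classed as `EnvDependent`, which is the class that gets routed to a chamber.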

Pending files

Each unresolved failure is written to kiln/pending/<slug>-<fp8>.md:

## Failure
test:        my_project::tests::parse_empty_input
class:       PanicAssertion
loops:       3
fingerprint: a4f9c2d44b81e3cc...

## What happened
thread 'main' panicked at 'index out of bounds: the len is 0'
src/parser.rs:142

## Source location
src/parser.rs:142

kiln break

Full adversarial pressure campaign. Maps the attack surface, generates a deterministic pressure program, runs it inside a QEMU chamber, and writes a findings report to kiln/reports/break-<timestamp>.md.

Basic usage

kiln break                          # full campaign, chamber mode
kiln break --no-chamber             # run on host (faster, less isolated)
kiln break --seed 9a3f4c            # deterministic — same seed, same program
kiln break --iterations 50000       # override max_iterations (default 100 000)
kiln break --concurrency 32         # override concurrency_max (default 64)
kiln break --time 2h                # run for a fixed duration instead

Focus a single pressure layer

kiln break --focus boundary         # boundary value inputs only
kiln break --focus concurrent       # concurrency stress only
kiln break --focus resource         # resource limits only
kiln break --focus longitudinal     # memory leak detection only
kiln break --focus fault            # fault injection only
kiln break --focus chaos            # combined chaos scenarios
kiln break --focus adversarial      # NixOS state corruption (requires chamber)

Replay a specific finding

kiln break --replay boundary-9a3f4c00 --seed 9a3f4c

Exploration

After the standard run, follow up on Critical and High findings with targeted variations.

kiln break --explore                # standard run + exploration phase
kiln break --explore-only           # exploration only, reads last run's findings
kiln break --explore-only --seed 9a3f4c  # explore a specific past run

Pressure layers

Boundary

Exercises all boundary values for each input point. Integers: 0, -1, MAX_INT, MIN_INT. Strings: empty, path traversal, null bytes, shell injection. Paths: /, /dev/null, nonexistent. A crash is a Medium finding.

Concurrency

Spawns concurrency_max (default 64) instances simultaneously. Any non-zero exit is a High finding.

Resource

Runs under 6 constraints: disk full (4 KB free), memory limit (32 MB), FD limit (32), network timeout (100 ms), clock +24 h, clock −24 h.

Longitudinal

Runs longitudinal_runs times (default 10 000). RSS doubling is a High finding.

Fault

Four scenarios: kill_after_create, corrupt_zero, corrupt_truncate, disk_full_during_write. Inside a chamber these use real kernel disk operations.

Chaos

Five combinations: boundary+concurrency, disk_full+write, kill+midtxn, fd_limit+concurrency, boundary+longitudinal.

Finding structure

Each finding in kiln/reports/break-<timestamp>.md contains:

id:              boundary-9a3f4c00
layer:           boundary
trigger:         path_traversal_input
observed:        process exited 0 with no error; wrote file outside sandbox
expected:        non-zero exit or rejection of traversal path
reproducibility: Yes
severity:        Critical
consequence:     attacker can write arbitrary files by passing ../../../etc/passwd as input
replay:          kiln break --replay boundary-9a3f4c00 --seed 9a3f4c

Note

The report writer hard-rejects Critical and High findings with no consequence field. Severity is not advisory — it gates the finding.
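
A minimal Python sketch of that gate, assuming a finding is a dictionary with the field names shown above. `validate_finding` is a hypothetical name, and treating a whitespace-only consequence as missing is an assumption.

```python
GATED_SEVERITIES = {"Critical", "High"}

def validate_finding(finding):
    """Return the finding, or raise ValueError if a gated one lacks a consequence."""
    severity = finding.get("severity")
    # Assumption: an empty or whitespace-only consequence counts as missing.
    consequence = (finding.get("consequence") or "").strip()
    if severity in GATED_SEVERITIES and not consequence:
        raise ValueError(
            f"{finding.get('id')}: {severity} finding needs a consequence"
        )
    return finding
```

A Medium finding without a consequence passes through unchanged in this sketch; only Critical and High are gated.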

Adversarial state testing

16 NixOS-specific corruption scenarios. Each runs against a clean QEMU snapshot via savevm/loadvm — destructive scenarios do not contaminate each other. Requires chamber mode.

kiln break --focus adversarial
kiln break --focus adversarial --scenario corrupt_package_hash
kiln break --focus adversarial --count 32    # concurrent instance count

Scenario reference

Scenario	What it does	Acceptable response
`corrupt_package_hash`	Flip bytes in a Nix store package	Clear error, no crash, no data corruption
`corrupt_store_db`	Truncate `/nix/var/nix/db/db.sqlite`	Clear error with recovery suggestion
`missing_store_path`	Remove a store path a package depends on	Clear dependency error, no crash
`broken_generation_link`	Symlink a generation to a nonexistent target	Clear error, no silent corruption
`orphan_generation`	Generations not in the current profile chain	`kiln clean` removes them, app continues
`inconsistent_gen_numbers`	Generation numbers out of sequence	Clear error or silent self-repair
`concurrent_writes`	Two instances writing config simultaneously	One wins cleanly, other fails with clear error
`concurrent_rebuild`	Two rebuilds running simultaneously	Lock prevents second, clear error
`concurrent_install_remove`	Install and remove the same package at the same time	One wins, other fails cleanly
`kill_during_rebuild`	SIGKILL sent mid-rebuild	Clean state or clear "interrupted" error on next run
`kill_during_install`	SIGKILL during `nix-env --install`	Partial install cleaned up on next run
`disk_full_mid_write`	Fill disk after lock acquired, before commit	Transaction rolled back, no partial state
`network_drop_mid_fetch`	Cut network during `nix store fetch`	Retries or clear error, no corrupt store path
`invalid_nix_syntax`	Introduce syntax error into `configuration.nix`	Nix eval error surfaced clearly
`undefined_option`	Reference a nonexistent NixOS option	Nix type error surfaced clearly
`circular_import`	Create a circular module import	Nix infinite recursion caught, clear error

Priority 1 scenarios

Highest signal-to-noise: corrupt_package_hash, kill_during_rebuild, disk_full_mid_write, concurrent_writes, broken_generation_link, network_drop_mid_fetch.

Temporal simulation

Runs scripted multi-cycle workflows inside a chamber. Finds what accumulates, drifts, or degrades silently over many cycles.

kiln temporal                                   # all built-in workflows
kiln temporal --workflow install-remove-50      # specific built-in
kiln temporal --workflow rebuild-rollback-20
kiln temporal --workflow kiln/workflows/my-workflow.toml  # custom
kiln temporal --cycles 200                      # override cycle count

Built-in workflows

Name	What it does
`install-remove-50`	Install a package, verify, remove, verify — 50 times. Watches for store growth, generation leaks, output drift.
`rebuild-rollback-20`	Apply config, rebuild, verify, roll back — 20 times. Watches for performance cliff, step failures.

What temporal detects

Finding	Description
`store_growth`	Nix store grows unexpectedly across cycles
`generation_leak`	Generation count grows across cycles without being cleaned up
`output_drift`	Same command produces different output on cycle N vs cycle 1
`performance_cliff`	Elapsed time for a step increases significantly
`step_failure`	A step that passed on cycle 1 fails on cycle N

Custom workflow

# kiln/workflows/my-workflow.toml
name = "my-install-cycle"
description = "custom install/remove workflow"
cycles = 100

[[steps]]
cmd = "$BIN install ripgrep"
description = "install ripgrep"
assert_exit = 0

[[steps]]
cmd = "rg --version"
description = "verify ripgrep works"
assert_exit = 0
sample_state = true

[[steps]]
cmd = "$BIN remove ripgrep"
description = "remove ripgrep"
assert_exit = 0

Visual verification

Captures a screenshot from a running chamber and pixel-diffs it against a stored baseline. Requires a chamber image with grim installed, booted with --display.

kiln visual ~/projects/opal              # compare against baseline
kiln visual --update-baseline            # capture and set new baseline
kiln visual --spec kiln/visual/panel-spec.toml  # explicit spec file

Workflow

# First run: establish the baseline
kiln visual ~/projects/opal --update-baseline
# → captures screenshot, writes kiln/visual/<name>-baseline.png

# Subsequent runs: compare against baseline
kiln visual ~/projects/opal
# → PASS if diff < threshold, FAIL (finding) if diff >= threshold

# After a deliberate visual change that is correct:
kiln visual ~/projects/opal --update-baseline

Visual spec file

# kiln/visual/panel-spec.toml
name = "opal-panel-main"
description = "Main Opal panel: dark background, task list visible, clock in top right"
baseline_path = "kiln/visual/opal-panel-baseline.png"
diff_threshold = 0.005   # 0.5% pixel difference allowed

Implementation

PNG comparison is implemented in pure Zig. No external tools required beyond grim for capture. The diff fraction is the proportion of pixels that differ above a per-channel tolerance.
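
The diff fraction and the pass/fail rule can be sketched in Python on already-decoded pixels. PNG decoding is left out, the tolerance value of 8 is a made-up default, and `diff_fraction` and `verdict` are hypothetical names; only the 0.005 threshold and the `<` / `>=` rule come from the document.

```python
def diff_fraction(baseline, candidate, tolerance=8):
    """Fraction of pixels where any channel differs by more than `tolerance`.

    Both images are equal-length lists of (r, g, b) tuples.
    """
    if len(baseline) != len(candidate):
        raise ValueError("images must have the same dimensions")
    if not baseline:
        return 0.0
    differing = sum(
        1
        for a, b in zip(baseline, candidate)
        if any(abs(x - y) > tolerance for x, y in zip(a, b))
    )
    return differing / len(baseline)

def verdict(fraction, threshold=0.005):
    """PASS if the diff is below the threshold, FAIL otherwise."""
    return "PASS" if fraction < threshold else "FAIL"
```

With a threshold of 0.005, a 1000-pixel image passes with 4 differing pixels and fails with 5, because the comparison is strict.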

Chamber system

A chamber is a disposable QEMU VM with a copy-on-write overlay. The base image is never modified. When the chamber is dropped, the overlay is deleted — the next run starts from a byte-identical base.

List and run

kiln chamber                         # list active chambers
kiln chamber run                     # boot VM, run tests, return exit code
kiln chamber run ~/projects/nina
kiln chamber run --os nixos-24.11    # named image
kiln chamber run --os ubuntu-24.04
kiln chamber run --os /path/to/my.qcow2   # custom image path
kiln chamber run --keep              # keep VM alive after completion
kiln chamber run --display           # enable VNC display (for screenshots)

Image management

kiln chamber images                  # list all 15 supported images and cache status
kiln chamber fetch nixos-24.11       # download and cache
kiln chamber fetch ubuntu-24.04
kiln chamber fetch --all             # download all 15 images

Supported images

Name	OS	Size
`nixos-24.11`	NixOS 24.11	800 MB
`nixos-unstable`	NixOS unstable	800 MB
`ubuntu-22.04`	Ubuntu 22.04	600 MB
`ubuntu-24.04`	Ubuntu 24.04	620 MB
`ubuntu-26.04`	Ubuntu 26.04	640 MB
`debian-12`	Debian 12	300 MB
`debian-13`	Debian 13	320 MB
`fedora-41`	Fedora 41	400 MB
`almalinux-9`	AlmaLinux 9	1.1 GB
`rocky-9`	Rocky Linux 9	1.1 GB
`alpine-3.19`	Alpine 3.19	50 MB
`alpine-3.20`	Alpine 3.20	55 MB
`arch`	Arch (current)	800 MB
`opensuse-15.6`	openSUSE 15.6	800 MB
`kali`	Kali (current)	1.4 GB

Interactive access

kiln chamber run --keep              # keep VM alive
kiln chamber                         # shows id and ssh port
kiln chamber open <id>               # drop into an interactive shell

Screenshots

kiln chamber run --display --keep
kiln chamber screenshot <id>
kiln chamber screenshot <id> --out kiln/screenshots/baseline.png

Teardown

kiln chamber drop <id>
kiln chamber drop --all

QEMU accelerator

Selected automatically: hvf on macOS, kvm on Linux, tcg as fallback. The SSH wait timeout is 600 s on the TCG backend.

Fleet

Provision and manage multiple chambers simultaneously. Used for parallel test operations. Default base port: 52200. SSH at base_port + slot*2, serial at base_port + slot*2 + 1.
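
The port layout follows directly from the two formulas above. A small Python helper makes the arithmetic concrete; `slot_ports` is a hypothetical name, and zero-based slot numbering is inferred from the `--slot 0` example further down.

```python
def slot_ports(slot, base_port=52200):
    """Return (ssh_port, serial_port) for a fleet slot."""
    if slot < 0:
        raise ValueError("slot must be non-negative")
    ssh_port = base_port + slot * 2
    serial_port = ssh_port + 1
    return ssh_port, serial_port
```

With the default base port, slot 0 maps to 52200 and 52201, and slot 3 maps to 52206 and 52207. Each slot takes two consecutive ports, so adjacent slots never collide.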

Full fleet workflow

# Provision 12 chambers, all from the same NixOS image
kiln fleet provision --count 12 --id nina-ops --os nixos-24.11

# If a slot failed to boot, retry just that slot
kiln fleet provision --slot 3 --id nina-ops

# Inject config files into all slots
kiln fleet inject nina-ops --file kiln/fleet/nina.conf --dest /root/.nina.conf

# Inject into a specific slot only
kiln fleet inject nina-ops --slot 6 --file kiln/fleet/test_key --dest /home/testuser/.ssh/test_key

# Snapshot clean state — restore to this point between test runs
kiln fleet snapshot nina-ops --label clean-baseline
kiln fleet restore nina-ops --label clean-baseline

# Deploy a binary to all slots
kiln fleet deploy nina-ops --bin ~/projects/nina/zig-out/bin/nina

# Run a script on all slots
kiln fleet run nina-ops --cmd '/root/run-tests.sh'

# Run a script on one slot only
kiln fleet run nina-ops --slot 0 --cmd '/root/SA-01.sh'

# Collect results
kiln fleet collect nina-ops --remote /tmp/results.jsonl --local kiln/fleet/results/

# Status and teardown
kiln fleet status nina-ops
kiln fleet drop nina-ops

Housekeeping

Command	Description
`kiln ash`	List `kiln/pending/` failures needing attention (alias: `kiln pending`)
`kiln pending`	Alias for `ash`
`kiln report`	Print the most recent report from `kiln/reports/`
`kiln init [path]`	Write `kiln/kiln.toml` if it does not exist
`kiln env`	Check toolchains and image cache status
`kiln config [path]`	Print resolved configuration
`kiln clean [path]`	Prune stale chamber records and session manifests

Reference

Configuration

kiln init writes a starter config to kiln/kiln.toml. All fields have defaults — the file is optional.

[runner]
threads = 8
retry_count = 2
timeout_per_test_secs = 30
max_loops = 100

[chamber]
enabled = true
cache_dir = ""           # empty → $TMPDIR/kiln-chambers
image = ""               # empty → nixos-24.11 (auto-downloaded on first use)
keep_on_fail = false
allow_network = false
memory_mb = 2048
cpus = 2

[break]
enabled = true
max_iterations = 100000
concurrency_max = 64
longitudinal_runs = 10000
chaos_combinations = true
seed = 0                 # 0 → random seed per run

[fleet]
base_port = 52200
max_slots = 16
poll_interval_s = 30
timeout_s = 14400

[report]
write_markdown = true
output_dir = "kiln/reports"

Image resolution order

1. --os <name> flag

2. KILN_CHAMBER_IMAGE environment variable

3. image key in kiln/kiln.toml [chamber] section

4. Default: nixos-24.11 (auto-downloaded on first use)
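
The resolution order can be written as a first-non-empty lookup. This Python sketch assumes the configuration has already been parsed into a dictionary; `resolve_image` is a hypothetical name, and treating an empty environment variable as unset is an assumption (the document only says so for the `image` key).

```python
DEFAULT_IMAGE = "nixos-24.11"

def resolve_image(cli_os=None, env=None, config=None):
    """Pick the chamber image: flag, then env var, then config, then default."""
    env = env or {}
    chamber = (config or {}).get("chamber", {})
    candidates = [
        cli_os,                         # --os <name>
        env.get("KILN_CHAMBER_IMAGE"),  # environment variable
        chamber.get("image"),           # [chamber] image in kiln.toml
    ]
    for value in candidates:
        if value:  # empty string counts as unset, like image = "" in kiln.toml
            return value
    return DEFAULT_IMAGE
```

For example, `resolve_image(env={"KILN_CHAMBER_IMAGE": "debian-12"}, config={"chamber": {"image": "alpine-3.20"}})` returns `debian-12`, because the environment variable outranks the config file.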

Binary discovery order

Used by kiln break and kiln run --harness:

1. Project detection: scans standard output paths (zig-out/bin/, target/release/, etc.)

2. KILN_TARGET_BIN environment variable

3. KOH_BIN environment variable (legacy alias)

4. Auto-build + re-discovery (runs zig build, cargo build --release, etc.)

CLI flags reference

Flag	Effect
`--json`	Newline-delimited JSON output
`--no-chamber`	Skip chamber routing in `run` and `watch`; run on host in `break`
`--keep`	Keep chamber alive after `chamber run`
`--display`	Enable VNC display on chamber (for screenshots)
`--focus <layer>`	Restrict `break` to one pressure layer
`--scenario <name>`	Run one adversarial scenario
`--seed <n>`	Override break seed for deterministic replay
`--iterations <n>`	Override `break.max_iterations`
`--concurrency <n>`	Override `break.concurrency_max`
`--time <n>h`	Time limit for break
`--replay <id>`	Replay a specific finding
`--explore`	Add exploration phase after break
`--explore-only`	Exploration only, reads findings from last run
`--workflow <name>`	Temporal workflow name or TOML path
`--cycles <n>`	Override temporal cycle count
`--spec <toml>`	Visual spec file path
`--update-baseline`	Capture and set new visual baseline
`--harness`	Run kiln-harness (with `run`)
`--os <name>`	Chamber image name, path, or URL
`--path <p>`	Explicit project root
`--slot <n>`	Target a specific fleet slot

Output

Human output symbols

✦ OK / success

✗ Error / failure

⚠ Warning

▸ Status / in-progress

Output goes to stdout, except safety violation messages, which go to stderr.

JSON mode

With --json or in CI ($CI set), Kiln emits newline-delimited JSON:

{"event":"run_started","project":"/home/user/myapp","language":"zig"}
{"event":"test_passed","name":"parse_empty","elapsed_ms":12}
{"event":"test_failed","name":"parse_overflow","class":"PanicAssertion"}
{"event":"chamber_done","id":"c4a9f","exit":0}
{"event":"finding","id":"boundary-9a3f4c00","severity":"Critical","layer":"boundary"}
{"event":"break_complete","findings":3,"seed":"9a3f4c"}
{"event":"run_complete","passed":47,"failed":0,"elapsed_ms":1204}
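
A consumer can process this stream one line at a time. The Python sketch below relies only on the event names and fields visible in the sample above; `summarize` is a hypothetical helper, and any event type it does not recognise is ignored.

```python
import json

def summarize(lines):
    """Tally test results and collect findings from an NDJSON event stream."""
    summary = {"passed": 0, "failed": 0, "findings": []}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)
        kind = event.get("event")
        if kind == "test_passed":
            summary["passed"] += 1
        elif kind == "test_failed":
            summary["failed"] += 1
        elif kind == "finding":
            summary["findings"].append((event["id"], event["severity"]))
    return summary
```

Because each line is a complete JSON document, the same loop works on a pipe such as `sys.stdin` without buffering the whole run.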

Environment variables

Variable	Effect
`KILN_TARGET_BIN`	Path to target binary for `break` and `run --harness`. Overrides `KOH_BIN`.
`KOH_BIN`	Legacy alias for `KILN_TARGET_BIN`.
`KILN_CHAMBER_IMAGE`	Path, named image, or URL for the chamber base image. Overrides `image` in `kiln.toml`.
`KILN_CHAMBER_CACHE`	Directory for cached images and chamber overlays. Default: `$TMPDIR/kiln-chambers`.
`KILN_BREAK_SEED`	Deterministic seed for break runs. Overrides `break.seed` in `kiln.toml`.
`KILN_THREADS`	Override `runner.threads`.
`KILN_DEBUG`	Verbose output — shows QEMU argv, SSH commands, serial console traffic.
`CI`	Auto-enables JSON output when set.

Exit codes

0: Clean. All tests passed, or the watch loop found no new failures.

1: Failures. Test failures, blocked items, break findings, or a command-line error.

2: Safety violation. Kiln attempted to write outside kiln/, or the source tree was modified during a run. Never used for anything else: the safety-violation path (SAFETY_VIOLATION_EXIT = 2) is the only caller of std.process.exit(2) in the codebase.

Safety model

Write path enforcement

SafetyGuard.assertWritePermitted(path) is called before every write operation. It resolves the path and verifies it is inside kiln/. Any write outside kiln/ exits with code 2 immediately.
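
The resolve-then-check pattern can be sketched in Python. This is not the Zig implementation: `assert_write_permitted` and `SafetyViolation` are hypothetical names, and the sketch raises an exception where Kiln is described as exiting with code 2.

```python
import os

class SafetyViolation(Exception):
    """Raised when a write target resolves outside the permitted directory."""

def assert_write_permitted(path, kiln_dir):
    """Return the resolved path if it lies inside kiln_dir, else raise."""
    root = os.path.realpath(kiln_dir)
    target = os.path.realpath(path)  # collapses ".." and follows symlinks
    # commonpath compares whole path components, so "kiln-other" is not
    # mistaken for a child of "kiln".
    if os.path.commonpath([root, target]) != root:
        raise SafetyViolation(f"write outside {root}: {target}")
    return target
```

Resolving before comparing is what defeats inputs such as `kiln/../src/main.zig`, which start with the right prefix but point elsewhere.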

Source tree auditing

Before every run, Kiln snapshots every non-ignored file's mtime and size. After the run, it compares. Any modification triggers exit code 2.

Excluded from snapshot: kiln/, .git/, .koh/, zig-out/, zig-cache/, node_modules/, target/, .venv/, __pycache__/.
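
A before-and-after audit along these lines might look like the following Python sketch. The exclusion list is taken from the line above; excluding those directory names at any depth, counting added or removed files as modifications, and the names `snapshot` and `changed_paths` are assumptions.

```python
import os

EXCLUDED = {"kiln", ".git", ".koh", "zig-out", "zig-cache",
            "node_modules", "target", ".venv", "__pycache__"}

def snapshot(root):
    """Map relative path -> (mtime_ns, size) for every non-excluded file."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED]
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            state[os.path.relpath(full, root)] = (st.st_mtime_ns, st.st_size)
    return state

def changed_paths(before, after):
    """Paths that were added, removed, or whose (mtime, size) differ."""
    return sorted(
        p for p in before.keys() | after.keys()
        if before.get(p) != after.get(p)
    )
```

A caller would take one snapshot before the run and one after, and treat a non-empty `changed_paths` result as the safety violation described above.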

Blast radius

All fault injection runs inside a sandbox — a tmpdir on the host for --no-chamber mode, or the VM filesystem for chamber mode. The FaultInjector resolves every target path and rejects anything outside the sandbox with error.BlastRadiusViolation. Process kills are rejected for any PID not registered as owned by Kiln.

Subprocess environment

Test subprocesses run with env_map = null, stripping the inherited environment. No API keys, no credentials, no host-specific variables reach the test runner.
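
The effect can be reproduced in Python for illustration. In Python's `subprocess` module, passing `env={}` gives the child an empty environment, whereas `env=None` inherits the parent's. `run_isolated` is a hypothetical helper, and some platforms may still inject a few variables of their own into the child.

```python
import subprocess

def run_isolated(argv):
    """Run argv with an empty environment and return its stdout."""
    result = subprocess.run(
        argv,
        env={},               # do not pass the parent's variables to the child
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Because the child has no `PATH`, `argv[0]` should be an absolute path to the executable.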

File layout

Kiln writes only inside kiln/. The source tree is never touched.

kiln/
├── kiln.toml              # configuration
├── pending/               # unresolved failures from watch
│   └── <slug>-<fp8>.md
├── reports/               # run reports
│   ├── watch-<ts>.md
│   ├── break-<ts>.md
│   └── temporal-<ts>.md
├── findings/              # persisted break findings (for --explore-only)
│   └── <seed-hex>.jsonl
├── chambers/              # active chamber registry
│   └── <id>.json
├── sessions/              # session manifests for cleanup
│   └── <pid>.json
├── screenshots/           # captured screenshots
│   └── <id>-<ts>.png
├── visual/                # visual verification baselines and specs
│   ├── <name>-baseline.png
│   └── panel-spec.toml
├── workflows/             # custom temporal workflows
│   └── my-workflow.toml
└── fleet/                 # fleet registry and results
    ├── <fleet-id>.json
    └── results/
        └── slot-<n>-iter-<k>.jsonl