PROJECT · TOK0 / FIELD COMPRESSION UNIT
SERIAL NO. 0.1.1 · BUILT IN RUST · MIT
OPEN SOURCE · SHELL OUTPUT COMPRESSION PROXY

Compression pipeline

tok0 runs every command's output through eight pure stages. Same input, same output, every time. Here's what each stage does and where to tune it.

Every byte that flows through tok0 passes through the same eight-stage pipeline. Each stage is a pure function of its input — no side effects, no shared state, no hidden behavior. That’s what lets us treat the whole pipeline as a deterministic spec, snapshot-test it in CI, and assert minimum savings on every release.

The eight stages

        raw stdout
            │
   ┌────────▼────────┐
   │ 1. strip_ansi   │   remove colors / cursor moves
   ├─────────────────┤
   │ 2. pattern      │   user-defined search/replace
   │    replace      │
   ├─────────────────┤
   │ 3. output match │   keep only matching blocks
   ├─────────────────┤
   │ 4. line select  │   include / exclude per regex
   ├─────────────────┤
   │ 5. truncate     │   per-line max chars
   ├─────────────────┤
   │ 6. head/tail    │   keep N first + N last
   │    window       │
   ├─────────────────┤
   │ 7. hard cap     │   absolute byte ceiling
   ├─────────────────┤
   │ 8. empty        │   "ok" if nothing left
   │    fallback     │
   └────────┬────────┘
            ▼
       compressed text
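
One way to read the diagram is as plain function composition: each stage takes text and returns text, and the pipeline folds the input through the stages in order. The sketch below is illustrative only. The Stage alias, run_pipeline, and the two toy stages are hypothetical names chosen for this example, not tok0's internals.

```rust
/// A stage is a pure function from text to text (hypothetical alias).
type Stage = fn(&str) -> String;

/// Feed the output of each stage into the next, in order.
fn run_pipeline(stages: &[Stage], raw: &str) -> String {
    stages
        .iter()
        .fold(raw.to_string(), |text, stage| stage(&text))
}

// Two toy stages standing in for the real eight.
fn drop_blank_lines(text: &str) -> String {
    text.lines()
        .filter(|line| !line.trim().is_empty())
        .collect::<Vec<_>>()
        .join("\n")
}

fn empty_fallback(text: &str) -> String {
    if text.trim().is_empty() {
        "ok".to_string()
    } else {
        text.to_string()
    }
}
```

Because no stage touches anything but its argument, running the same stage list over the same input twice necessarily yields identical output, which is the property snapshot tests rely on.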

1 · strip_ansi

Removes every ANSI escape sequence — colors, cursor moves, hyperlink wrappers. ~5–15% savings on its own for any TTY-aware tool (cargo, npm, pip, docker).
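
To make the stage concrete, here is a minimal std-only sketch of an escape-sequence stripper. It is an assumption about how such a stage could work, not tok0's actual code: it handles CSI sequences (colors, cursor moves), OSC sequences (the hyperlink wrappers), and drops any other two-character escape.

```rust
/// Remove CSI and OSC escape sequences from `input`.
/// Std-only illustration; the real stage may cover more cases.
fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '\x1b' {
            out.push(c);
            continue;
        }
        match chars.next() {
            // CSI: ESC [ parameters... then a final byte in '@'..='~'.
            Some('[') => {
                for p in chars.by_ref() {
                    if ('@'..='~').contains(&p) {
                        break;
                    }
                }
            }
            // OSC: ESC ] payload... terminated by BEL or by ESC \.
            Some(']') => {
                while let Some(p) = chars.next() {
                    if p == '\x07' {
                        break;
                    }
                    if p == '\x1b' && chars.peek() == Some(&'\\') {
                        chars.next();
                        break;
                    }
                }
            }
            // Any other escape: drop ESC and the character after it.
            _ => {}
        }
    }
    out
}
```

For example, strip_ansi("\x1b[1;32mCompiling\x1b[0m foo") returns "Compiling foo".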

2 · Pattern replace

User- or rule-defined [(regex, replacement)] pairs applied in order. This is where you collapse repeated banners, normalize timestamps, or strip checksums.

[[replacements]]
pattern = "^Compiling [a-z0-9_-]+ v[0-9.]+ \\(.+\\)$"
replacement = ""

3 · Output match

If output_match patterns are defined, only blocks matching at least one are kept. Useful for tools where 95% of output is throat-clearing and only the trailing summary matters (e.g. pytest -v).

4 · Line select

Per-line include / exclude regexes. Excludes win over includes on conflict. Lines are matched against trimmed content.
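
The precedence rule can be sketched as follows. This is a simplified illustration: plain substrings stand in for regexes to keep the example std-only, select_lines is a hypothetical name, and treating an empty include list as "keep everything" is an assumption rather than documented behavior.

```rust
/// Keep a line if it matches any include (or no includes are set)
/// and matches no exclude. Matching uses the trimmed line, but the
/// original line is what gets emitted.
fn select_lines(text: &str, include: &[&str], exclude: &[&str]) -> String {
    text.lines()
        .filter(|line| {
            let trimmed = line.trim();
            let included =
                include.is_empty() || include.iter().any(|p| trimmed.contains(*p));
            let excluded = exclude.iter().any(|p| trimmed.contains(*p));
            // An exclude match always wins over an include match.
            included && !excluded
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

A line that matches both an include and an exclude pattern is dropped, which is what "excludes win" means in practice.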

5 · Truncate

Caps each surviving line at max_line_chars (default 240) and replaces the remainder with a truncation marker. Anything wider is almost certainly wrapped output meant for human eyes, not the model.
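
A per-line cap could look like the sketch below. The function name and the marker parameter are assumptions for illustration. The sketch counts characters rather than bytes so that a multi-byte UTF-8 character is never split, and it appends the marker on top of the cap; whether tok0 counts the marker against max_line_chars is not stated here.

```rust
/// Cap each line at `max_line_chars` characters, appending `marker`
/// to every line that was cut.
fn truncate_lines(text: &str, max_line_chars: usize, marker: &str) -> String {
    text.lines()
        .map(|line| match line.char_indices().nth(max_line_chars) {
            // A character exists at index `max_line_chars`, so the line is too long.
            Some((byte_idx, _)) => format!("{}{}", &line[..byte_idx], marker),
            // The line already fits.
            None => line.to_string(),
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```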

6 · Head/tail window

Keeps the first head and last tail lines, replacing the middle with a one-line marker like … 1,847 lines elided …. The defaults, head = 30 and tail = 15, keep the orientation at the top and the verdict at the bottom while dropping the middle.
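
The windowing logic is small enough to sketch in full. This is an illustration under assumptions, not the shipped implementation: head_tail is a hypothetical name, and the marker here omits the thousands separator shown in the example above.

```rust
/// Keep the first `head` and last `tail` lines; replace everything
/// in between with a single marker line stating how many were dropped.
fn head_tail(text: &str, head: usize, tail: usize) -> String {
    let lines: Vec<&str> = text.lines().collect();
    // Nothing to elide if the whole output fits inside the window.
    if lines.len() <= head + tail {
        return lines.join("\n");
    }
    let elided = lines.len() - head - tail;
    let marker = format!("… {} lines elided …", elided);
    let mut kept: Vec<&str> = Vec::with_capacity(head + tail + 1);
    kept.extend_from_slice(&lines[..head]);
    kept.push(marker.as_str());
    kept.extend_from_slice(&lines[lines.len() - tail..]);
    kept.join("\n")
}
```

With ten numbered lines, head = 2 and tail = 1, the result is lines 1 and 2, a marker reporting 7 elided lines, then line 10.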

7 · Hard cap

Absolute byte ceiling. If the pipeline output is still over max_chars, it’s truncated and a marker line is appended. This is the safety valve — no compressor can ever produce output larger than max_chars, by construction.
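
One way to keep the ceiling truly absolute is to reserve room for the marker before cutting, so the marker itself cannot push the result over the limit. The sketch below does that. It is an assumption about the approach, with a hypothetical function name and marker text; it measures the limit in bytes, following the "byte ceiling" wording, and it assumes the limit is larger than the marker.

```rust
/// Enforce a ceiling of `max_bytes` on `text`, marker line included.
/// Assumes `max_bytes` is larger than `marker.len() + 1`.
fn hard_cap(text: &str, max_bytes: usize, marker: &str) -> String {
    if text.len() <= max_bytes {
        return text.to_string();
    }
    // Reserve room for the newline and the marker.
    let budget = max_bytes.saturating_sub(marker.len() + 1);
    let mut cut = budget.min(text.len());
    // Back up to a char boundary so a multi-byte character is never split.
    while !text.is_char_boundary(cut) {
        cut -= 1;
    }
    format!("{}\n{}", &text[..cut], marker)
}
```

For example, capping the 26-letter alphabet at 12 bytes with the marker "[cut]" yields "abcdef" followed by the marker line, exactly 12 bytes in total.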

8 · Empty fallback

If the previous seven stages collapsed everything down to whitespace, the rule’s empty_message is emitted instead. Default: ok. This is what makes silent successful commands cost almost nothing.

Worked example

Raw brew install foo:

==> Downloading https://ghcr.io/v2/homebrew/core/foo/manifests/1.2.3
######################################################################## 100.0%
==> Pouring foo--1.2.3.arm64_sonoma.bottle.tar.gz
🍺  /opt/homebrew/Cellar/foo/1.2.3: 14 files, 2.1MB

After the brew-install rule:

ok

Savings: 94%. The brew-install.toml rule strips the ==> lines, the progress bar, and the cellar receipt. Nothing is left, and an empty result is itself the signal of success, so the empty fallback fires and emits ok.

Tip

You can dry-run the pipeline against any captured output with tok0 profile run <cmd> — it prints the raw and compressed sizes plus which stages eliminated what.

Determinism guarantees

  • No clock reads, no random numbers, no environment lookups inside the pipeline.
  • Regexes are compiled once via lazy_static!. Same pattern set ⇒ same behavior across runs.
  • Snapshot tests pin the byte-exact output of every rule against a real captured fixture.

If you find an input that produces non-deterministic output, that is a bug — file an issue.

Where to tune it

Most users never write a single line of pipeline config — the built-in TOML rules cover ~250 commands out of the box. When you do need to override:

  • Project-local: drop a TOML rule in .tok0/filters/ (requires explicit trust — see Trust & safety).
  • Global: drop a TOML rule in ~/.config/tok0/filters/.
  • Built-in: PR a src/rules/<cmd>.toml into the repo. CI will assert your savings floor.

See Writing TOML rules for the full schema.

BUILT IN RUST · SINGLE STATIC BINARY · 8 MB v0.1.1 / MIT GITHUB.COM/PRXM-LABS/TOK0