Advanced Usage & CLI Reference

Skip the TUI and work directly with the SDK CLI, pipeline stages, and container configuration.

This section covers the internals of REE for power users, integrators, and anyone who wants to understand what happens under the hood or drive REE directly from the command line.

CLI Reference

While the TUI is the recommended interface, REE can also be driven directly via the gensyn-sdk CLI inside the container, or via the ree.sh shell script included in the repository.

This is useful for scripting, CI pipelines, or when you need fine-grained control over the pipeline.

Global Flags

--verbose is a global flag and must appear before the subcommand:

gensyn-sdk --verbose run ...
| Flag | Description |
| --- | --- |
| --verbose | Enable debug-level logging. |

Location Flags

Every command requires exactly one of the following (they are mutually exclusive):

| Flag | Description |
| --- | --- |
| --tasks-root <path> | Root directory for tasks. The task directory is derived as <tasks-root>/<sanitized-model-name>. This is the recommended default. |
| --task-dir <path> | Use this exact directory for inputs and outputs. Use when you want to pin artifacts to a specific path. |

Note: When using --tasks-root, the SDK auto-creates a subdirectory named after the model using path-safe characters. For example, --tasks-root /tmp/tasks with model Qwen/Qwen3-0.6B creates /tmp/tasks/Qwen--Qwen3-0.6B/.
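The derivation can be sketched in shell, assuming the sanitization simply replaces / with -- (the SDK may rewrite other unsafe characters as well):

```shell
# Derive the task directory the way --tasks-root does, assuming the
# only transformation is replacing '/' with '--' (an assumption; the
# SDK may sanitize other characters too).
model="Qwen/Qwen3-0.6B"
safe=$(printf '%s' "$model" | sed 's#/#--#g')
echo "/tmp/tasks/$safe/"   # → /tmp/tasks/Qwen--Qwen3-0.6B/
```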

Run (Subcommand)

Runs the full pipeline: [1] prepare, [2] generate, [3] receipt, and [4] decode.

Required Flags:

| Flag | Description |
| --- | --- |
| --model-name / -m | Hugging Face model ID (e.g., Qwen/Qwen3-0.6B). |
| --prompt-text or --prompt-file | The prompt to run. Exactly one of the two is required; they are mutually exclusive. |

Note: --operation-set is not a required flag; it defaults to reproducible mode. Pass it only if you want to switch to deterministic mode.

Optional Flags:

| Flag | Default | Description |
| --- | --- | --- |
| --model-revision | main | Specific Hugging Face model revision. |
| --max-new-tokens | 300 | Maximum number of tokens to generate. |
| --cpu-only | false | Force CPU execution even if CUDA is available. |
| --force-model-export | false | Re-export the ONNX model even if one exists. |
| --disable-kv-cache | false | Disable the KV cache. |
| --short-circuit-length | (none) | Generation index at which to inject the short-circuit token. |
| --short-circuit-token | (none) | Token ID to inject when short-circuiting. |
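Putting the flags together, a typical invocation looks like this (the model ID, prompt, and paths are illustrative):

```shell
# Full pipeline run: prepare, generate, receipt, decode.
# Model ID, prompt, and tasks root are illustrative values.
gensyn-sdk run \
  --model-name Qwen/Qwen3-0.6B \
  --prompt-text "What is 2 + 2?" \
  --tasks-root /tmp/tasks \
  --max-new-tokens 300
```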

Validate (Subcommand)

Checks that a receipt is structurally valid by recomputing hashes and comparing them against the stored values. This does not re-run inference.

Required Flags:

| Flag | Description |
| --- | --- |
| --receipt-path | Path to the receipt JSON file to validate. |

Note: No location flags (--tasks-root / --task-dir) are needed, because validate only inspects the receipt file itself.
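For example (the receipt path is illustrative; the <timestamp> placeholder stands for the actual filename suffix):

```shell
# Structural validation only: recomputes hashes against stored values,
# does not re-run inference. Path is illustrative.
gensyn-sdk validate \
  --receipt-path /tmp/tasks/Qwen--Qwen3-0.6B/metadata/receipt_<timestamp>.json
```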

Verify (Subcommand)

Re-runs the full inference pipeline described in a receipt and compares the output against the receipt's claimed results. This is the strongest form of verification: it proves the result is reproducible on your hardware.

Required Flags:

| Flag | Description |
| --- | --- |
| --receipt-path | Path to the receipt JSON file to verify. |
| --tasks-root or --task-dir | Where to store re-execution artifacts. |

Optional Flags:

| Flag | Default | Description |
| --- | --- | --- |
| --cpu-only | false | Force CPU execution during verification. |

Note: verify needs both a receipt path (what to verify) and a location (where to put the re-run workspace). When using the TUI, --tasks-root is passed automatically.
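A typical invocation on the CLI (paths and the <timestamp> placeholder are illustrative):

```shell
# Re-run the full pipeline described in the receipt and compare results.
# Receipt path and tasks root are illustrative.
gensyn-sdk verify \
  --receipt-path /tmp/tasks/Qwen--Qwen3-0.6B/metadata/receipt_<timestamp>.json \
  --tasks-root /tmp/verify-tasks \
  --cpu-only
```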

Sampling Flags

These flags control the sampling behavior during generation.

In the TUI, they are passed via Extra Args. On the CLI, they are passed directly.

| Flag | Default | Description |
| --- | --- | --- |
| --do-sample / --no-do-sample | Enabled | Enable or disable stochastic sampling. |
| --temperature | 1.0 | Sampling temperature. Higher values are more random. |
| --top-k | 50 | Top-k sampling cutoff. |
| --top-p | 1.0 | Nucleus sampling threshold. |
| --min-p | Disabled | Min-p sampling threshold. |
| --repetition-penalty | 1.0 | Repetition penalty multiplier. |
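On the CLI, sampling flags are simply appended to the run command. An illustrative example (values chosen arbitrarily):

```shell
# Run with tighter sampling: lower temperature, nucleus cutoff, and a
# mild repetition penalty. All values here are illustrative.
gensyn-sdk run \
  --model-name Qwen/Qwen3-0.6B \
  --prompt-text "Explain deterministic inference in one sentence." \
  --tasks-root /tmp/tasks \
  --temperature 0.7 \
  --top-p 0.9 \
  --repetition-penalty 1.1
```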

Prompt Format (JSONL)

For CLI usage, prompts can be provided via --prompt-file using JSONL format. Each line must be either:

  • A JSON string: "What is 2 + 2?"

  • A JSON object with a prompt field: {"prompt": "Explain deterministic inference in one sentence."}

Here's an example prompts.jsonl file:
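```
"What is 2 + 2?"
{"prompt": "Explain deterministic inference in one sentence."}
```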

The Pipeline

Under the hood, REE's run command (whether triggered from the TUI or CLI) executes a four-stage pipeline: [1] prepare, [2] generate, [3] receipt, and [4] decode.

1. Prepare

Downloads the model from Hugging Face, exports it to ONNX format, tokenizes the prompt, and writes a task configuration file. All artifacts are written to the task directory.

Artifacts produced:

  • model/model.onnx: The exported ONNX model

  • model/tensors.binary: Serialized model weights

  • config.json: Task configuration (sampling settings, token limits, etc.)

  • prompt_tokens.parquet: Tokenized prompt

  • metadata/prepare.json: Prepare-stage metadata (model name, commit hash, config hash)

If model/model.onnx already exists in the task directory, prepare skips re-export and reuses it. Use --force-model-export to override this.

2. Generate

Loads the prepared ONNX model, compiles it through the Gensyn Compiler (applying RepOp kernels when --operation-set is reproducible), and runs the inference loop.

Artifacts produced:

  • output_tokens.parquet: Generated token IDs

  • metadata/generate.json: Generate-stage metadata (finish reasons, device info, operation set, seed)

  • compiled-artifacts-*: Compiler output directories

3. Receipt

Assembles a cryptographically hashed receipt from the prepare and generate metadata, config, and output tokens.

Artifacts produced:

  • metadata/receipt_<timestamp>.json: Hashed receipt for full replication and verification.
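To make the hashing concrete, here is a minimal sketch of how a validator can recompute an artifact hash and compare it to a stored value. SHA-256, the file name, and the receipt layout are illustrative assumptions, not the SDK's actual schema:

```shell
# Recompute a hash over an artifact and compare it to the value a
# receipt would store. SHA-256 and the file name are assumptions for
# illustration, not the SDK's actual schema.
printf '1 2 3\n' > output_tokens.example              # stand-in artifact
stored=$(sha256sum output_tokens.example | cut -d' ' -f1)    # value stored at receipt time
# ...later, validation recomputes the same hash over the same bytes:
recomputed=$(sha256sum output_tokens.example | cut -d' ' -f1)
if [ "$stored" = "$recomputed" ]; then echo "receipt hash matches"; fi
```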

4. Decode

Reads output_tokens.parquet, decodes the token IDs back into text using the model's tokenizer, and prints the result.

Persisting Data & Caching

The REE container mounts your host's ~/.cache directory into the container automatically. This persists both Hugging Face model downloads (~/.cache/huggingface) and SDK artifacts like ONNX exports, compiled models, and receipts (~/.cache/gensyn).

This means subsequent runs of the same model will skip the download and export steps automatically. No additional volume mounts or configuration are needed.

Note: When using the TUI, this caching is handled for you. The details above apply if you're running the container directly via the CLI.

EULA

Use of REE and its components (Gensyn SDK, Gensyn Compiler, RepOp kernels) is subject to the Gensyn End User License Agreement. Please review the EULA before use.
