Advanced Usage & CLI Reference
Skip the TUI and work directly with the SDK CLI, pipeline stages, and container configuration.
This section covers the internals of REE for power users, integrators, and anyone who wants to understand what happens under the hood or drive REE directly from the command line.
CLI Reference
While the TUI is the recommended interface, REE can also be driven directly via the gensyn-sdk CLI inside the container, or via the ree.sh shell script included in the repository.
This is useful for scripting, CI pipelines, or when you need fine-grained control over the pipeline.
Global Flags
--verbose
Enable debug-level logging. This is a global flag and must appear before the subcommand:
gensyn-sdk --verbose run ...
Location Flags
Every command that operates on a task workspace requires exactly one of the following (they are mutually exclusive):
--tasks-root <path>
Root directory for tasks. The task directory is derived as <tasks-root>/<sanitized-model-name>. This is the recommended default.
--task-dir <path>
Use this exact directory for inputs and outputs. Use when you want to pin artifacts to a specific path.
When using --tasks-root, the SDK auto-creates a subdirectory named after the model using path-safe characters. For example, --tasks-root /tmp/tasks with model Qwen/Qwen3-0.6B creates /tmp/tasks/Qwen--Qwen3-0.6B/.
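As a sketch, the derivation can be mimicked in plain bash. The substitution rule (replacing "/" with "--") is an assumption inferred from the example above; the SDK may sanitize other characters differently:

```shell
# Mimic the SDK's task-directory derivation: replace "/" in the
# model ID with "--" (assumption based on the Qwen example above).
model="Qwen/Qwen3-0.6B"
sanitized="${model//\//--}"        # -> Qwen--Qwen3-0.6B
task_dir="/tmp/tasks/${sanitized}"
echo "$task_dir"
```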
Run (Subcommand)
Runs the full pipeline: [1] prepare, [2] generate, [3] receipt, and [4] decode.
Required Flags:
--model-name / -m
Hugging Face model ID (e.g., Qwen/Qwen3-0.6B).
--prompt-text or --prompt-file
The prompt to run. Mutually exclusive; one is required.
--operation-set is optional and defaults to reproducible mode. Pass it only if you want to switch to deterministic mode.
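Putting the required flags together, a minimal run invocation might look like this (the paths and prompt text are illustrative):

```shell
gensyn-sdk run \
  --tasks-root /tmp/tasks \
  --model-name Qwen/Qwen3-0.6B \
  --prompt-text "What is 2 + 2?"
```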
Optional Flags:
--model-revision (default: main)
Specific Hugging Face model revision.
--max-new-tokens (default: 300)
Maximum number of tokens to generate.
--cpu-only (default: false)
Force CPU execution even if CUDA is available.
--force-model-export (default: false)
Re-export the ONNX model even if one exists.
--disable-kv-cache (default: false)
Disable the KV cache.
--short-circuit-length (no default)
Generation index at which to inject the short-circuit token.
--short-circuit-token (no default)
Token ID to inject when short-circuiting.
Validate (Subcommand)
Checks that a receipt is structurally valid by recomputing hashes and comparing them against the stored values. This does not re-run inference.
Required flags:
--receipt-path
Path to the receipt JSON file to validate.
No location flags (--tasks-root / --task-dir) are needed, because validate only inspects the receipt file itself.
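A validate call is therefore just the following (the receipt path is illustrative; <timestamp> stands in for the actual timestamp in the filename):

```shell
gensyn-sdk validate \
  --receipt-path /tmp/tasks/Qwen--Qwen3-0.6B/metadata/receipt_<timestamp>.json
```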
Verify (Subcommand)
Re-runs the full inference pipeline described in a receipt and compares the output against the receipt's claimed results. This is the strongest form of verification: it proves the result is reproducible on your hardware.
Required Flags:
--receipt-path
Path to the receipt JSON file to verify.
--tasks-root or --task-dir
Where to store re-execution artifacts.
Optional Flags:
--cpu-only
false
Force CPU execution during verification.
verify needs both a receipt path (what to verify) and a location (where to put the re-run workspace). When using the TUI, --tasks-root is passed automatically.
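For example, a CPU-only verification might look like this (paths illustrative; <timestamp> stands in for the actual timestamp in the filename):

```shell
gensyn-sdk verify \
  --receipt-path /tmp/tasks/Qwen--Qwen3-0.6B/metadata/receipt_<timestamp>.json \
  --tasks-root /tmp/verify-tasks \
  --cpu-only
```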
Sampling Flags
These flags control the sampling behavior during generation.
In the TUI, they are passed via Extra Args. On the CLI, they are passed directly.
--do-sample / --no-do-sample (default: enabled)
Enable/disable stochastic sampling.
--temperature (default: 1.0)
Sampling temperature. Higher = more random.
--top-k (default: 50)
Top-k sampling cutoff.
--top-p (default: 1.0)
Nucleus sampling threshold.
--min-p (default: disabled)
Min-p sampling threshold.
--repetition-penalty (default: 1.0)
Repetition penalty multiplier.
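For example, a run with tightened sampling might look like this (the flag values are illustrative, not recommendations):

```shell
gensyn-sdk run \
  --tasks-root /tmp/tasks \
  --model-name Qwen/Qwen3-0.6B \
  --prompt-text "What is 2 + 2?" \
  --temperature 0.7 \
  --top-p 0.9 \
  --repetition-penalty 1.1
```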
Prompt Format (JSONL)
For CLI usage, prompts can be provided via --prompt-file using JSONL format. Each line must be either:
A JSON string:
"What is 2 + 2?"
A JSON object with a prompt field:
{"prompt": "Explain deterministic inference in one sentence."}
Here's an example prompts.jsonl file:
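An illustrative two-line file, mixing both accepted forms:

```jsonl
"What is 2 + 2?"
{"prompt": "Explain deterministic inference in one sentence."}
```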
The Pipeline
Under the hood, REE's run command (whether triggered from the TUI or CLI) executes a four-stage pipeline: [1] prepare, [2] generate, [3] receipt, and [4] decode.
1. Prepare
Downloads the model from Hugging Face, exports it to ONNX format, tokenizes the prompt, and writes a task configuration file. All artifacts are written to the task directory.
Artifacts produced:
model/model.onnx: The exported ONNX model
model/tensors.binary: Serialized model weights
config.json: Task configuration (sampling settings, token limits, etc.)
prompt_tokens.parquet: Tokenized prompt
metadata/prepare.json: Prepare-stage metadata (model name, commit hash, config hash)
If model/model.onnx already exists in the task directory, prepare skips re-export and reuses it. Use --force-model-export to override this.
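After a successful prepare stage, the task directory looks roughly like this. The layout is assembled from the artifact list above; exact names may vary by SDK version:

```
/tmp/tasks/Qwen--Qwen3-0.6B/
├── config.json
├── prompt_tokens.parquet
├── model/
│   ├── model.onnx
│   └── tensors.binary
└── metadata/
    └── prepare.json
```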
2. Generate
Loads the prepared ONNX model, compiles it through the Gensyn Compiler (applying RepOp kernels if --operation-set reproducible), and runs the inference loop.
Artifacts produced:
output_tokens.parquet: Generated token IDs
metadata/generate.json: Generate-stage metadata (finish reasons, device info, operation set, seed)
compiled-artifacts-*: Compiler output directories
3. Receipt
Assembles a cryptographically hashed receipt from the prepare and generate metadata, config, and output tokens.
Artifacts produced:
metadata/receipt_<timestamp>.json: Hashed receipt for full replication and verification.
4. Decode
Reads output_tokens.parquet, decodes the token IDs back into text using the model's tokenizer, and prints the result.
Persisting Data & Caching
The REE container mounts your host's ~/.cache directory into the container automatically. This persists both Hugging Face model downloads (~/.cache/huggingface) and SDK artifacts like ONNX exports, compiled models, and receipts (~/.cache/gensyn).
This means subsequent runs of the same model will skip the download and export steps automatically. No additional volume mounts or configuration are needed.
When using the TUI, this caching is handled for you. The details above apply if you're running the container directly via the CLI.
EULA
Use of REE and its components (Gensyn SDK, Gensyn Compiler, RepOp kernels) is subject to the Gensyn End User License Agreement. Please review the EULA before use.