Getting Started
Prerequisites
For binary use:
- No language toolchain required.
For source builds:
- Rust stable toolchain.
Option A: Run Release Binary
- Download your OS artifacts from GitHub Releases.
- Install using a native package (.msi, .deb, .rpm, .pkg) or extract an archive (.zip, .tar.gz).
- Run a smoke check with wraithrun --help.
- Run a dry-run investigation task.
Windows (MSI):
msiexec /i .\wraithrun-windows-x86_64.msi /qn
wraithrun --help
Windows (ZIP):
.\wraithrun.exe --task "Investigate unauthorized SSH keys"
Linux (DEB/RPM):
sudo dpkg -i ./wraithrun-linux-x86_64.deb
wraithrun --help
sudo dnf install ./wraithrun-linux-x86_64.rpm
wraithrun --help
Linux/macOS (tar.gz):
./wraithrun --task "Investigate unauthorized SSH keys"
macOS (PKG):
sudo installer -pkg ./wraithrun-macos-x86_64.pkg -target /
/usr/local/bin/wraithrun --help
Option B: Run From Source
git clone https://github.com/Shreyas582/WraithRun.git
cd WraithRun
cargo run -p wraithrun -- --task "Investigate unauthorized SSH keys"
Template-based run:
cargo run -p wraithrun -- --task-template listener-risk
Task file run:
cargo run -p wraithrun -- --task-file .\launch-assets\incident-task.txt --format summary
Task stdin run:
Get-Content .\launch-assets\incident-task.txt | cargo run -p wraithrun -- --task-stdin --format summary
Template-based run with target overrides:
cargo run -p wraithrun -- --task-template syslog-summary --template-target C:/Logs/security.log --template-lines 50
Live Inference Mode (Optional)
Live inference requires:
- A compatible ONNX model.
- A matching tokenizer.json.
- ONNX Runtime (bundled or via ORT_DYLIB_PATH).
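Before a live run, it can help to confirm that the three prerequisites above are actually on disk. The following is a minimal sketch, not part of WraithRun: `live_preflight` is a hypothetical helper that only checks file presence. It does not validate model compatibility, and it assumes ORT_DYLIB_PATH, when set, should point at an existing library file.

```python
import os
from pathlib import Path


def live_preflight(model, tokenizer, env=None):
    """Return a list of problems; an empty list means the files look present.

    Hypothetical helper for illustration only. It checks presence on disk
    and nothing else.
    """
    env = os.environ if env is None else env
    problems = []
    if not Path(model).is_file():
        problems.append(f"model not found: {model}")
    if not Path(tokenizer).is_file():
        problems.append(f"tokenizer not found: {tokenizer}")
    # ORT_DYLIB_PATH is only needed when ONNX Runtime is not bundled,
    # so an unset variable is not reported as a problem here.
    ort_path = env.get("ORT_DYLIB_PATH")
    if ort_path and not Path(ort_path).is_file():
        problems.append(f"ORT_DYLIB_PATH points to a missing file: {ort_path}")
    return problems


if __name__ == "__main__":
    # Paths match the examples used elsewhere on this page.
    for problem in live_preflight("C:/models/llm.onnx", "C:/models/tokenizer.json"):
        print(problem)
```

A check like this only catches missing files early; compatibility checks remain the job of the setup and diagnostics commands described on this page.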
Two feature flags are available for source builds:
- inference_bridge/onnx: CPU execution provider (works on any platform with ONNX Runtime).
- inference_bridge/vitis: AMD RyzenAI Vitis execution provider (requires the RyzenAI SDK).
Feature check:
cargo check -p inference_bridge --features onnx
cargo check -p inference_bridge --features vitis
Live run (CPU):
cargo run -p wraithrun --features inference_bridge/onnx -- --live --model C:/models/llm.onnx --tokenizer C:/models/tokenizer.json --task "Investigate unauthorized SSH keys"
Live run (RyzenAI NPU):
cargo run -p wraithrun --features inference_bridge/vitis -- --live --model C:/models/llm.onnx --tokenizer C:/models/tokenizer.json --task "Investigate unauthorized SSH keys"
One-command live setup bootstrap (validates model compatibility before writing config):
cargo run -p wraithrun -- live setup --model C:/models/llm.onnx --config .\wraithrun.toml
Model-pack lifecycle checks:
cargo run -p wraithrun -- models list --introspection-format json
cargo run -p wraithrun -- models validate --introspection-format json
cargo run -p wraithrun -- models benchmark --introspection-format json
Model Capability Tiering
When running in live mode, WraithRun automatically probes the loaded model to classify it into a capability tier:
- Basic: small models (≤2B params or ≥200ms latency). Agent uses template-driven tool execution and a deterministic structured summary (no LLM synthesis).
- Moderate: medium models. Agent uses a ReAct (Reason + Act) loop, iteratively choosing tools based on observations, then synthesizes findings via LLM.
- Strong: large models (≥10B params and ≤50ms latency). Agent uses a full ReAct loop with the complete evidence window for deep iterative reasoning and synthesis.
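The thresholds in the tier list can be read as a small decision function. This is a sketch of the stated rules only, not WraithRun's actual probe: `classify_tier` is a hypothetical name, and the inputs are assumed to be the estimated parameter count (in billions) and the latency measured during the probe (in milliseconds).

```python
def classify_tier(params_billions: float, latency_ms: float) -> str:
    """Map the documented thresholds to a tier name (illustrative only)."""
    # Strong requires both a large model and low latency.
    if params_billions >= 10 and latency_ms <= 50:
        return "strong"
    # Basic applies when either the model is small or latency is high.
    if params_billions <= 2 or latency_ms >= 200:
        return "basic"
    # Everything in between is treated as moderate.
    return "moderate"


if __name__ == "__main__":
    print(classify_tier(1.4, 30))   # small model
    print(classify_tier(7.0, 100))  # medium model
    print(classify_tier(13.0, 40))  # large and fast
```

Under these rules a large but slow model (for example 13B parameters at 250 ms) lands in the Basic tier, because the Basic condition is an "or" while the Strong condition is an "and".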
Since v1.8.0, parameter estimation is quantization-aware: Q4 models use 0.55 bytes/param, Q8 uses 1.1, FP16 uses 2.2, and FP32 uses 4.4. As a result, Q4 models are classified more accurately: a 750 MB Q4 file now correctly estimates ~1.4B parameters (750 MB ÷ 0.55 bytes/param) instead of ~0.3B.
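The estimate is file size divided by the bytes-per-parameter ratio for the quantization level. The sketch below reproduces that arithmetic with the ratios quoted above. `estimate_params` and the lowercase keys are hypothetical names chosen for illustration, and "MB" is assumed to mean 10^6 bytes.

```python
# Bytes-per-parameter ratios as stated in the text above.
BYTES_PER_PARAM = {"q4": 0.55, "q8": 1.1, "fp16": 2.2, "fp32": 4.4}


def estimate_params(file_size_bytes: int, quantization: str) -> float:
    """Estimate the parameter count from model file size (illustrative only)."""
    return file_size_bytes / BYTES_PER_PARAM[quantization.lower()]


if __name__ == "__main__":
    size = 750 * 1_000_000  # 750 MB, assuming 1 MB = 10**6 bytes
    print(f"Q4:   {estimate_params(size, 'q4') / 1e9:.2f}B")    # about 1.36B
    print(f"FP16: {estimate_params(size, 'fp16') / 1e9:.2f}B")  # about 0.34B
```

The same 750 MB file yields roughly four times as many estimated parameters under the Q4 ratio as under the FP16 ratio, which is why the quantization level matters for tier classification.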
Override automatic classification when you know your model's capability:
cargo run -p wraithrun -- --task "Investigate unauthorized SSH keys" --live --model C:/models/llm.onnx --tokenizer C:/models/tokenizer.json --capability-override strong
Output Format
WraithRun prints a JSON report with:
- contract_version: machine-readable contract version marker.
- task: your original request.
- max_severity: highest severity level across all findings (when findings are present).
- model_capability: capability tier, estimated parameters, execution provider, latency, and vocab size (live mode).
- findings: normalized actionable findings (deduplicated, sorted by severity). Each finding includes a discrete confidence_label and a relevance tag.
- supplementary_findings: lower-relevance findings from non-primary tools (compact mode only).
- run_timing: optional latency fields (first_token_latency_ms, total_run_duration_ms).
- live_run_metrics: optional reliability and latency fields for live-mode runs.
- turns: intermediate reasoning and tool observations (included when --output-mode full is used).
- final_answer: final response text.
By default, output uses compact mode, which omits the turns array to reduce payload size. Use --output-mode full to include all intermediate reasoning steps.
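Because the report is plain JSON, downstream tooling can consume it with any JSON parser. The sketch below summarizes a report using only the field names listed above. The sample report, all of its values (including the contract_version string and the confidence_label and relevance values), and the `summarize` helper are hypothetical; a real report will contain more fields.

```python
import json
from collections import Counter

# Hypothetical sample; every value here is made up for illustration.
SAMPLE_REPORT = """
{
  "contract_version": "1.0.0",
  "task": "Investigate unauthorized SSH keys",
  "max_severity": "high",
  "findings": [
    {"confidence_label": "confirmed", "relevance": "primary"},
    {"confidence_label": "likely", "relevance": "primary"}
  ],
  "final_answer": "Two findings require review."
}
"""


def summarize(report: dict) -> dict:
    """Reduce a report to a few headline values (illustrative only)."""
    findings = report.get("findings", [])
    return {
        "task": report.get("task"),
        # max_severity is only present when there are findings.
        "max_severity": report.get("max_severity"),
        "finding_count": len(findings),
        "by_confidence": dict(Counter(f.get("confidence_label") for f in findings)),
        # The turns array appears only with --output-mode full.
        "output_mode": "full" if "turns" in report else "compact",
    }


if __name__ == "__main__":
    print(json.dumps(summarize(json.loads(SAMPLE_REPORT)), indent=2))
```

Reading optional fields with `.get` keeps the consumer tolerant of compact-mode output and of runs without findings.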
Configuration and Profiles
WraithRun supports config-driven runs through TOML files and named profiles.
- Auto-loads ./wraithrun.toml when present.
- Explicit file path via --config or WRAITHRUN_CONFIG.
- Profile selection via --profile or WRAITHRUN_PROFILE.
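One way to picture how these three sources combine is a simple lookup chain. The sketch below is an assumption, not WraithRun's documented behavior: it supposes that a command-line flag wins over the environment variable, which in turn wins over the auto-loaded file. The function names are hypothetical. The --explain-effective-config flag is the authoritative way to see where each resolved value came from.

```python
import os
from pathlib import Path


def resolve_config_path(cli_config=None, env=None, cwd="."):
    """Pick a config file: flag, then env var, then ./wraithrun.toml.

    The precedence order is an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    if cli_config:
        return Path(cli_config)
    if env.get("WRAITHRUN_CONFIG"):
        return Path(env["WRAITHRUN_CONFIG"])
    default = Path(cwd) / "wraithrun.toml"
    # The default file is only used when it actually exists.
    return default if default.is_file() else None


def resolve_profile(cli_profile=None, env=None):
    """Pick a profile name: flag first, then env var (assumed order)."""
    env = os.environ if env is None else env
    return cli_profile or env.get("WRAITHRUN_PROFILE") or None


if __name__ == "__main__":
    print(resolve_config_path())
    print(resolve_profile())
```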
Built-in profile names:
- local-lab
- production-triage
- live-model
- live-fast
- live-balanced
- live-deep
Example:
cargo run -p wraithrun -- --task "Check suspicious listener ports" --config .\wraithrun.example.toml --profile production-triage
Quick diagnostics:
cargo run -p wraithrun -- --doctor
Quick diagnostics as JSON:
cargo run -p wraithrun -- --doctor --introspection-format json
List profiles:
cargo run -p wraithrun -- --list-profiles
Preview effective config:
cargo run -p wraithrun -- --print-effective-config --profile local-lab
Explain source of each resolved value:
cargo run -p wraithrun -- --explain-effective-config --profile local-lab
Generate starter config:
cargo run -p wraithrun -- --init-config