Usage

Inclavate runs three processes: the orchestrator (routes inference), one or more nodes (each serves a range of transformer layers), and the dashboard UI.

On Windows, use dppan-orchestrator.exe and dppan.exe in place of ./dppan-orchestrator and ./dppan.

Single machine

Open three terminals in the bundle directory.

1 — Orchestrator

bash
./dppan-orchestrator --port 8001 --admin-token mysecret

2 — Node (joins with a model)

bash
./dppan join --orchestrator http://localhost:8001 --model llama3.2:1b

3 — Dashboard

bash
./dppan ui --orchestrator http://localhost:8001 --port 3000

Open http://localhost:3000.
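
If three terminals are inconvenient, the same steps can be scripted from one shell. The following is a minimal sketch, not something shipped in the bundle: `orch_url` and `start_stack` are hypothetical helper names, and the two-second pause is only a guess at startup time, since this page does not say whether a node retries when the orchestrator is not yet listening. The flags are the ones shown above.

```bash
#!/bin/sh
# Hypothetical helper: build the orchestrator URL from host and port.
orch_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Hypothetical helper: start orchestrator, one node, and the dashboard.
# Usage: start_stack [port] [model]
start_stack() {
  port="${1:-8001}"
  model="${2:-llama3.2:1b}"
  url=$(orch_url localhost "$port")

  ./dppan-orchestrator --port "$port" --admin-token "${DPPAN_ADMIN_TOKEN:-mysecret}" &
  orch_pid=$!
  sleep 2   # assumption: crude wait for the orchestrator to start listening

  ./dppan join --orchestrator "$url" --model "$model" &
  node_pid=$!

  # Stop the background processes when the dashboard exits or on Ctrl-C.
  trap 'kill "$orch_pid" "$node_pid" 2>/dev/null' EXIT INT TERM

  # The dashboard runs in the foreground.
  ./dppan ui --orchestrator "$url" --port 3000
}
```

Saved as a script in the bundle directory with a final line `start_stack 8001 llama3.2:1b`, this would bring up all three processes and tear down the orchestrator and node when the dashboard stops.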

Multi-machine

Machine A runs the orchestrator; other machines join as nodes. Nodes need only outbound TCP to ports 8001 and 9001 on Machine A — no inbound ports, no public IP.

Machine A — orchestrator + UI

bash
# Run each command in its own terminal.
./dppan-orchestrator --port 8001 --admin-token mysecret
./dppan ui --orchestrator http://localhost:8001 --port 3000

Machines B, C, … (replace 192.168.1.10 with Machine A's LAN IP)

bash
./dppan join --orchestrator http://192.168.1.10:8001 --model llama3.2:1b

Each node auto-detects its GPU, resolves the model via Ollama, and receives a layer range from the orchestrator automatically.
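
Before joining from another machine, it can help to confirm that the two ports mentioned above are reachable. This is a rough sketch under a few assumptions: `url_host` and `check_orchestrator_ports` are hypothetical helper names, `nc` (netcat) is installed on the node machine, the orchestrator was started with `--port 8001` as in the examples, and the URL uses a hostname or IPv4 address.

```bash
#!/bin/sh
# Hypothetical helper: print the host part of an http(s) URL.
# IPv6 literals are not handled in this sketch.
url_host() {
  rest="${1#*://}"            # drop the scheme
  rest="${rest%%/*}"          # drop any path
  printf '%s\n' "${rest%:*}"  # drop the :port suffix, if present
}

# Hypothetical helper: try a TCP connection to ports 8001 and 9001.
# Usage: check_orchestrator_ports http://192.168.1.10:8001
check_orchestrator_ports() {
  host=$(url_host "$1")
  status=0
  for port in 8001 9001; do
    if nc -z -w 3 "$host" "$port" 2>/dev/null; then
      echo "ok: $host:$port reachable"
    else
      echo "FAIL: $host:$port unreachable"
      status=1
    fi
  done
  return "$status"
}
```

Running `check_orchestrator_ports http://192.168.1.10:8001` on Machine B should report both ports as reachable; a failure usually points at a firewall on Machine A or a wrong LAN IP.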

Chat (OpenAI-compatible)

bash
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.2:1b","messages":[{"role":"user","content":"Hello!"}],"max_tokens":200}'

See the API reference for streaming, sessions, and the node/metrics endpoints.
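
Hand-writing the JSON body gets error-prone once prompts contain quotes. The sketch below wraps the request shown above in small shell functions. `json_escape`, `build_chat_payload`, and `chat` are hypothetical names, as are the `ORCHESTRATOR_URL` and `MODEL` variables; the escaping covers only backslashes and double quotes, so prompts with newlines or control characters would need a real JSON tool.

```bash
#!/bin/sh
# Hypothetical helper: escape backslashes, then double quotes, for use
# inside a JSON string. Newlines and control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: print the request body.
# Usage: build_chat_payload MODEL PROMPT [MAX_TOKENS]
build_chat_payload() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}],"max_tokens":%d}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")" "${3:-200}"
}

# Hypothetical helper: send one user message to the orchestrator.
# Usage: chat PROMPT [MAX_TOKENS]
chat() {
  curl -sS -X POST "${ORCHESTRATOR_URL:-http://localhost:8001}/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(build_chat_payload "${MODEL:-llama3.2:1b}" "$1" "${2:-200}")"
}
```

With these defined, `chat 'Say "hello" in French'` sends the same kind of request as the curl command above, with the quotes in the prompt escaped.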

Common flags

dppan-orchestrator

| Flag | Env | Default | Description |
| --- | --- | --- | --- |
| --port | PORT | required | REST API port |
| --admin-token | DPPAN_ADMIN_TOKEN | none | Enables the Admin tab in the dashboard |
| --models-config | DPPAN_MODELS_CONFIG | ./config/models.toml | Model registry path |
| --log-level | LOG_LEVEL | info | trace / debug / info / warn / error |
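
The Env column suggests the orchestrator can be configured through environment variables instead of flags. A minimal sketch, assuming the variables are read when the matching flag is omitted (this page does not state which wins when both are set); `start_orchestrator_from_env` is a hypothetical wrapper name and the fallback values are only examples.

```bash
#!/bin/sh
# Hypothetical wrapper: configure the orchestrator from the environment.
# Variable names come from the Env column of the table above.
start_orchestrator_from_env() {
  export PORT="${PORT:-8001}"                                # --port
  export DPPAN_ADMIN_TOKEN="${DPPAN_ADMIN_TOKEN:-mysecret}"  # --admin-token
  export LOG_LEVEL="${LOG_LEVEL:-debug}"                     # --log-level
  ./dppan-orchestrator
}
```

This form keeps the admin token out of the command line, which can be useful when the command is logged or visible in a process list.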

dppan join

| Flag | Default | Description |
| --- | --- | --- |
| --orchestrator | required | Orchestrator URL, e.g. http://localhost:8001 |
| --model | required | Ollama model name or path to a .gguf file |
| --layers | auto | Layer range to serve, e.g. 0-16 |
| --log-level | info | Verbosity |

dppan ui

| Flag | Default | Description |
| --- | --- | --- |
| --orchestrator | required | Orchestrator URL |
| --port | 3000 | Dashboard port |
| --host | 127.0.0.1 | Bind address; use 0.0.0.0 for LAN access |

GPU troubleshooting

  • "CUDA error: the provided PTX was compiled with an unsupported toolchain" — the NVIDIA driver is too old; update to 576.02 or newer.
  • cudart64_*.dll / libcudart.so not found — install CUDA Toolkit 12.x. On Linux, add /usr/local/cuda/lib64 to LD_LIBRARY_PATH.
  • Falls back to CPU instead of GPU — run nvidia-smi; the GPU must be Turing or newer (GTX 10xx and older are unsupported).
  • macOS GPU unused — use the macos-arm64 build on Apple Silicon (Intel Macs run CPU-only).
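
The first and third checks above can be partly automated on machines with an NVIDIA driver. The sketch below is an illustration rather than a supported tool: `version_ge` and `check_gpu` are hypothetical helper names, and 576.02 is the minimum driver version quoted in the list. It uses the `driver_version` and `name` query fields of `nvidia-smi`; deciding whether the reported GPU is Turing or newer is left to the reader.

```bash
#!/bin/sh
# Hypothetical helper: succeed when dotted version A >= B,
# comparing each component numerically.
# Usage: version_ge A B
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    len = (n > m) ? n : m
    for (i = 1; i <= len; i++) {
      xi = (i <= n) ? x[i] + 0 : 0
      yi = (i <= m) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# Hypothetical helper: report the GPU name and whether the driver is new enough.
check_gpu() {
  min_driver="576.02"   # threshold quoted in the list above
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: is the NVIDIA driver installed?"
    return 1
  fi
  driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
  name=$(nvidia-smi --query-gpu=name --format=csv,noheader | head -n 1)
  echo "GPU: $name, driver: $driver"
  if version_ge "$driver" "$min_driver"; then
    echo "driver OK"
  else
    echo "driver older than $min_driver"
    return 1
  fi
}
```

Running `check_gpu` on a node machine prints the detected GPU and driver, and returns a non-zero status when the driver is older than the quoted minimum.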

Free to run · Proprietary