donkai.org/runtime · Order of Honest Labels

Your local AI hub — like Ollama, orchestrated by Finn.

Donkai Runtime wraps Ollama, Grok, OpenAI, and Finn behind one OpenAI-compatible API on your machine. Persistent memory, NVIDIA GPU offload, and the full Finn Oracle — without rebuilding Ollama.

Gateway MVP: LIVE · local :7720 · Finn :7700

Local gateway health · Finn operator UI · Architecture doc

Feature status

Quick start

  1. Install Ollama (NVIDIA GPU auto-detected)
  2. Pull a model: ollama pull llama3.2:3b
  3. Start Finn: pm2 start ecosystem.finn.config.cjs --only finn-genesis-runtime
  4. Start gateway: pm2 start ecosystem.finn.config.cjs --only donkai-runtime
  5. Chat: curl http://127.0.0.1:7720/v1/chat/completions -H 'Content-Type: application/json' -d '{"model":"ollama/llama3.2:3b","messages":[{"role":"user","content":"hi"}]}'
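
The chat call in step 5 can also be issued from a script. The following is a minimal sketch in Python using only the standard library. It assumes the gateway from step 4 is listening on 127.0.0.1:7720 and that replies follow the usual OpenAI chat-completions shape (`choices[0].message.content`). The names `build_chat_request` and `chat` are illustrative and not part of Donkai Runtime.

```python
import json
import urllib.request

# Gateway address taken from the quick start above.
GATEWAY_URL = "http://127.0.0.1:7720/v1/chat/completions"


def build_chat_request(model, prompt, url=GATEWAY_URL):
    """Build (but do not send) a chat-completions POST request."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model, prompt, url=GATEWAY_URL, timeout=60):
    """Send the request and return the reply text.

    Assumes an OpenAI-style response body; adjust if the gateway differs.
    """
    request = build_chat_request(model, prompt, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the gateway running, `chat("ollama/llama3.2:3b", "hi")` should return the model's reply as a string. Building the request separately from sending it makes it easy to inspect the payload without a live gateway.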

Environment variables (keep local, never commit): XAI_API_KEY, OPENAI_API_KEY, and optionally DONKAI_GATEWAY_PORT.
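
As a rough illustration of how a local script might read these variables, here is a small Python sketch. The fallback port of 7720 is an assumption based on the port shown on this page, and `load_gateway_settings` is a hypothetical helper, not something the runtime ships.

```python
import os

# Assumption: the gateway falls back to the port shown on this page.
DEFAULT_GATEWAY_PORT = 7720


def load_gateway_settings(env=None):
    """Read the documented variables from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    raw_port = env.get("DONKAI_GATEWAY_PORT", "").strip()
    port = int(raw_port) if raw_port else DEFAULT_GATEWAY_PORT
    if not 1 <= port <= 65535:
        raise ValueError(f"DONKAI_GATEWAY_PORT out of range: {port}")
    return {
        "port": port,
        # Unset or empty keys become None so callers can skip that backend.
        "xai_api_key": env.get("XAI_API_KEY") or None,
        "openai_api_key": env.get("OPENAI_API_KEY") or None,
    }
```

Passing the environment in as a mapping keeps the keys out of source files and lets the helper be exercised with a plain dictionary.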

Model catalog

The catalog is loaded from builder-models.json. Ollama tags are discovered live while the gateway is running.
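
How the gateway performs tag discovery is not described here. As one possible sketch: Ollama's own HTTP API lists locally pulled models at `GET /api/tags` (default address 127.0.0.1:11434), returning a JSON object with a `models` array whose entries carry a `name`. The Python below turns such a response into ids with the `ollama/` prefix used in the quick start. The helper names and the prefixing step are assumptions for illustration.

```python
import json
import urllib.request

# Ollama's default local address; change it if your install differs.
OLLAMA_TAGS_URL = "http://127.0.0.1:11434/api/tags"


def model_ids_from_tags(payload, prefix="ollama/"):
    """Map an /api/tags response body to prefixed model ids."""
    return [prefix + m["name"] for m in payload.get("models", []) if "name" in m]


def discover_ollama_models(url=OLLAMA_TAGS_URL, timeout=5):
    """Query a running Ollama instance and return prefixed model ids."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return model_ids_from_tags(json.load(resp))
```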


API routing

POST http://127.0.0.1:7720/v1/chat/completions
{ "model": "finn/operator", "messages": [{"role":"user","content":"recall last session"}] }
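
The model ids on this page (`ollama/llama3.2:3b`, `finn/operator`) suggest that the part before the first slash selects the backend and the rest names the model. Assuming that convention, a client or proxy could split an id as in the Python sketch below. `split_model_id` is a hypothetical helper, and the gateway's actual routing rules may differ.

```python
def split_model_id(model_id):
    """Split 'backend/model' at the first slash.

    Only the first slash is significant, so model names that themselves
    contain slashes or colons are passed through unchanged.
    """
    backend, sep, name = model_id.partition("/")
    if not sep or not backend or not name:
        raise ValueError(f"expected 'backend/model', got {model_id!r}")
    return backend, name
```

For example, `split_model_id("finn/operator")` yields the backend `finn` and the model name `operator`, while an id with no slash raises `ValueError` rather than guessing a default backend.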

Finn + NVIDIA