Model Configuration

Model Settings

Holaboss ships with a default model setup. In most cases, you do not need to edit runtime-config.json by hand, because the desktop app already exposes the main configuration surfaces.

The baseline defaults are:

default model: openai/gpt-5.4
built-in fallback provider id when no configured default provider applies: openai

Supported provider styles

The OSS runtime supports the following provider kinds:

Provider kind	Best fit	Notes
`holaboss_proxy`	Holaboss-managed setups	Useful when the desktop should talk through the Holaboss proxy layer.
`openai_compatible`	Self-hosted or compatible APIs	Best when you already have an OpenAI-style endpoint. Also covers Ollama.
`anthropic_native`	Anthropic accounts and native workflows	Good when you want the provider's own integration path.
`openrouter`	Model experimentation and broad access	Useful when you want to swap models without changing the rest of the setup.

The desktop settings UI (Settings -> Model Providers) lets you connect OpenAI, Anthropic, OpenRouter, Gemini, or Ollama without editing files manually.

In-app setup

Holaboss already provides model configuration in the desktop app:

open Settings -> Model Providers
connect a provider such as OpenAI, Anthropic, OpenRouter, Gemini, or Ollama
enter the API key and use the built-in provider defaults or edit the model list for that provider
use the Background tasks panel to choose one connected provider and model for recall selection, finalization, and evolve tasks
use the Recall embeddings panel to leave vector-assisted recall on Automatic or choose an explicit embedding-capable provider and model
changes autosave to runtime-config.json, and the chat model picker uses the configured provider models

When the first provider is connected, the desktop seeds background tasks to that provider and its built-in default background model. For ollama_direct, the provider can be selected, but you must choose a model explicitly before background LLM tasks are enabled.

Customization mode

You can configure the runtime in either of these modes:

legacy or proxy shorthand
- set model_proxy_base_url, auth_token, and default_model
structured provider catalog
- define providers and models, then set runtime.default_provider and runtime.default_model

Runtime URL behavior stays consistent across both modes:

if model_proxy_base_url is a proxy root, the runtime appends provider routes such as /openai/v1 or /anthropic/v1
direct mode is enabled when you provide a provider endpoint (providers.<id>.base_url) instead of a proxy root
OpenAI-compatible direct providers typically use a /v1 endpoint, for example https://api.openai.com/v1
Anthropic native direct providers should use the root host, for example https://api.anthropic.com
the runtime normalizes known provider hosts as needed; for example, a Gemini host root is normalized to /v1beta/openai
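
The URL rules above can be pictured with a small sketch. This is not the runtime's actual code: the function names, the route table, and the Gemini hostname are assumptions made for illustration, and only the /openai/v1, /anthropic/v1, and /v1beta/openai paths come from this page.

```python
from urllib.parse import urlsplit

# Hypothetical route table; this page only names these two routes.
PROXY_ROUTES = {"openai": "/openai/v1", "anthropic": "/anthropic/v1"}

# Assumption: this hostname identifies a Gemini endpoint.
GEMINI_HOST = "generativelanguage.googleapis.com"


def proxy_url(proxy_root: str, provider: str) -> str:
    """Append a provider route to a proxy root (proxy mode)."""
    return proxy_root.rstrip("/") + PROXY_ROUTES[provider]


def normalize_direct_url(base_url: str) -> str:
    """Normalize a Gemini host root; leave other direct endpoints untouched."""
    parts = urlsplit(base_url)
    is_host_root = parts.path in ("", "/")
    if parts.hostname == GEMINI_HOST and is_host_root:
        return base_url.rstrip("/") + "/v1beta/openai"
    return base_url
```

Under these assumptions, an Anthropic root host such as https://api.anthropic.com passes through unchanged, while a bare Gemini host gains the /v1beta/openai suffix.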

Where the runtime reads model config

The runtime resolves model settings from:

runtime-config.json
environment variables
built-in defaults

By default, runtime-config.json lives at:

${HB_SANDBOX_ROOT}/state/runtime-config.json

You can override that path with:

HOLABOSS_RUNTIME_CONFIG_PATH
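
If you want to locate the file from a script, the lookup described above could be sketched as follows. The function name is hypothetical, and the sketch assumes HB_SANDBOX_ROOT is set whenever the override is absent, since this page does not document a further fallback.

```python
import os
from pathlib import Path


def runtime_config_path(env=None) -> Path:
    """Return the runtime-config.json location implied by the environment."""
    env = os.environ if env is None else env
    override = env.get("HOLABOSS_RUNTIME_CONFIG_PATH")
    if override:
        # An explicit override takes priority over the sandbox default.
        return Path(override)
    # Assumption: HB_SANDBOX_ROOT is defined; a KeyError is raised otherwise.
    return Path(env["HB_SANDBOX_ROOT"]) / "state" / "runtime-config.json"
```

Passing a plain dict as env makes the function easy to try without touching your real environment.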

Important settings

model_proxy_base_url
- legacy or proxy base URL root
auth_token
- proxy token sent as X-API-Key
providers.<id>.base_url
- direct provider endpoint
providers.<id>.api_key
- direct provider credential
runtime.background_tasks.provider
- provider for recall selection, finalization, and evolve tasks
runtime.background_tasks.model
- model id used for that background provider
runtime.recall_embeddings.provider
- optional embedding-capable provider used for vector-assisted recall candidate narrowing
runtime.recall_embeddings.model
- optional embedding model id used for recall embeddings
runtime.default_provider
- configured default provider for unprefixed model ids
runtime.default_model
- default model selection
HOLABOSS_DEFAULT_MODEL
- environment override for the default model

Background task provider defaults

When you choose a provider in the desktop Background tasks panel, the app seeds the model field with these defaults:

holaboss_model_proxy: gpt-5.4-mini
openai_direct: gpt-5.4-mini
anthropic_direct: claude-sonnet-4-6
openrouter_direct: openai/gpt-5.4-mini
gemini_direct: gemini-2.5-flash
minimax_direct: MiniMax-M2.7
ollama_direct: no default; choose a model explicitly
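
As a rough illustration of how these seeds behave, the table above can be read as a lookup with an explicit choice taking priority. The names below are hypothetical and the desktop app's own logic may differ; only the provider ids and model ids are taken from this page.

```python
# Seed values copied from the list above; None means "no built-in default".
BACKGROUND_DEFAULTS = {
    "holaboss_model_proxy": "gpt-5.4-mini",
    "openai_direct": "gpt-5.4-mini",
    "anthropic_direct": "claude-sonnet-4-6",
    "openrouter_direct": "openai/gpt-5.4-mini",
    "gemini_direct": "gemini-2.5-flash",
    "minimax_direct": "MiniMax-M2.7",
    "ollama_direct": None,
}


def background_model(provider, chosen=None):
    """Return the background-task model, or None if tasks stay disabled."""
    # An explicitly chosen model always wins over the seeded default.
    return chosen or BACKGROUND_DEFAULTS.get(provider)
```

With ollama_direct and no explicit model the sketch returns None, which mirrors the rule that background LLM tasks stay disabled until you pick a model.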

Recall embedding defaults

When Recall embeddings is left on Automatic, the runtime chooses the first configured provider that has a built-in embedding default. If you configure a provider explicitly, the current built-in embedding defaults are:

openai_direct: text-embedding-3-small
openrouter_direct: openai/text-embedding-3-small

For any other provider, you currently need to select a compatible embedding model explicitly before vector-assisted recall is enabled.
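
The Automatic behavior can be sketched as a first-match scan. This is an illustration under assumptions: the function name is hypothetical, and "first configured provider" is taken to mean the order in which providers appear in the configuration.

```python
# Built-in embedding defaults listed on this page.
EMBEDDING_DEFAULTS = {
    "openai_direct": "text-embedding-3-small",
    "openrouter_direct": "openai/text-embedding-3-small",
}


def auto_embedding(configured_providers):
    """Pick the first configured provider that has a built-in embedding default."""
    for provider in configured_providers:
        model = EMBEDDING_DEFAULTS.get(provider)
        if model:
            return provider, model
    # No configured provider has a default: an explicit choice is required.
    return None
```

For example, a setup with only anthropic_direct and ollama_direct would yield None here, matching the note that such providers require an explicit embedding model.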

Model string format

Use provider-prefixed model ids when you want to be explicit:

openai/gpt-5.4
anthropic/claude-sonnet-4-20250514
openrouter/deepseek/deepseek-chat-v3-0324

The runtime also treats unprefixed claude... model ids as Anthropic models. If a model id is unprefixed and does not start with claude, the runtime first tries the configured default provider. If no configured default provider applies, it falls back to openai/<model>.
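
The resolution order described above can be sketched as follows. The function name and the known_providers parameter are hypothetical, and the sketch assumes a prefix only counts when its first path segment is a recognized provider id; the runtime's actual matching rules may be more nuanced.

```python
def resolve_model(model_id, default_provider=None,
                  known_providers=("openai", "anthropic", "openrouter")):
    """Split a model string into (provider, model) following the rules above."""
    prefix, sep, rest = model_id.partition("/")
    # 1. Explicit provider prefix, e.g. "openai/gpt-5.4".
    if sep and prefix in known_providers:
        return prefix, rest
    # 2. Unprefixed ids starting with "claude" are treated as Anthropic models.
    if model_id.startswith("claude"):
        return "anthropic", model_id
    # 3. Otherwise try the configured default provider.
    if default_provider:
        return default_provider, model_id
    # 4. Built-in fallback provider id.
    return "openai", model_id
```

Note that only the first slash separates the provider, so openrouter/deepseek/deepseek-chat-v3-0324 keeps deepseek/deepseek-chat-v3-0324 as the model id in this sketch.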

`runtime-config.json` universal provider example

json

{
  "runtime": {
    "default_provider": "openai_direct",
    "default_model": "openai/gpt-5.4",
    "background_tasks": {
      "provider": "openai_direct",
      "model": "gpt-5.4-mini"
    },
    "recall_embeddings": {
      "provider": "openai_direct",
      "model": "text-embedding-3-small"
    }
  },
  "providers": {
    "openai_direct": {
      "kind": "openai_compatible",
      "base_url": "https://api.openai.com/v1",
      "api_key": "sk-your-openai-key"
    },
    "anthropic_direct": {
      "kind": "anthropic_native",
      "base_url": "https://api.anthropic.com",
      "api_key": "sk-ant-your-anthropic-key"
    },
    "openrouter_direct": {
      "kind": "openrouter",
      "base_url": "https://openrouter.ai/api/v1",
      "api_key": "sk-or-your-openrouter-key"
    },
    "ollama_direct": {
      "kind": "openai_compatible",
      "base_url": "http://localhost:11434/v1",
      "api_key": "ollama"
    }
  },
  "models": {
    "openai_direct/gpt-5.4": {
      "provider": "openai_direct",
      "model": "gpt-5.4"
    },
    "openai_direct/gpt-5.4-mini": {
      "provider": "openai_direct",
      "model": "gpt-5.4-mini"
    },
    "anthropic_direct/claude-sonnet-4-6": {
      "provider": "anthropic_direct",
      "model": "claude-sonnet-4-6"
    },
    "openrouter_direct/openai/gpt-5.4": {
      "provider": "openrouter_direct",
      "model": "openai/gpt-5.4"
    },
    "ollama_direct/qwen2.5:0.5b": {
      "provider": "ollama_direct",
      "model": "qwen2.5:0.5b"
    }
  }
}

Provider kind values supported by the runtime resolver:

holaboss_proxy
openai_compatible
anthropic_native
openrouter
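
Before launching the runtime with a hand-edited file, a quick consistency check can catch typos. The sketch below is an assumption-laden helper, not part of Holaboss: check_config is a hypothetical name, and it only verifies the relationships visible in the example above (provider kinds, model-to-provider references, and the default provider).

```python
SUPPORTED_KINDS = {
    "holaboss_proxy",
    "openai_compatible",
    "anthropic_native",
    "openrouter",
}


def check_config(config: dict) -> list:
    """Return human-readable problems found in a structured provider catalog."""
    problems = []
    providers = config.get("providers", {})
    for provider_id, provider in providers.items():
        kind = provider.get("kind")
        if kind not in SUPPORTED_KINDS:
            problems.append(f"provider {provider_id}: unsupported kind {kind!r}")
    for key, model in config.get("models", {}).items():
        # Every catalog entry should point at a provider defined above.
        if model.get("provider") not in providers:
            problems.append(f"model {key}: unknown provider {model.get('provider')!r}")
    default_provider = config.get("runtime", {}).get("default_provider")
    if default_provider and default_provider not in providers:
        problems.append(f"runtime.default_provider {default_provider!r} is not defined")
    return problems
```

You could load runtime-config.json with json.load and pass the resulting dict to this function; an empty list means none of these particular checks failed.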

Verify Ollama through the desktop UI

This is the simplest end-to-end check for the local ollama_direct path:

Install and start Ollama on your machine.
Pull a small local model:

bash

ollama pull qwen2.5:0.5b

Launch the desktop app.
Open Settings -> Model Providers.
Connect Ollama with:
- base URL: http://localhost:11434/v1
- API key: ollama
- models: qwen2.5:0.5b
Open a workspace chat and select ollama_direct/qwen2.5:0.5b.
Send this prompt:

text

Reply with exactly: OK

Expected result:

the run starts with provider ollama_direct
the model resolves to qwen2.5:0.5b
the assistant replies with OK

If the model does not show up or the request fails, verify Ollama directly first:

bash

curl http://localhost:11434/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer ollama' \
  -d '{"model":"qwen2.5:0.5b","messages":[{"role":"user","content":"Reply with exactly: OK"}],"temperature":0}'
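
If curl is not available, roughly the same request can be made from Python's standard library. This is a sketch that mirrors the curl command above; the function names are hypothetical, and the response parsing assumes Ollama returns an OpenAI-style chat completion body.

```python
import json
import urllib.request


def build_chat_request(base_url="http://localhost:11434/v1", model="qwen2.5:0.5b"):
    """Build the same POST request as the curl command above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "Reply with exactly: OK"}],
        "temperature": 0,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer ollama",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: OpenAI-style shape, reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Calling send(build_chat_request()) requires a running Ollama instance with the model already pulled; a connection error here points at Ollama rather than at the desktop app.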

Environment overrides

bash

export HOLABOSS_MODEL_PROXY_BASE_URL="https://your-proxy.example/api/v1/model-proxy"
export HOLABOSS_SANDBOX_AUTH_TOKEN="your-proxy-token"
export HOLABOSS_DEFAULT_MODEL="anthropic/claude-sonnet-4-20250514"

These environment variables override the file-based values above. sandbox_id still needs to come from runtime-config.json.
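
The override order for the default model can be sketched like this. The function name is hypothetical, and the sketch assumes the legacy shorthand stores default_model at the top level of the file while the structured catalog stores it under runtime; treat both locations as illustrative.

```python
import os

# Baseline default model stated at the top of this page.
BUILTIN_DEFAULT_MODEL = "openai/gpt-5.4"


def effective_default_model(config: dict, env=None) -> str:
    """Resolve the default model: environment, then file, then built-in."""
    env = os.environ if env is None else env
    from_env = env.get("HOLABOSS_DEFAULT_MODEL")
    if from_env:
        return from_env
    # Assumption: structured configs use runtime.default_model and the
    # legacy shorthand uses a top-level default_model key.
    from_file = config.get("runtime", {}).get("default_model") or config.get("default_model")
    return from_file or BUILTIN_DEFAULT_MODEL
```

With HOLABOSS_DEFAULT_MODEL exported as in the example above, this sketch returns anthropic/claude-sonnet-4-20250514 regardless of what the file contains.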
