Model Configuration

Model Settings

Holaboss ships with a default model setup. In most cases, you do not need to edit runtime-config.json by hand, because the desktop app already exposes the main configuration surfaces.

The baseline defaults are:

  • default model: openai/gpt-5.4
  • built-in fallback provider id when no configured default provider applies: openai

Supported provider styles

The OSS runtime supports the following provider kinds:

| Provider kind | Best fit | Notes |
| --- | --- | --- |
| holaboss_proxy | Holaboss-managed setups | Useful when the desktop should talk through the Holaboss proxy layer. |
| openai_compatible | Self-hosted or compatible APIs | Best when you already have an OpenAI-style endpoint. Also covers Ollama. |
| anthropic_native | Anthropic accounts and native workflows | Good when you want the provider's own integration path. |
| openrouter | Model experimentation and broad access | Useful when you want to swap models without changing the rest of the setup. |

The desktop settings UI (Settings -> Model Providers) lets you connect OpenAI, Anthropic, OpenRouter, Gemini, or Ollama without editing files manually.

In-app setup

Holaboss already provides model configuration in the desktop app:

  • open Settings -> Model Providers
  • connect a provider such as OpenAI, Anthropic, OpenRouter, Gemini, or Ollama
  • enter the API key and use the built-in provider defaults or edit the model list for that provider
  • use the Background tasks panel to choose one connected provider and model for recall selection, finalization, and evolve tasks
  • use the Recall embeddings panel to leave vector-assisted recall on Automatic or choose an explicit embedding-capable provider and model
  • changes autosave to runtime-config.json, and the chat model picker uses the configured provider models

When the first provider is connected, the desktop seeds background tasks to that provider and its built-in default background model. For ollama_direct, the provider can be selected, but you must choose a model explicitly before background LLM tasks are enabled.

Customization mode

You can configure the runtime in either of these modes:

  • legacy or proxy shorthand
    • set model_proxy_base_url, auth_token, and default_model
  • structured provider catalog
    • define providers and models, then set runtime.default_provider and runtime.default_model

Runtime URL behavior stays consistent across both modes:

  • if model_proxy_base_url is a proxy root, the runtime appends provider routes such as /openai/v1 or /anthropic/v1
  • direct mode is enabled when you provide a provider endpoint instead of a proxy root
  • OpenAI-compatible direct providers typically use a /v1 endpoint, for example https://api.openai.com/v1
  • Anthropic native direct providers should use the root host, for example https://api.anthropic.com
  • known provider hosts are normalized as needed; for example, a Gemini host root is rewritten to end in /v1beta/openai
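
The URL rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the runtime's actual code: the function names and the PROXY_ROUTES table are hypothetical, only the two routes named above are included, and the Gemini host name is assumed to be Google's public API host.

```python
from urllib.parse import urlsplit

# Hypothetical route table: the text above names only these two suffixes.
PROXY_ROUTES = {"openai": "/openai/v1", "anthropic": "/anthropic/v1"}

# Assumption: the public Gemini API host; the runtime may recognize others.
GEMINI_HOST = "generativelanguage.googleapis.com"


def proxy_endpoint(proxy_root: str, provider: str) -> str:
    """Append the provider route to a proxy root URL."""
    return proxy_root.rstrip("/") + PROXY_ROUTES[provider]


def normalize_direct_endpoint(base_url: str) -> str:
    """Rewrite a bare Gemini host root; leave every other endpoint unchanged."""
    parts = urlsplit(base_url)
    if parts.hostname == GEMINI_HOST and parts.path in ("", "/"):
        return base_url.rstrip("/") + "/v1beta/openai"
    return base_url
```

In this sketch an OpenAI-compatible endpoint such as https://api.openai.com/v1 and the Anthropic root host https://api.anthropic.com pass through untouched, which matches the direct-mode bullets above.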

Where the runtime reads model config

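The runtime resolves model settings from the sources below. The list is not a precedence order: environment variables, when set, override the matching values in runtime-config.json, and built-in defaults apply when neither source provides a value.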

  1. runtime-config.json
  2. environment variables
  3. built-in defaults

By default, runtime-config.json lives at:

  • ${HB_SANDBOX_ROOT}/state/runtime-config.json

You can override that path with:

  • HOLABOSS_RUNTIME_CONFIG_PATH
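
As a rough sketch of the path lookup described above, the following Python helper picks the override when it is set and otherwise falls back to the sandbox default. The function name is hypothetical, and the behavior when HB_SANDBOX_ROOT is missing (a KeyError here) is an assumption, since the source does not say what the runtime does in that case.

```python
import os
from pathlib import Path


def runtime_config_path(env=None) -> Path:
    """Return the runtime-config.json location for the given environment."""
    env = os.environ if env is None else env
    override = env.get("HOLABOSS_RUNTIME_CONFIG_PATH")
    if override:
        # An explicit override wins over the sandbox default.
        return Path(override)
    # Default location: ${HB_SANDBOX_ROOT}/state/runtime-config.json
    return Path(env["HB_SANDBOX_ROOT"]) / "state" / "runtime-config.json"
```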

Important settings

  • model_proxy_base_url
    • legacy or proxy base URL root
  • auth_token
    • proxy token sent as X-API-Key
  • providers.<id>.base_url
    • direct provider endpoint
  • providers.<id>.api_key
    • direct provider credential
  • runtime.background_tasks.provider
    • provider for recall selection, finalization, and evolve tasks
  • runtime.background_tasks.model
    • model id used for that background provider
  • runtime.recall_embeddings.provider
    • optional embedding-capable provider used for vector-assisted recall candidate narrowing
  • runtime.recall_embeddings.model
    • optional embedding model id used for recall embeddings
  • runtime.default_provider
    • configured default provider for unprefixed model ids
  • runtime.default_model
    • default model selection
  • HOLABOSS_DEFAULT_MODEL
    • environment override for the default model

Background task provider defaults

When you choose a provider in the desktop Background tasks panel, the app seeds the model field with these defaults:

  • holaboss_model_proxy: gpt-5.4-mini
  • openai_direct: gpt-5.4-mini
  • anthropic_direct: claude-sonnet-4-6
  • openrouter_direct: openai/gpt-5.4-mini
  • gemini_direct: gemini-2.5-flash
  • minimax_direct: MiniMax-M2.7
  • ollama_direct: no default; choose a model explicitly
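
The seeding behavior can be pictured as a simple lookup. The sketch below is illustrative only: the table copies the defaults listed above, while the function name and the "enabled" field are hypothetical stand-ins for however the desktop tracks whether background LLM tasks can run.

```python
# Defaults copied from the list above; None marks "no default".
BACKGROUND_DEFAULTS = {
    "holaboss_model_proxy": "gpt-5.4-mini",
    "openai_direct": "gpt-5.4-mini",
    "anthropic_direct": "claude-sonnet-4-6",
    "openrouter_direct": "openai/gpt-5.4-mini",
    "gemini_direct": "gemini-2.5-flash",
    "minimax_direct": "MiniMax-M2.7",
    "ollama_direct": None,
}


def seed_background_tasks(provider_id: str) -> dict:
    """Seed the background task settings for a newly chosen provider."""
    model = BACKGROUND_DEFAULTS.get(provider_id)
    # Without a model, background LLM tasks stay disabled until one is chosen.
    return {"provider": provider_id, "model": model, "enabled": model is not None}
```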

Recall embedding defaults

When Recall embeddings is left on Automatic, the runtime chooses the first configured provider that has a built-in embedding default. The current built-in embedding defaults, which also apply when you select one of these providers explicitly, are:

  • openai_direct: text-embedding-3-small
  • openrouter_direct: openai/text-embedding-3-small

Other providers currently need an explicit compatible embedding model selection before vector-assisted recall is enabled.
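
A minimal sketch of the Automatic selection, assuming that "first configured provider" means the first provider in configuration order (the source does not define the ordering). The function name is hypothetical.

```python
from typing import Iterable, Optional, Tuple

# Built-in embedding defaults as listed above.
EMBEDDING_DEFAULTS = {
    "openai_direct": "text-embedding-3-small",
    "openrouter_direct": "openai/text-embedding-3-small",
}


def pick_embedding_provider(configured: Iterable[str]) -> Optional[Tuple[str, str]]:
    """Return (provider, model) for the first provider with a built-in default."""
    for provider_id in configured:
        model = EMBEDDING_DEFAULTS.get(provider_id)
        if model is not None:
            return provider_id, model
    # No provider has a built-in default: vector-assisted recall stays off
    # until an embedding model is selected explicitly.
    return None
```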

Model string format

Use provider-prefixed model ids when you want to be explicit:

  • openai/gpt-5.4
  • anthropic/claude-sonnet-4-20250514
  • openrouter/deepseek/deepseek-chat-v3-0324

The runtime also treats unprefixed claude... model ids as Anthropic models. If a model id is unprefixed and does not start with claude, the runtime first tries the configured default provider. If no configured default provider applies, it falls back to openai/<model>.
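
The resolution order described above can be sketched as follows. This is a simplified illustration, not the runtime's resolver: the function name is hypothetical, any id containing a slash is assumed to be provider-prefixed, and "no configured default provider applies" is reduced to "no default provider was passed in".

```python
from typing import Optional, Tuple


def resolve_model(model_id: str, default_provider: Optional[str] = None) -> Tuple[str, str]:
    """Return (provider, model) for a model string."""
    if "/" in model_id:
        # Split on the first slash only, so nested ids such as
        # "openrouter/deepseek/deepseek-chat-v3-0324" keep their model path.
        provider, model = model_id.split("/", 1)
        return provider, model
    if model_id.startswith("claude"):
        # Unprefixed claude... ids are treated as Anthropic models.
        return "anthropic", model_id
    if default_provider:
        return default_provider, model_id
    # Built-in fallback provider id.
    return "openai", model_id
```

For example, resolve_model("gpt-5.4-mini", "openai_direct") yields the configured default provider, while the same id with no default falls back to openai.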

runtime-config.json universal provider example

json
{
  "runtime": {
    "default_provider": "openai_direct",
    "default_model": "openai/gpt-5.4",
    "background_tasks": {
      "provider": "openai_direct",
      "model": "gpt-5.4-mini"
    },
    "recall_embeddings": {
      "provider": "openai_direct",
      "model": "text-embedding-3-small"
    }
  },
  "providers": {
    "openai_direct": {
      "kind": "openai_compatible",
      "base_url": "https://api.openai.com/v1",
      "api_key": "sk-your-openai-key"
    },
    "anthropic_direct": {
      "kind": "anthropic_native",
      "base_url": "https://api.anthropic.com",
      "api_key": "sk-ant-your-anthropic-key"
    },
    "openrouter_direct": {
      "kind": "openrouter",
      "base_url": "https://openrouter.ai/api/v1",
      "api_key": "sk-or-your-openrouter-key"
    },
    "ollama_direct": {
      "kind": "openai_compatible",
      "base_url": "http://localhost:11434/v1",
      "api_key": "ollama"
    }
  },
  "models": {
    "openai_direct/gpt-5.4": {
      "provider": "openai_direct",
      "model": "gpt-5.4"
    },
    "openai_direct/gpt-5.4-mini": {
      "provider": "openai_direct",
      "model": "gpt-5.4-mini"
    },
    "anthropic_direct/claude-sonnet-4-6": {
      "provider": "anthropic_direct",
      "model": "claude-sonnet-4-6"
    },
    "openrouter_direct/openai/gpt-5.4": {
      "provider": "openrouter_direct",
      "model": "openai/gpt-5.4"
    },
    "ollama_direct/qwen2.5:0.5b": {
      "provider": "ollama_direct",
      "model": "qwen2.5:0.5b"
    }
  }
}

Provider kind values supported by the runtime resolver:

  • holaboss_proxy
  • openai_compatible
  • anthropic_native
  • openrouter
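
If you do edit runtime-config.json by hand, a quick consistency check can catch typos before the runtime loads the file. The checker below is a hypothetical helper, not part of Holaboss; it assumes the file shape shown in the example above and only verifies provider kinds and cross-references. It may report false positives for providers that the runtime supplies implicitly.

```python
SUPPORTED_KINDS = {"holaboss_proxy", "openai_compatible", "anthropic_native", "openrouter"}


def check_config(config: dict) -> list:
    """Return a list of human-readable problems found in a parsed config."""
    problems = []
    providers = config.get("providers", {})

    # Every provider must declare one of the supported kinds.
    for provider_id, provider in providers.items():
        kind = provider.get("kind")
        if kind not in SUPPORTED_KINDS:
            problems.append(f"provider {provider_id}: unsupported kind {kind!r}")

    # Every model entry must point at a defined provider.
    for model_key, entry in config.get("models", {}).items():
        if entry.get("provider") not in providers:
            problems.append(f"model {model_key}: unknown provider {entry.get('provider')!r}")

    # The default provider, when set, must also be defined.
    default_provider = config.get("runtime", {}).get("default_provider")
    if default_provider is not None and default_provider not in providers:
        problems.append(f"runtime.default_provider: unknown provider {default_provider!r}")

    return problems
```

To use it, load the file with json.load and print whatever check_config returns; an empty list means none of these checks failed.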

Verify Ollama through the desktop UI

This is the simplest end-to-end check for the local ollama_direct path:

  1. Install and start Ollama on your machine.
  2. Pull a small local model:
bash
ollama pull qwen2.5:0.5b
  3. Launch the desktop app.
  4. Open Settings -> Model Providers.
  5. Connect Ollama with:
    • base URL: http://localhost:11434/v1
    • API key: ollama
    • models: qwen2.5:0.5b
  6. Open a workspace chat and select ollama_direct/qwen2.5:0.5b.
  7. Send this prompt:
text
Reply with exactly: OK

Expected result:

  • the run starts with provider ollama_direct
  • the model resolves to qwen2.5:0.5b
  • the assistant replies with OK

If the model does not show up or the request fails, verify Ollama directly first:

bash
curl http://localhost:11434/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer ollama' \
  -d '{"model":"qwen2.5:0.5b","messages":[{"role":"user","content":"Reply with exactly: OK"}],"temperature":0}'
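
The same check can be scripted with the Python standard library, which may be convenient on machines without curl. This is a sketch that mirrors the curl request above; the function names are hypothetical, and the response parsing assumes the usual OpenAI-style chat completions shape.

```python
import json
import urllib.request


def build_ollama_check(base_url="http://localhost:11434/v1", model="qwen2.5:0.5b"):
    """Build the same POST request as the curl command above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "Reply with exactly: OK"}],
        "temperature": 0,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": "Bearer ollama"},
        method="POST",
    )


def run_ollama_check(timeout=30.0) -> str:
    """Send the request to a running Ollama and return the assistant text."""
    with urllib.request.urlopen(build_ollama_check(), timeout=timeout) as response:
        body = json.load(response)
    # Assumes an OpenAI-style response: choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

Calling run_ollama_check() requires Ollama to be running locally with the model pulled; a connection error here points at Ollama itself rather than at the desktop configuration.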

Environment overrides

bash
export HOLABOSS_MODEL_PROXY_BASE_URL="https://your-proxy.example/api/v1/model-proxy"
export HOLABOSS_SANDBOX_AUTH_TOKEN="your-proxy-token"
export HOLABOSS_DEFAULT_MODEL="anthropic/claude-sonnet-4-20250514"

These environment variables override the file-based values above. sandbox_id still needs to come from runtime-config.json.
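
The override behavior can be illustrated with a small merge function. Treat it as a sketch under assumptions: the mapping from each variable to a config key is inferred from the setting names in this page rather than confirmed, the file values are shown as a flat dictionary for simplicity, and the function name is hypothetical.

```python
import os

# Assumed mapping from environment variable to file-based setting.
ENV_OVERRIDES = {
    "HOLABOSS_MODEL_PROXY_BASE_URL": "model_proxy_base_url",
    "HOLABOSS_SANDBOX_AUTH_TOKEN": "auth_token",
    "HOLABOSS_DEFAULT_MODEL": "default_model",
}


def apply_env_overrides(file_values: dict, env=None) -> dict:
    """Return file values with any set environment overrides applied on top."""
    env = os.environ if env is None else env
    merged = dict(file_values)
    for variable, key in ENV_OVERRIDES.items():
        value = env.get(variable)
        if value:
            merged[key] = value
    # Keys without an override, such as sandbox_id, pass through from the file.
    return merged
```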