
Slot as model

hal0’s OpenAI-compatible API accepts two kinds of identifiers in the model field:

  • A registry ref, e.g. "qwen2.5-0.5b-instruct-q4_k_m". Picks that exact model file.
  • A slot name, e.g. "primary". Picks whatever model is currently loaded in that slot.

Both routes go through the dispatcher. Slot names are stable; registry refs change every time you pull a newer quant. Most clients should address slots, not registry refs.

Imagine you’re a Python script that wants “the chat model”:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="local")

# Stable across model swaps:
resp = client.chat.completions.create(
    model="primary",
    messages=[{"role": "user", "content": "Hello!"}],
)

Tomorrow you swap primary from a 7B Qwen-Coder to a 70B Hermes-4 and the script still works. Address by slot, and you can swap models independently of clients.

When you want a specific model (benchmarking two quants side-by-side, say), use the registry ref:

resp = client.chat.completions.create(
    model="qwen2.5-0.5b-instruct-q4_k_m",
    messages=[{"role": "user", "content": "Hello!"}],
)

The dispatcher resolves the ref, picks the slot that owns it, and proxies the request. If the model isn’t loaded in any slot, you get a structured model.not_loaded error.
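A client that pins a registry ref may want to tell "known but not loaded" apart from other failures. The sketch below works on an already-parsed JSON error body and assumes that model.not_loaded uses the same {"error": {"code": ...}} envelope shown for model.not_found further down this page. The helper names error_code and should_fall_back_to_slot are hypothetical and not part of hal0.

```python
from typing import Any, Optional


def error_code(body: Any) -> Optional[str]:
    """Return the dotted error code from a parsed error body, or None.

    Assumption: errors look like {"error": {"code": "model.not_loaded", ...}}.
    The inner object alone ({"code": ...}) is accepted too, because some
    client libraries unwrap the envelope before exposing the body.
    """
    if not isinstance(body, dict):
        return None
    inner = body.get("error", body)
    if not isinstance(inner, dict):
        return None
    code = inner.get("code")
    return code if isinstance(code, str) else None


def should_fall_back_to_slot(body: Any) -> bool:
    """True when the ref exists but is not loaded in any slot.

    In that case retrying with a slot name such as "primary" can still
    succeed, whereas model.not_found means the identifier is unknown.
    """
    return error_code(body) == "model.not_loaded"
```

A benchmarking script could call should_fall_back_to_slot on the failed response body and either retry against a slot or stop, depending on whether the exact quant matters for the run.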

GET /v1/models returns both flavours so the client can discover either route:

{
  "object": "list",
  "data": [
    {"id": "primary", "object": "model", "owned_by": "hal0"},
    {"id": "embed", "object": "model", "owned_by": "hal0"},
    {"id": "stt", "object": "model", "owned_by": "hal0"},
    {"id": "tts", "object": "model", "owned_by": "hal0"},
    {"id": "img", "object": "model", "owned_by": "hal0"},
    {"id": "qwen2.5-0.5b-instruct-q4_k_m", "object": "model", "owned_by": "hal0"}
  ]
}

Slot aliases (primary, embed, stt, tts, img) are listed first because they’re stable across model swaps. The slot-scoped form primary:qwen2.5-0.5b-instruct-q4_k_m is also accepted and resolves the registry ref through the named slot.
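Because slot aliases and registry refs share one list, a client that wants to show them separately has to split them itself. The sketch below does this for a parsed /v1/models response. It assumes only the five built-in slot names listed above; the response carries no field that marks an entry as a slot, so custom slot names would have to be passed in explicitly. The name split_model_ids is hypothetical.

```python
from typing import Iterable, List, Tuple

# Built-in slot aliases named in this page; custom slots are not covered.
BUILTIN_SLOTS = ("primary", "embed", "stt", "tts", "img")


def split_model_ids(
    listing: dict, slot_names: Iterable[str] = BUILTIN_SLOTS
) -> Tuple[List[str], List[str]]:
    """Split a parsed /v1/models response into (slot aliases, other ids).

    Order within each list follows the order of the response.
    """
    known = set(slot_names)
    slots: List[str] = []
    refs: List[str] = []
    for entry in listing.get("data", []):
        model_id = entry["id"]
        if model_id in known:
            slots.append(model_id)
        else:
            refs.append(model_id)
    return slots, refs
```

With the openai client from the earlier examples, the input could be built as {"data": [{"id": m.id} for m in client.models.list()]}.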

| You sent… | Dispatcher resolves to… |
|---|---|
| "primary" | The model currently in the primary slot. |
| "embed" | The model currently in the embed slot. |
| "stt" | The model currently in the stt slot. |
| "tts" | The model currently in the tts slot. |
| "img" | The model currently in the img slot. |
| A custom slot name | That slot’s current model. |
| "<slot>:<ref>" | The named slot, resolving the registry ref through it. |
| A registry ref | The slot that currently owns that ref. |
| An external upstream ref | The upstream provider (OpenRouter, Anthropic, OpenAI, …). |
| Anything else | {"error": {"code": "model.not_found", ...}} |
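The resolution rules above can be read as an ordered lookup. The following is an illustrative sketch of that reading, not hal0's actual dispatcher: the DispatcherState structure, the resolve function, the precedence between the rules, and the way upstream refs and unloaded refs are recognised are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class DispatcherState:
    """Hypothetical snapshot of what a dispatcher would need to know."""

    # slot name -> registry ref currently loaded in that slot
    slots: Dict[str, str]
    # refs present in the registry but not loaded in any slot (assumption)
    known_refs: Set[str] = field(default_factory=set)
    # refs served by an external upstream provider (assumption)
    upstream_refs: Set[str] = field(default_factory=set)


def resolve(model: str, state: DispatcherState) -> dict:
    """Map the request's model field to a slot, an upstream, or an error."""
    # 1. Slot name (built-in or custom): use whatever is loaded there.
    if model in state.slots:
        return {"slot": model, "ref": state.slots[model]}

    # 2. Slot-scoped form "<slot>:<ref>": route the ref through that slot.
    slot, sep, ref = model.partition(":")
    if sep and slot in state.slots:
        return {"slot": slot, "ref": ref}

    # 3. Bare registry ref: find the slot that currently owns it.
    for name, loaded in state.slots.items():
        if loaded == model:
            return {"slot": name, "ref": model}

    # 4. External upstream ref: hand off to the provider.
    if model in state.upstream_refs:
        return {"upstream": model}

    # 5. Known but unloaded refs and unknown identifiers get distinct codes.
    code = "model.not_loaded" if model in state.known_refs else "model.not_found"
    return {"error": {"code": code}}
```

Checking slot names first keeps the stable aliases from being shadowed by a registry ref with the same spelling; whether hal0 orders the checks this way is not stated in this page.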