Model registry

The model registry is hal0’s on-disk index of every model it knows about: pulled, verified, and ready to be assigned to a slot. The index file is /var/lib/hal0/registry/registry.toml, and it survives hal0 update. The actual weights live under /mnt/ai-models/ (pull_root is /mnt/ai-models/local/). On a stock hal0 LXC, that path is read-write ZFS storage on the host, so the registry persists across reinstalls without any NFS gymnastics.

Figure: Dashboard /models view showing the registry table, with rows tagged chat, stt, tts, and rerank, plus size and backend per entry.

Slots reference models by short registry refs, not by file paths. That lets you:

  • Swap a slot’s model without rewriting paths.
  • Share a model across two slots (a small chat model serving two endpoints).
  • Update a model file in place (e.g. a newer quant of the same weights) without touching slot config.

The registry is also what the dashboard’s Models view enumerates and what /v1/models reflects to OpenAI-shaped clients.
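
The indirection can be pictured with a minimal sketch. The dictionaries below are hypothetical in-memory stand-ins, not hal0's actual data structures, and the .gguf file name and the "secondary" slot are assumptions made for illustration. The point is only that a slot stores a ref, and the path is looked up at resolve time.

```python
# Hypothetical in-memory view; hal0's real on-disk schema is not shown here.
# The weights file name and the "secondary" slot are assumed for illustration.
registry = {
    "qwen2.5-0.5b-instruct-q4_k_m":
        "/mnt/ai-models/local/qwen2.5-0.5b-instruct-q4_k_m.gguf",
}

# Two slots sharing one model: each slot stores only the registry ref.
slots = {
    "primary": "qwen2.5-0.5b-instruct-q4_k_m",
    "secondary": "qwen2.5-0.5b-instruct-q4_k_m",
}


def resolve(slot_name: str) -> str:
    """Return the weights path for a slot by looking up its registry ref."""
    ref = slots[slot_name]
    try:
        return registry[ref]
    except KeyError:
        raise LookupError(
            f"slot {slot_name!r} points at unknown ref {ref!r}"
        ) from None
```

Because the slots hold no paths, repointing the registry entry at a new file changes what both slots resolve to without touching either slot.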

Each entry is a row in /var/lib/hal0/registry/registry.toml plus the model file(s) on disk under /mnt/ai-models/. A row carries:

  • The registry ref (a slug, e.g. qwen2.5-0.5b-instruct-q4_k_m).
  • The on-disk path to the weights, usually one .gguf under /mnt/ai-models/local/.
  • Source coords (Hugging Face repo + file name when applicable).
  • SHA-256 of the weights.
  • Backend tags (which providers can serve it).
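
As a rough illustration of the fields listed above, an entry could be modelled like this. The class name, field names, and the verify helper are assumptions for the sketch and do not reflect hal0's actual schema or code; only the SHA-256 check itself is standard.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple


@dataclass(frozen=True)
class RegistryEntry:
    # Field names are illustrative, not hal0's real schema.
    ref: str                            # registry slug
    path: Path                          # on-disk weights file
    sha256: str                         # hex digest of the weights
    source_repo: Optional[str] = None   # e.g. a Hugging Face repo
    source_file: Optional[str] = None   # file name within that repo
    backends: Tuple[str, ...] = ()      # providers that can serve it


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weights never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(entry: RegistryEntry) -> bool:
    """True if the weights exist and match the recorded digest."""
    return entry.path.is_file() and file_sha256(entry.path) == entry.sha256.lower()
```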

Entries are written atomically; a failed pull leaves no partial entry, so the registry never lies about what’s on disk.
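
One common way to get this kind of atomicity is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that general technique; it is not taken from hal0's source, and atomic_write_text is a hypothetical helper name.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(target: Path, text: str) -> None:
    """Replace target with text so readers see the old or the new file, never a partial one."""
    target = Path(target)
    # The temp file must live in the same directory (same filesystem)
    # for the final rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())  # push the data to disk before renaming
        os.replace(tmp_name, target)  # atomic when source and target share a filesystem
    except BaseException:
        # On any failure, drop the temp file and leave the old target untouched.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

If the write fails partway, the original file is still intact, which matches the "no partial entry" guarantee described above.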

hal0 model list

Or via the API:

curl http://localhost:8080/v1/models

The /v1/models response is OpenAI-shaped, with one entry per registry model plus one entry per loaded slot name (so model: "primary" works as a model identifier; see Slot as model).
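
For a client that wants the identifiers programmatically, a minimal standard-library sketch follows. It assumes the usual OpenAI list shape, an object with a "data" array whose items carry an "id", and reuses the localhost:8080 address from the curl example; the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def model_ids(payload: dict) -> list:
    """Extract identifiers from an OpenAI-shaped /v1/models payload."""
    return [item["id"] for item in payload.get("data", [])]


def fetch_model_ids(base_url: str = "http://localhost:8080") -> list:
    """Query a running instance; the base URL is taken from the curl example."""
    with urlopen(f"{base_url}/v1/models", timeout=5) as resp:
        return model_ids(json.load(resp))
```

Given the behaviour described above, the returned list would be expected to mix registry refs with loaded slot names such as primary.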

hal0 model assign qwen2.5-0.5b-instruct-q4_k_m --slot primary

This updates the slot’s TOML config (atomically) and restarts the slot through the lifecycle (unloading → warming → ready).

hal0 model rm qwen2.5-0.5b-instruct-q4_k_m

The registry refuses to remove a model that’s currently assigned to a slot. Unload the slot or swap its model first.
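
The guard can be sketched as a check over slot assignments before anything is deleted. This is an illustration of the rule, not hal0's implementation; ModelInUseError, remove_model, and the dictionary shapes are all assumptions.

```python
class ModelInUseError(RuntimeError):
    """Raised when a model is still assigned to at least one slot."""


def remove_model(ref: str, registry: dict, slots: dict) -> str:
    """Remove ref from registry unless a slot still references it.

    registry maps ref -> weights path; slots maps slot name -> ref.
    Both shapes are hypothetical stand-ins for the real state.
    """
    users = sorted(name for name, assigned in slots.items() if assigned == ref)
    if users:
        raise ModelInUseError(
            f"{ref} is assigned to slot(s): {', '.join(users)}"
        )
    # Raises KeyError if the ref was never registered.
    return registry.pop(ref)
```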

A small curated catalogue ships with the installer (src/hal0/registry/curated.py). It’s how the hardware probe picks a starting model for each slot on a fresh install. The catalogue is not the registry; it’s a list of suggestions the FirstRun wizard draws from. Pulled models land in the registry; the curated list is a starting menu.

  • Registry import / export for moving a model collection between machines.
  • Pinning a model to a specific version of its source file.
  • Custom metadata fields per entry.