Primary Deployments
Transitioning to a new model reference system
Historically, models have been managed via GitHub repositories (image models, text models). This approach has had limitations which are mitigated by github actions and manual review, but a more robust solution is needed for scaling to new model categories and more frequent updates.
The nature of the horde is such that we have many (many) third-party integrations which have hardcoded references to these github repositories. To avoid breaking these integrations, we are introducing a new model reference system which supports both the legacy github format and a new v2 format, while also providing a REST API for model reference access. Further, until we completely deprecate the github repositories, the new system will keep the github repositories in sync with the new system. You can see more details about that in the sync readme in the scripts folder.
Adopting the v1 API is not recommended for new integrations, as it will eventually be deprecated. However, existing integrations and drop-in replace their references to github with calls to the v1 API without any other changes. Legacy filenames (stable_diffusion.json for image, db.json for text) are supported and the returned data is in the same format, order, etc, as the github repositories.
Service Architecture Overview
PRIMARY vs REPLICA Mode
The package has two operational modes:
PRIMARY Mode (Server)
You only need this if you are deploying your own Horde
- Authoritative source for model references
- Supports CRUD operations (Create, Read, Update, Delete)
- Can use Redis for distributed caching across multiple workers
- Optionally seeds initial data from GitHub legacy repositories
- Serves REPLICA clients via REST API
REPLICA Mode (Client)
- Fetches model references from PRIMARY API or GitHub
- Read-only access
- Local file-based caching with TTL
- Automatic GitHub fallback if PRIMARY is unavailable
- Used by workers, clients, and integrations
Configuration:
# REPLICA mode (default) - for workers/clients
export HORDE_MODEL_REFERENCE_REPLICATE_MODE=REPLICA
export HORDE_MODEL_REFERENCE_PRIMARY_API_URL=https://models.aihorde.net/
# PRIMARY mode - for server deployment
export HORDE_MODEL_REFERENCE_REPLICATE_MODE=PRIMARY
export HORDE_MODEL_REFERENCE_REDIS__USE_REDIS=true # for multi-worker
Backend Architecture
The package uses a pluggable backend system:
| Backend | Mode | Purpose | Use Case |
|---|---|---|---|
| FileSystemBackend | PRIMARY | Direct file I/O | Single-worker PRIMARY server |
| RedisBackend | PRIMARY | Distributed cache wrapper | Multi-worker PRIMARY server |
| GitHubBackend | REPLICA | GitHub downloads | REPLICA without PRIMARY API |
| HTTPBackend | REPLICA | PRIMARY API + fallback | REPLICA with PRIMARY API (recommended) |
Backend Selection (Automatic):
# Determined by environment variables:
# PRIMARY + Redis → RedisBackend(FileSystemBackend)
# PRIMARY + No Redis → FileSystemBackend
# REPLICA + primary_api_url → HTTPBackend
# REPLICA + No primary_api_url → GitHubBackend
Canonical Format Architecture
The package supports two file formats:
legacy: Original GitHub repository format (flat dictionary, single JSON file per category)v2: New standardized format (enhanced metadata, schema versioning)
The CANONICAL_FORMAT setting determines which format is authoritative:
# v2 format (default/recommended)
export HORDE_MODEL_REFERENCE_CANONICAL_FORMAT=v2
# - v2 API has CRUD operations
# - v1 API is read-only (serves converted data)
# legacy format (for backward compatibility)
export HORDE_MODEL_REFERENCE_CANONICAL_FORMAT=LEGACY
# - v1 API has CRUD operations
# - v2 API is read-only (serves converted data)
Both formats can be read by both API versions - this enables gradual migration.
When PRIMARY mode is configured with a canonical format that allows writes, enable the pending queue for multi-stage approvals. See Pending Queue Architecture for the router endpoints, auth lists, and storage requirements that keep staged changes isolated until they are applied.
Model Categories
The package manages multiple model categories:
from horde_model_reference import MODEL_REFERENCE_CATEGORY
print(list(MODEL_REFERENCE_CATEGORY))
# Output:
# - image_generation: Stable Diffusion, FLUX, etc.
# - text_generation: LLMs (LLaMA, GPT, etc.)
# - clip: Text-image embedding models
# - controlnet: Image control models (canny, depth, etc.)
# - blip: Image captioning models
# - esrgan: Image upscaling models
# - gfpgan: Face restoration models
# - codeformer: Face restoration models
# - safety_checker: NSFW detection models
# - video_generation: Video generation models (future)
# - audio_generation: Audio generation models (future)
# - miscellaneous: Other models