Architecture Overview
Horde Model Reference serves three roles from a single codebase: a Python library for querying model metadata, a FastAPI service for HTTP access and CRUD operations, and a sync tool for keeping legacy GitHub repositories up to date. Each role shares the same backbone modules and backend system but activates different subsystems.
graph LR
subgraph Consumers
A[Python Library]
B[FastAPI Service]
C[Sync Tool]
end
A --> MRM[ModelReferenceManager]
B --> MRM
C --> MRM
MRM --> BE[Pluggable Backend]
BE --> FS[FileSystemBackend]
BE --> Redis[RedisBackend]
BE --> GH[GitHubBackend]
BE --> HTTP[HTTPBackend]
Backbone Modules
Four modules form the foundation that every other part of the codebase depends on. Understanding their layering is essential for navigating the project.
graph TD
MC[meta_consts] --> PC[path_consts]
MC --> MRR[model_reference_records]
PC --> MRM[model_reference_manager]
MRR --> MRM
MRM --> backends
MRM --> service
MRM --> sync
MRM --> analytics
meta_consts.py defines all domain enums (MODEL_REFERENCE_CATEGORY, MODEL_DOMAIN, MODEL_PURPOSE, baselines) and registries. CategoryDescriptor ties each category to its domain, purpose, GitHub source, and JSON filename. Every other module imports from here to route logic and validate data.
path_consts.py provides HordeModelReferencePaths, a singleton that computes every filesystem path (base, legacy, showcase, meta, audit, pending queue) and builds filename/URL dictionaries from CategoryDescriptor data. All backends and the service layer use it to locate files.
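The relationship between a descriptor and the path dictionaries derived from it can be illustrated with a minimal sketch. Everything below is a stand-in rather than the project's actual code: the enum members, the field names on `CategoryDescriptor`, the `example-org/...` repository names, and the `build_file_paths` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Dict, Optional


class ModelCategory(str, Enum):
    """Hypothetical stand-in for MODEL_REFERENCE_CATEGORY."""

    IMAGE_GENERATION = "image_generation"
    TEXT_GENERATION = "text_generation"


@dataclass(frozen=True)
class CategoryDescriptor:
    """Ties one category to its routing metadata (field names are assumed)."""

    category: ModelCategory
    domain: str
    purpose: str
    filename: str
    github_repo: Optional[str] = None  # None when there is no GitHub source


# Placeholder registry; the repository names are not real.
DESCRIPTORS: Dict[ModelCategory, CategoryDescriptor] = {
    d.category: d
    for d in (
        CategoryDescriptor(
            ModelCategory.IMAGE_GENERATION,
            domain="image",
            purpose="generation",
            filename="image_generation.json",
            github_repo="example-org/image-models",
        ),
        CategoryDescriptor(
            ModelCategory.TEXT_GENERATION,
            domain="text",
            purpose="generation",
            filename="text_generation.json",
            github_repo="example-org/text-models",
        ),
    )
}


def build_file_paths(base_path: Path) -> Dict[ModelCategory, Path]:
    """Derive one file path per category from the descriptor registry."""
    return {cat: base_path / desc.filename for cat, desc in DESCRIPTORS.items()}
```

Because the path dictionary is computed from the descriptors, adding a category in one place is enough for every consumer of the paths to see it.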
model_reference_records.py contains the Pydantic model hierarchy (GenericModelRecord and its specialized subclasses) and the @register_record_type(category) decorator that populates MODEL_RECORD_TYPE_LOOKUP. This is the schema contract that backends write to and consumers read from.
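A decorator-populated registry of this kind might look roughly like the sketch below. It uses dataclasses in place of Pydantic models so that it runs with the standard library alone; the record fields, the string category keys, and the `record_type_for` helper with its fallback to the generic record are assumptions, not the project's actual behavior.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Type


@dataclass
class GenericModelRecord:
    """Simplified stand-in for the Pydantic base record."""

    name: str
    description: str = ""


# Maps a category key to the record class that validates it.
MODEL_RECORD_TYPE_LOOKUP: Dict[str, Type[GenericModelRecord]] = {}


def register_record_type(
    category: str,
) -> Callable[[Type[GenericModelRecord]], Type[GenericModelRecord]]:
    """Class decorator that registers a record type under a category."""

    def decorator(cls: Type[GenericModelRecord]) -> Type[GenericModelRecord]:
        MODEL_RECORD_TYPE_LOOKUP[category] = cls
        return cls  # return the class unchanged so it stays usable directly

    return decorator


@register_record_type("image_generation")
@dataclass
class ImageGenerationModelRecord(GenericModelRecord):
    baseline: str = "unknown"  # hypothetical specialized field


def record_type_for(category: str) -> Type[GenericModelRecord]:
    """Hypothetical lookup helper; falling back to the generic record is assumed."""
    return MODEL_RECORD_TYPE_LOOKUP.get(category, GenericModelRecord)
```

Registration happens at import time, so a backend only needs the category key to find the class it should parse records into.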
model_reference_manager.py hosts the ModelReferenceManager singleton, which orchestrates the read/write lifecycle. It selects the backend, wires audit and pending-queue services, and exposes the public API (get_all_model_references(), get_model_reference(category), get_model(category, name)) in both sync and async variants.
Subsystem Directory Map
| Directory | Purpose |
|---|---|
| backends/ | Pluggable data-source backends (filesystem, Redis, GitHub, HTTP) |
| service/ | FastAPI app factory, v1/v2 routers, statistics and pending-queue endpoints |
| legacy/ | Legacy format download, conversion, and validation |
| audit/ | Append-only audit trail (events, writer, reader, replay) |
| pending_queue/ | Propose / approve / apply change queue |
| analytics/ | Statistics computation, caching, audit analysis, text model parsing |
| sync/ | GitHub synchronization (comparator, PR creation, watch mode) |
| integrations/ | AI-Horde public API client, runtime data merger |
Backend Selection
The manager auto-selects a backend based on REPLICATE_MODE and Redis configuration:
| Configuration | Backend |
|---|---|
| PRIMARY without Redis | FileSystemBackend |
| PRIMARY with Redis | RedisBackend wrapping FileSystemBackend |
| REPLICA with primary_api_url | HTTPBackend (PRIMARY API + GitHub fallback) |
| REPLICA without primary_api_url | GitHubBackend only |
All backends implement the ModelReferenceBackend ABC. Capability checks like supports_writes() and supports_legacy_writes() let callers determine what operations are available at runtime.
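The selection rules in the table can be sketched as a single function. This is an illustration only: the constructor signatures, the `select_backend` name, the `use_redis` flag, and the choice to make the GitHub and HTTP backends report `supports_writes()` as `False` are assumptions, and the classes are empty stand-ins for the real backends.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Optional


class ReplicateMode(str, Enum):
    PRIMARY = "PRIMARY"
    REPLICA = "REPLICA"


class ModelReferenceBackend(ABC):
    @abstractmethod
    def supports_writes(self) -> bool:
        """Report whether this backend can persist changes."""


class FileSystemBackend(ModelReferenceBackend):
    def supports_writes(self) -> bool:
        return True


class RedisBackend(ModelReferenceBackend):
    def __init__(self, wrapped: ModelReferenceBackend) -> None:
        self.wrapped = wrapped  # consulted on a cache miss

    def supports_writes(self) -> bool:
        return self.wrapped.supports_writes()


class GitHubBackend(ModelReferenceBackend):
    def supports_writes(self) -> bool:
        return False  # assumed read-only in this sketch


class HTTPBackend(ModelReferenceBackend):
    def __init__(self, primary_api_url: str, fallback: ModelReferenceBackend) -> None:
        self.primary_api_url = primary_api_url
        self.fallback = fallback

    def supports_writes(self) -> bool:
        return False  # assumed read-only in this sketch


def select_backend(
    mode: ReplicateMode,
    use_redis: bool = False,
    primary_api_url: Optional[str] = None,
) -> ModelReferenceBackend:
    """Mirror the four rows of the configuration table."""
    if mode is ReplicateMode.PRIMARY:
        filesystem = FileSystemBackend()
        return RedisBackend(filesystem) if use_redis else filesystem
    if primary_api_url:
        return HTTPBackend(primary_api_url, fallback=GitHubBackend())
    return GitHubBackend()
```

A caller would then branch on the capability check rather than on the concrete class, for example `if backend.supports_writes(): ...`.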
Settings and Configuration
Configuration is environment-based via Pydantic Settings with the HORDE_MODEL_REFERENCE_ prefix. The settings singleton validates mode/backend combinations at startup and logs warnings for invalid combinations (e.g., REPLICA with Redis enabled). Cross-project settings are imported from haidra_core.
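To show the idea of prefixed environment settings with combination checks, here is a standard-library sketch that does not use Pydantic Settings. Only the `HORDE_MODEL_REFERENCE_` prefix and the REPLICATE_MODE name come from the text above; the `REDIS_ENABLED` variable, the default values, and the function names are hypothetical.

```python
import logging
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional

ENV_PREFIX = "HORDE_MODEL_REFERENCE_"
logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class SketchSettings:
    replicate_mode: str = "REPLICA"  # assumed default
    redis_enabled: bool = False  # hypothetical flag


def validation_warnings(settings: SketchSettings) -> List[str]:
    """Return a warning for each questionable mode/backend combination."""
    warnings = []
    if settings.replicate_mode == "REPLICA" and settings.redis_enabled:
        warnings.append("Redis is enabled in REPLICA mode")
    return warnings


def load_settings(environ: Optional[Mapping[str, str]] = None) -> SketchSettings:
    """Read prefixed variables, validate them, and log any warnings."""
    env = os.environ if environ is None else environ
    mode = env.get(ENV_PREFIX + "REPLICATE_MODE", "REPLICA").upper()
    if mode not in {"PRIMARY", "REPLICA"}:
        raise ValueError(f"unknown replicate mode: {mode!r}")
    raw_redis = env.get(ENV_PREFIX + "REDIS_ENABLED", "false")
    settings = SketchSettings(
        replicate_mode=mode,
        redis_enabled=raw_redis.strip().lower() in {"1", "true", "yes"},
    )
    for message in validation_warnings(settings):
        logger.warning(message)
    return settings
```

Passing a mapping instead of reading `os.environ` directly keeps the loader easy to exercise in tests.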
See Canonical Format for how the CANONICAL_FORMAT setting controls API write routing, and Primary Deployments for deployment-specific configuration.
Singleton Pattern
Both ModelReferenceManager and LegacyReferenceDownloadManager use a singleton pattern where the first instantiation locks all parameters. Subsequent instantiations with different parameters raise RuntimeError. This prevents multiple concurrent downloads, inconsistent base paths, and cache inconsistencies.
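The parameter-locking behavior can be sketched as follows. The class name and the keyword-argument interface are illustrative, and treating a call with no arguments as "return the existing instance" is an assumption; the real managers may handle that case differently.

```python
import threading
from typing import Any, Dict, Optional


class ParameterLockedSingleton:
    """First instantiation fixes the parameters; conflicting ones raise."""

    _instance: Optional["ParameterLockedSingleton"] = None
    _locked_params: Optional[Dict[str, Any]] = None
    _lock = threading.Lock()

    def __new__(cls, **params: Any) -> "ParameterLockedSingleton":
        with cls._lock:  # guard against two threads racing to create it
            if cls._instance is None:
                instance = super().__new__(cls)
                instance.params = dict(params)
                cls._instance = instance
                cls._locked_params = dict(params)
            elif params and params != cls._locked_params:
                raise RuntimeError(
                    f"singleton already created with {cls._locked_params}, "
                    f"refusing conflicting parameters {params}"
                )
            return cls._instance

    def __init__(self, **params: Any) -> None:
        # State is set once in __new__; nothing to do on repeat calls.
        pass
```

Raising instead of silently returning the existing instance surfaces configuration mistakes early, which is the point of locking the parameters.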
Caching Layers
Depending on the active backend, multiple caching layers may be stacked:
- ModelReferenceManager — top-level in-memory cache with TTL (wraps any backend)
- FileSystemBackend — file mtime tracking plus in-memory per-category cache
- RedisBackend — Redis shared cache delegating to FileSystemBackend on miss
- GitHubBackend — download, convert, write to disk, then in-memory cache
- HTTPBackend — in-memory cache with PRIMARY API as source and GitHub fallback
Stacking these layers lets repeated reads be served from memory, while each layer still falls back to its underlying source (disk, Redis, the PRIMARY API, or GitHub) on a miss or when cached data goes stale.
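As an illustration of the top-level layer, a TTL cache that wraps any loader function might be sketched like this. The `TTLCache` class, its method names, and the injectable clock are assumptions for the example and do not reflect the manager's actual internals.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory per-category cache that reloads entries older than the TTL."""

    def __init__(
        self,
        loader: Callable[[str], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader  # e.g. a backend's fetch function
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, category: str) -> Any:
        now = self._clock()
        entry = self._entries.get(category)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh hit, the backend is not touched
        value = self._loader(category)  # miss or stale: go to the backend
        self._entries[category] = (now, value)
        return value

    def invalidate(self, category: str) -> None:
        """Drop one category, for example after a write."""
        self._entries.pop(category, None)
```

Wrapping a backend then amounts to passing its fetch function as the loader, so the same cache works regardless of which backend is active.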