Model Reference Backend
Overview
ModelReferenceBackend is the abstract base class that defines the interface all backend implementations must fulfill. It establishes the contract for fetching, caching, and managing model reference data from various sources (GitHub, filesystem, HTTP APIs, databases, etc.).
Key responsibilities:
- Define the interface for data fetching (sync and async)
- Specify cache refresh semantics
- Provide hooks for optional features (writes, health checks, statistics)
- Enable pluggable backend architecture
Implementations:
- ReplicaBackendBase - Abstract base with caching infrastructure
- HTTPBackend - Fetches from PRIMARY API with GitHub fallback
- GitHubBackend - Downloads from GitHub repositories
- FileSystemBackend - Reads/writes local filesystem files
- RedisBackend - Uses Redis for distributed caching (PRIMARY mode)
Implementation Checklist
See ModelReferenceBackend for full details.
When creating a new backend implementation:
Required Implementations
- __init__() - Initialize with appropriate mode (call super().__init__(mode))
- fetch_category() - Sync data fetching
- fetch_all_categories() - Sync batch fetching
- fetch_category_async() - Async data fetching
- fetch_all_categories_async() - Async batch fetching
- needs_refresh() - Staleness detection (auto-provided by ReplicaBackendBase)
- _mark_stale_impl() - Backend-specific staleness marking (auto-provided by ReplicaBackendBase)
- get_category_file_path() - Return file path or None
- get_all_category_file_paths() - Return all file paths
- get_legacy_json() - Legacy format retrieval
- get_legacy_json_string() - Legacy format string retrieval
Note:
ModelReferenceBackend declares needs_refresh() and _mark_stale_impl() as abstract, but ReplicaBackendBase supplies both implementations. If you subclass ReplicaBackendBase (the recommended model), you only need to implement the fetching and file-path/legacy retrieval methods listed above.
Optional Implementations
- supports_writes() + update_model() + delete_model() - If backend supports v2 writes
- update_model_from_base_model() - Automatically provided if supports_writes() returns True
- supports_legacy_writes() + update_model_legacy() + delete_model_legacy() - If legacy writes needed
- update_model_legacy_from_base_model() - Automatically provided if supports_legacy_writes() returns True
- supports_cache_warming() + warm_cache() + warm_cache_async() - If cache warming supported
- supports_health_checks() + health_check() - If health monitoring needed
- supports_statistics() + get_statistics() - If statistics tracking desired
Best Practices
1. Extend ReplicaBackendBase for Caching
Don't implement ModelReferenceBackend directly. Use ReplicaBackendBase, which provides:
- TTL-based caching
- File mtime validation
- Thread-safe locks
- Cache helper methods
- _fetch_with_cache() to remove boilerplate around cache lookups
The notable exception would be backends that are themselves caching layers (e.g. RedisBackend).
See the ReplicaBackendBase documentation for details.
2. Use _fetch_with_cache() When Possible
If your backend simply needs to "return cached data unless forced to refetch, otherwise fetch and store", call _fetch_with_cache(category, fetch_fn, force_refresh=...). Provide a callable that performs the actual fetch and returns the parsed payload (or None). The helper checks _get_from_cache(), executes the callable on cache miss, stores the result via _store_in_cache(), and returns it. Use the more explicit patterns (locks, download + load, etc.) only when you need additional coordination around the fetch flow.
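The helper flow described above can be sketched roughly as follows. This is a self-contained stand-in, not the actual ReplicaBackendBase code: the class SketchCachingBackend and its dict-based cache are hypothetical, and only the method names _fetch_with_cache(), _get_from_cache(), and _store_in_cache() come from the text. Skipping the store when the callable returns None is an assumption.

```python
from __future__ import annotations

from typing import Any, Callable, Dict, Optional


class SketchCachingBackend:
    """Hypothetical stand-in that mimics the cache-then-fetch flow."""

    def __init__(self) -> None:
        # A plain dict stands in for the real TTL-aware cache.
        self._cache: Dict[str, Dict[str, Any]] = {}

    def _get_from_cache(self, category: str) -> Optional[Dict[str, Any]]:
        return self._cache.get(category)

    def _store_in_cache(self, category: str, data: Dict[str, Any]) -> None:
        self._cache[category] = data

    def _fetch_with_cache(
        self,
        category: str,
        fetch_fn: Callable[[], Optional[Dict[str, Any]]],
        force_refresh: bool = False,
    ) -> Optional[Dict[str, Any]]:
        # 1. Serve from cache unless the caller forces a refetch.
        if not force_refresh:
            cached = self._get_from_cache(category)
            if cached is not None:
                return cached
        # 2. Cache miss (or forced refresh): run the real fetch.
        data = fetch_fn()
        # 3. Store only successful results (assumption: None is not cached).
        if data is not None:
            self._store_in_cache(category, data)
        return data
```

A subclass would typically pass a small closure as fetch_fn, for example one that downloads and parses a single category file, and leave all cache bookkeeping to the helper.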
3. Honor force_refresh Parameter
Always respect the force_refresh parameter to bypass caches. See the fetch_category() documentation for requirements.
4. Handle Errors Gracefully
Return None on errors rather than raising exceptions from fetch methods. This allows callers to handle missing data gracefully.
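One way to follow this rule is to catch the expected failure types at the edge of the fetch and log them. The sketch below is illustrative only: fetch_or_none() and its read_bytes callable are hypothetical names, and the choice of which exceptions to catch is an assumption that a real backend would adapt to its data source.

```python
from __future__ import annotations

import json
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger(__name__)


def fetch_or_none(read_bytes: Callable[[], bytes]) -> Optional[Dict[str, Any]]:
    """Run a byte-returning fetch and parse it as JSON, returning None on failure."""
    try:
        payload = json.loads(read_bytes().decode("utf-8"))
    except (OSError, ValueError) as exc:
        # OSError covers I/O failures; ValueError covers invalid UTF-8 and invalid JSON.
        logger.warning("Fetch failed, returning None: %s", exc)
        return None
    if not isinstance(payload, dict):
        # A category payload is assumed to be a JSON object.
        logger.warning("Unexpected payload type %s, returning None", type(payload).__name__)
        return None
    return payload
```

Logging the failure before returning None keeps the error visible to operators even though callers only see missing data.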
5. Use Async Properly
In async methods, use async I/O and concurrent operations with asyncio.gather(). See fetch_all_categories_async() for implementation examples.
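As a rough illustration of the concurrent pattern, the sketch below fetches several categories at once with asyncio.gather(). The functions fetch_one() and fetch_all() are hypothetical, and asyncio.sleep() stands in for real async I/O such as an HTTP request.

```python
from __future__ import annotations

import asyncio
from typing import Any, Dict, List, Optional


async def fetch_one(category: str) -> Optional[Dict[str, Any]]:
    """Hypothetical single-category fetch; the sleep simulates network latency."""
    await asyncio.sleep(0.01)
    if category == "broken":
        # Failures surface as None, matching the error-handling guidance.
        return None
    return {"category": category}


async def fetch_all(categories: List[str]) -> Dict[str, Optional[Dict[str, Any]]]:
    """Fetch every category concurrently instead of awaiting them one by one."""
    results = await asyncio.gather(*(fetch_one(c) for c in categories))
    # gather() preserves input order, so zip pairs each category with its result.
    return dict(zip(categories, results))


if __name__ == "__main__":
    print(asyncio.run(fetch_all(["image_generation", "broken"])))
```

Because each fetch returns None instead of raising, one failing category does not cancel the others in the gather() call.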
6. Implement Feature Detection
Always implement supports_*() methods before feature methods:
- supports_writes() before write operations
- supports_legacy_writes() before legacy operations
- supports_cache_warming() before cache warming
- supports_health_checks() before health checks
- supports_statistics() before statistics
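From the caller's side, feature detection might look like the sketch below. The two backend classes and try_update() are hypothetical, and update_model() is given a simplified two-argument signature for illustration; the real method signature is not shown in this document.

```python
from __future__ import annotations

from typing import Any, Dict


class ReadOnlySketchBackend:
    """Hypothetical backend that does not support writes."""

    def supports_writes(self) -> bool:
        return False

    def update_model(self, name: str, record: Dict[str, Any]) -> None:
        raise NotImplementedError("writes are not supported")


class WritableSketchBackend:
    """Hypothetical backend that stores models in memory."""

    def __init__(self) -> None:
        self.models: Dict[str, Dict[str, Any]] = {}

    def supports_writes(self) -> bool:
        return True

    def update_model(self, name: str, record: Dict[str, Any]) -> None:
        self.models[name] = record


def try_update(backend: Any, name: str, record: Dict[str, Any]) -> bool:
    """Check the detection method first, then call the feature method."""
    if not backend.supports_writes():
        return False
    backend.update_model(name, record)
    return True
```

Checking supports_writes() first lets callers skip unsupported backends without relying on exceptions for control flow.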
7. Document Your Backend
Include clear docstrings explaining:
- What data source the backend uses
- What modes it supports (PRIMARY/REPLICA)
- What optional features it provides
- Any special configuration requirements
Important Design Constraints
These constraints are validated by the test suite:
1. Force Refresh Parameter
All fetch methods must support force_refresh to bypass backend-level caching. When True, perform a fresh fetch regardless of cache state.
2. Redownload Parameter for Legacy Methods
Legacy JSON methods should support the redownload parameter (analogous to force_refresh).
3. Async and Sync Cache Consistency
If your backend caches data, ensure async and sync methods share the same cache. ReplicaBackendBase handles this automatically.
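For backends that cannot rely on ReplicaBackendBase, the idea can be sketched as a single cache dict guarded by one lock and used by both code paths. Everything here is hypothetical (SharedCacheSketchBackend, _load(), the load_calls counter); it only illustrates that the sync and async methods read and write the same storage.

```python
from __future__ import annotations

import asyncio
import threading
from typing import Any, Dict


class SharedCacheSketchBackend:
    """Hypothetical backend whose sync and async fetches share one cache."""

    def __init__(self) -> None:
        self._cache: Dict[str, Dict[str, Any]] = {}
        self._lock = threading.Lock()
        self.load_calls = 0  # counts real fetches, useful in tests

    def _load(self, category: str) -> Dict[str, Any]:
        # Stand-in for the real (blocking) data source access.
        self.load_calls += 1
        return {"category": category}

    def fetch_category(self, category: str, force_refresh: bool = False) -> Dict[str, Any]:
        with self._lock:
            if not force_refresh and category in self._cache:
                return self._cache[category]
        data = self._load(category)
        with self._lock:
            self._cache[category] = data
        return data

    async def fetch_category_async(self, category: str, force_refresh: bool = False) -> Dict[str, Any]:
        # The lock is held only briefly and never across an await.
        with self._lock:
            if not force_refresh and category in self._cache:
                return self._cache[category]
        data = await asyncio.to_thread(self._load, category)
        with self._lock:
            self._cache[category] = data
        return data
```

With this layout, data fetched through the async path is immediately visible to the sync path and vice versa.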
4. Error Handling
Methods should handle errors gracefully and return None rather than raising exceptions.
5. Refresh Semantics
The needs_refresh() method should indicate when existing cached data has become stale, NOT when initial fetch is needed. See the method documentation for details.
6. Statistics Tracking (Optional)
If implementing supports_statistics(), track meaningful metrics like fetch counts, cache hits, fallback usage, error counts, and response times.
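A small counter object is often enough to back get_statistics(). The sketch below is one possible shape, assuming hypothetical names throughout (FetchStats, record_fetch(), record_cache_hit(), and the key names in as_dict()); the actual statistics format expected by the library is not specified here.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class FetchStats:
    """Hypothetical container for the metrics listed above."""

    fetch_count: int = 0
    cache_hits: int = 0
    fallback_uses: int = 0
    error_count: int = 0
    total_response_seconds: float = 0.0

    def record_cache_hit(self) -> None:
        self.cache_hits += 1

    def record_fetch(self, seconds: float, ok: bool, used_fallback: bool = False) -> None:
        """Record one real fetch, its duration, and its outcome."""
        self.fetch_count += 1
        self.total_response_seconds += seconds
        if not ok:
            self.error_count += 1
        if used_fallback:
            self.fallback_uses += 1

    def as_dict(self) -> Dict[str, Any]:
        """Return a plain dict, adding a derived average response time."""
        data = asdict(self)
        data["avg_response_seconds"] = (
            self.total_response_seconds / self.fetch_count if self.fetch_count else 0.0
        )
        return data
```

A backend could keep one such object per instance and return as_dict() from get_statistics().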
Audit Trail and Replay
The PRIMARY filesystem backend emits append-only JSONL audit events whenever a legacy record is created, updated, or deleted. Logs are written under horde_model_reference_paths.audit_path using the structure audit/<domain>/<category>/audit-000001.jsonl. Each line is a serialized AuditEvent that includes the operation, model name, logical Horde user id, and payload snapshot or delta.
Inspecting Events
Use the new scripts/audit_replay.py helper to stream events without writing ad-hoc parsers:
python scripts/audit_replay.py image_generation --domain legacy --start-event-id 10 --end-event-id 20 --pretty
Flags allow filtering by domain, category, specific model names, event id ranges, or timestamp ranges. The default output mode prints JSON lines for each matching event; pass --output state to reconstruct the final state of the selected category using the embedded AuditTrailReader and AuditReplayer.
Example to rebuild the current state for a subset of models:
These utilities operate entirely on the JSONL segments and do not require the service to be running, making them suitable for offline investigations or recovery workflows. Configure audit behavior via the HORDE_MODEL_REFERENCE_AUDIT__* environment variables (e.g. AUDIT__MAX_SEGMENT_BYTES, AUDIT__ROOT_PATH_OVERRIDE), and see Audit Trail Best Practices for operational tips.
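To make the replay idea concrete, here is a simplified sketch of rebuilding state from JSONL lines. It is not the AuditReplayer implementation: the field names operation, model_name, and payload, and the operation values create, update, and delete, are assumptions about the serialized AuditEvent, and the sketch treats every payload as a full snapshot rather than a delta.

```python
from __future__ import annotations

import json
from typing import Any, Dict, Iterable, Optional, Set


def replay_events(
    lines: Iterable[str],
    only_models: Optional[Set[str]] = None,
) -> Dict[str, Any]:
    """Fold JSONL audit lines into a final {model_name: payload} state."""
    state: Dict[str, Any] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        event = json.loads(line)
        name = event["model_name"]  # assumed field name
        if only_models is not None and name not in only_models:
            continue
        if event["operation"] == "delete":  # assumed operation value
            state.pop(name, None)
        else:
            # "create" and "update" both overwrite with the snapshot payload.
            state[name] = event["payload"]
    return state
```

Because events are append-only and ordered, applying them in file order yields the final state; for real investigations, prefer scripts/audit_replay.py, which uses the actual event schema.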
Pending Queue Apply Workflow
PRIMARY deployments can gate all v2 writes through the pending queue to ensure multi-person review before model metadata is promoted. The queue keeps staged edits out of read APIs until an approver applies the change, and all audit trail writes continue to flow through PendingQueueService rather than the HTTP routers.
For an operator-focused playbook (storage layout, canonical format behavior, router entry points, and troubleshooting) see Pending Queue Architecture.
Deployment Constraints and Storage Isolation
- Enable the workflow by setting HORDE_MODEL_REFERENCE_PENDING_QUEUE__ENABLED=true while HORDE_MODEL_REFERENCE_REPLICATE_MODE=PRIMARY. REPLICA nodes ignore the queue entirely and always treat v2 APIs as read-only.
- Queue persistence defaults to <cache_home>/pending_queue, but production deployments should configure HORDE_MODEL_REFERENCE_PENDING_QUEUE__ROOT_PATH_OVERRIDE (or adjust ...RELATIVE_SUBDIR) so each deployment, environment, or test run has a dedicated directory. This mirrors the test fixture override that prevents cross-talk between suites.
- Pending queue files are distinct from audit trail logs. Never co-locate pending_queue data under the audit path; the audit JSONL stream remains the only canonical record of applied operations.
Auth Lists and Workflow Roles
- Requestors submit batches via the write APIs once their Horde user id appears in HORDE_MODEL_REFERENCE_PENDING_QUEUE__REQUESTOR_IDS. Approvers must include the requestor IDs and are configured with ...APPROVER_IDS so approval permissions are a superset of submission permissions.
- Provide these list settings as JSON arrays (e.g. ["user_a","user_b"]) when using environment variables. Use __ (double underscore) to separate nesting levels from the field name when setting nested model fields via environment variables.
- Because PRIMARY mode is the authoritative source, always double-check that queue approvers can reach the deployment that owns the filesystem backend; REPLICA nodes cannot apply or approve changes.
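To show what a JSON-array setting looks like once parsed, the sketch below reads one such variable by hand. The helper read_id_list() is hypothetical and only illustrates the value format; the service itself is assumed to parse these settings through its own configuration layer.

```python
from __future__ import annotations

import json
import os
from typing import Mapping, Set


def read_id_list(var_name: str, environ: Mapping[str, str] = os.environ) -> Set[str]:
    """Parse an environment variable holding a JSON array of user ids."""
    raw = environ.get(var_name)
    if raw is None:
        return set()  # unset variable: no ids configured
    parsed = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(parsed, list):
        raise ValueError(f"{var_name} must be a JSON array, got {type(parsed).__name__}")
    return {str(item) for item in parsed}


if __name__ == "__main__":
    example_env = {
        "HORDE_MODEL_REFERENCE_PENDING_QUEUE__REQUESTOR_IDS": '["user_a","user_b"]',
    }
    print(read_id_list("HORDE_MODEL_REFERENCE_PENDING_QUEUE__REQUESTOR_IDS", example_env))
```

Note that a comma-separated value such as user_a,user_b is not valid JSON and is rejected by this sketch, which is why the array syntax matters.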
HTTP Apply Workflow
- The pending_queue router registers before category routes and exposes GET /pending_queue/changes, GET /pending_queue/changes/{id}, POST /pending_queue/batches, POST /pending_queue/changes/{id}/apply, and POST /pending_queue/apply.
- Every endpoint enforces authenticate_queue_approver, assert_v2_write_enabled, and require_pending_queue_service, ensuring only PRIMARY deployments with pending-queue enabled and authorized users can mutate state.
- POST /pending_queue/changes/{id}/apply performs a single apply by delegating to apply_pending_change(), which validates approval status, writes through the filesystem backend, marks the record as applied, and allows the backend to call mark_stale() so caches refresh on the next read.
- POST /pending_queue/apply accepts { "change_ids": [...], "job_id": "..." }, processes IDs sequentially via apply_pending_changes(), and stops on the first backend failure. The response reports applied_change_ids, failed_change_ids, and serialized records so operators can retry without guessing intermediate state.
- Router responses rely on .model_dump(..., exclude_none=True) to prevent accidental audit duplication. All audit log writes remain in PendingQueueService, which already emits JSONL events alongside standard backend operations.
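The sequential, stop-on-first-failure behavior of the batch endpoint can be sketched as below. This is not the apply_pending_changes() implementation: apply_changes_sequentially() and the apply_one callable are hypothetical, and treating any exception as a backend failure is an assumption. Only the response keys applied_change_ids and failed_change_ids come from the text.

```python
from __future__ import annotations

from typing import Callable, Dict, Iterable, List


def apply_changes_sequentially(
    change_ids: Iterable[str],
    apply_one: Callable[[str], None],
) -> Dict[str, List[str]]:
    """Apply changes in order and stop at the first failure."""
    applied: List[str] = []
    failed: List[str] = []
    for change_id in change_ids:
        try:
            apply_one(change_id)
        except Exception:
            # Record the failing id and leave later ids untouched.
            failed.append(change_id)
            break
        applied.append(change_id)
    return {"applied_change_ids": applied, "failed_change_ids": failed}
```

Since processing halts at the first failure, any id that appears in neither list was never attempted and can be resubmitted as-is.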
Operational Guardrails
- Pending queue data never feeds read APIs until a change transitions to applied. If you observe pending data leaking, verify that cache directories differ per deployment and that only PRIMARY mode has writes enabled.
- The pending queue is operated via HTTP endpoints only. On-call engineers should use the frontend UI or directly call the HTTP endpoints with the same payload the UI would send. Always include job_id so audit investigations can pair queue actions with user intent.
Testing Your Backend
When implementing a new backend, ensure you test:
Core Functionality Tests
- Fetch operations:
  - fetch_category() returns correct data
  - fetch_category() returns None for unavailable categories
  - fetch_all_categories() returns dict with all categories
  - Async variants behave identically to sync variants
- Cache behavior:
  - First fetch populates cache
  - Second fetch uses cached data (verify with call counters)
  - force_refresh=True bypasses cache
  - TTL expiration triggers refetch
  - mark_stale() invalidates cache
- Helper methods (if using ReplicaBackendBase):
  - has_cached_data() returns False before first fetch, True after
  - should_fetch_data() returns True when cache is invalid or stale
  - needs_refresh() returns False for initial state, True for stale data
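The call-counter and TTL checks can be exercised without real waiting by injecting a clock. The sketch below uses a hypothetical TTLCacheSketch class rather than ReplicaBackendBase, so the constructor arguments and get() method are assumptions; the testing technique is what carries over.

```python
from __future__ import annotations

from typing import Any, Callable, Dict, Tuple


class TTLCacheSketch:
    """Hypothetical TTL cache with an injectable clock for deterministic tests."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float]) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, fetch_fn: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: no fetch
        value = fetch_fn()
        self._entries[key] = (now, value)
        return value


def test_ttl_expiration_triggers_refetch() -> None:
    now = [0.0]  # mutable fake time
    calls = []

    def fetch() -> Dict[str, int]:
        calls.append(1)
        return {"version": len(calls)}

    cache = TTLCacheSketch(ttl_seconds=60, clock=lambda: now[0])
    assert cache.get("category", fetch) == {"version": 1}  # first fetch populates
    assert cache.get("category", fetch) == {"version": 1}  # second uses cache
    assert len(calls) == 1
    now[0] = 61.0  # advance past the TTL
    assert cache.get("category", fetch) == {"version": 2}  # expired: refetch
    assert len(calls) == 2
```

Against a real backend, the same structure applies: count calls to the underlying data source and control time through whatever hook the backend exposes, or by patching the clock it uses.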
Write Operations Tests (if supported)
- Update operations:
  - update_model() creates new model
  - update_model() updates existing model
  - Cache is invalidated after update
  - Callbacks are notified after update
- Delete operations:
  - delete_model() removes existing model
  - delete_model() raises KeyError for non-existent model
  - Cache is invalidated after delete
  - Callbacks are notified after delete
Semantic Correctness Tests
Test the semantic distinction for needs_refresh():
def test_needs_refresh_semantics(backend):
    category = MODEL_REFERENCE_CATEGORY.image_generation

    # Initially: no cache, needs_refresh should be False
    assert not backend.has_cached_data(category)
    assert not backend.needs_refresh(category)

    # After storing: has cache, needs_refresh should be False (fresh)
    backend._store_in_cache(category, {"test": "data"})
    assert backend.has_cached_data(category)
    assert not backend.needs_refresh(category)

    # After marking stale: has cache, needs_refresh should be True
    backend.mark_stale(category)
    assert backend.has_cached_data(category)
    assert backend.needs_refresh(category)
See tests/test_replica_backend_base.py, tests/test_http_backend.py, and tests/test_redis_backend.py for comprehensive examples.
Summary
Abstract Methods (Must Implement)
All backends must implement these methods from ModelReferenceBackend:
| Method | Purpose |
|---|---|
| fetch_category() | Fetch single category data |
| fetch_all_categories() | Fetch all categories data |
| fetch_category_async() | Async single category fetch |
| fetch_all_categories_async() | Async all categories fetch |
| needs_refresh() | Check if cached data is stale |
| _mark_stale_impl() | Backend-specific staleness marking |
| get_category_file_path() | Get file path for category |
| get_all_category_file_paths() | Get all file paths |
| get_legacy_json() | Get legacy format dict |
| get_legacy_json_string() | Get legacy format string |
Inheriting from ReplicaBackendBase satisfies needs_refresh() and _mark_stale_impl() automatically, leaving only the fetch/file-path methods for you to implement.
Optional Methods (Override If Needed)
| Feature | Detection Method | Implementation Methods |
|---|---|---|
| Writes | supports_writes() | update_model(), delete_model() |
| Legacy Writes | supports_legacy_writes() | update_model_legacy(), delete_model_legacy() |
| Cache Warming | supports_cache_warming() | warm_cache(), warm_cache_async() |
| Health Checks | supports_health_checks() | health_check() |
| Statistics | supports_statistics() | get_statistics() |
Recommended Approach
- Extend ReplicaBackendBase instead of ModelReferenceBackend directly
- Implement required abstract methods using caching helpers like _fetch_with_cache()
- Override optional methods only if needed
- Follow implementation patterns from existing backends:
  - HTTPBackend
  - FileSystemBackend
  - GitHubBackend
  - RedisBackend
See the ReplicaBackendBase documentation for details on the caching infrastructure.