Skip to content

Design Decisions & Known Limitations

This page explains user-visible design trade-offs so you know what to expect when using the library.

Singleton Manager

ModelReferenceManager is a singleton. The first instantiation locks in all configuration (backend, prefetch strategy, etc.). Subsequent calls with the same parameters return the existing instance; calls with different parameters raise RuntimeError.

Why: The manager owns caches, backend connections, and download state. Multiple instances with conflicting settings would lead to race conditions, duplicate downloads, and inconsistent cached data.

What to do: Initialize once at application startup. Retrieve the instance elsewhere with ModelReferenceManager() (no args) or ModelReferenceManager.get_instance().

or_none Methods

Methods named *_or_none return None on failure instead of raising it. The name means "the return type includes None, so you must handle that case."

Method Returns On failure
get_model_reference(cat) dict[str, GenericModelRecord] Raises RuntimeError
get_model_reference_or_none(cat) dict[str, GenericModelRecord] \| None Returns None

Why: Some consumers (workers) want guaranteed data and prefer exceptions. Others (dashboards) want graceful degradation. Both patterns are supported.

Environment-Driven Configuration

Settings are read from environment variables (prefix HORDE_MODEL_REFERENCE_) at import time via the Pydantic Settings singleton. This means:

  • Changing an env var after import has no effect on the already-created settings object
  • The settings object validates combinations on creation (e.g., warns if REPLICA mode has Redis enabled)

Why: Import-time configuration is standard for Pydantic Settings and avoids the complexity of runtime reconfiguration with an already-initialized singleton manager.

Async / Sync Separation

The library provides parallel sync and async method sets (get_model_reference / get_model_reference_async). You should not mix them in the same execution context.

Why: The backends use different HTTP client implementations (httpx.Client vs httpx.AsyncClient). Calling sync methods from within an async context (or vice versa) can block the event loop or cause RuntimeError from nested event loop usage.

Return Type Precision

get_model_reference() returns dict[str, GenericModelRecord] even though the actual records are specialized subclasses (e.g., ImageGenerationModelRecord). This is because the method accepts any category.

Workarounds: - Use the typed properties: manager.image_generation_models returns dict[str, ImageGenerationModelRecord] - Use isinstance() checks for type narrowing - Use the query API, which is fully typed per category