Integrations

The integrations subsystem fetches live runtime data from the AI-Horde public API and merges it with static model reference records. This enables analytics endpoints to report on worker counts, usage statistics, and queue depth alongside model metadata.

Data Flow

graph LR
    API[AI-Horde Public API] --> HPI[HordeAPIIntegration]
    HPI --> Indexed[Indexed Status & Stats]
    MR[Model Reference Records] --> DM[DataMerger]
    Indexed --> DM
    DM --> CMS[CombinedModelStatistics]
    CMS --> Endpoints[Statistics & Audit Endpoints]

HordeAPIIntegration

HordeAPIIntegration is a singleton that fetches and caches three types of runtime data from AI-Horde:

Endpoint	Data	Indexed By
`/v2/status/models`	Worker counts, queue depth, ETA, performance	Model name
`/v2/stats/models`	Usage counts (day, month, total)	Model name
`/v2/workers`	Worker details (online status, trust, uptime)	Worker ID

Each response type is modeled as Pydantic classes (HordeModelStatus, HordeModelStatsResponse, HordeWorker) and indexed into dictionaries keyed by model name for efficient lookup.

Caching

The integration uses the same dual-layer caching pattern as the rest of the system:

Redis (when available in PRIMARY mode) - shared cache with configurable TTL, key prefix {redis_key_prefix}:horde_api
In-memory - per-type dictionaries with timestamp tracking for TTL enforcement

The horde_api_cache_ttl setting (default 60s) controls how long API responses are cached. A force_refresh parameter on fetch methods bypasses the cache for background hydration.

Error Handling

API calls use httpx with a configurable timeout (horde_api_timeout, default 30s). Failures are logged and return empty results rather than propagating exceptions, so analytics endpoints degrade gracefully when the Horde API is unavailable.

Data Merger

DataMerger provides pure functions that combine static model reference data with runtime API data. The primary entry point is merge_category_with_horde_data():

def merge_category_with_horde_data(
    model_names: list[str],
    horde_status: IndexedHordeModelStatus,
    horde_stats: IndexedHordeModelStats,
    workers: IndexedHordeWorkers | None,
    include_backend_variations: bool = False,
) -> dict[str, CombinedModelStatistics]:

For each model name, the function:

Looks up the model's status (worker count, queue depth, ETA, performance)
Extracts usage statistics (day/month/total counts)
Optionally resolves worker details into WorkerSummary objects
For text models with include_backend_variations=True, splits statistics by backend (aphrodite, koboldcpp)

The result is a CombinedModelStatistics per model, containing:

Field	Source
`worker_count`	Computed from worker summaries or status count
`queued_jobs`, `performance`, `eta`	Horde model status
`usage_stats` (day, month, total)	Horde stats endpoint
`worker_summaries`	Horde workers endpoint (optional)
`backend_variations`	Per-backend breakdown for text models

Backend Variations

Text generation models can be served by different backends (aphrodite, koboldcpp). When include_backend_variations=True, the merger produces per-backend BackendVariation entries showing which workers use which backend and their individual usage counts. This powers the audit analysis view that identifies models with uneven backend distribution.

How Endpoints Use Merged Data

The statistics and audit analytics endpoints follow the same pattern:

Get model names and records from ModelReferenceManager
Fetch runtime data via HordeAPIIntegration
Merge with merge_category_with_horde_data()
Pass merged data to CategoryStatistics computation or ModelAuditInfoFactory
Cache the result in StatisticsCache or AuditCache

The Analytics Pipeline page covers the computation and caching layers in detail.

The Horde API data reflects a point-in-time snapshot. Worker counts and usage stats can change rapidly. The caching TTL represents a trade-off between freshness and API load -- production deployments should enable cache hydration to keep data warm without hammering the Horde API on every request.