text_csv_utils
Helpers for parsing and writing legacy text generation CSV files.
Includes the canonical CSV→legacy-dict conversion that replicates convert.py's algorithm, plus CSV write-back and reverse-conversion functions for maintaining the CSV as the source of truth through write operations.
All backends that need to serve or compare legacy text generation data should use csv_rows_to_legacy_dict rather than implementing their own conversion.
TEXT_CSV_FIELDNAMES
module-attribute
TEXT_CSV_FIELDNAMES: list[str] = [
    "name",
    "parameters_bn",
    "display_name",
    "url",
    "baseline",
    "description",
    "style",
    "tags",
    "instruct_format",
    "settings",
]
TextCSVRow
dataclass
Structured representation of a single legacy text CSV row.
Source code in src/horde_model_reference/legacy/text_csv_utils.py
TextCSVIssue
dataclass
Validation issue encountered while parsing a CSV row.
parse_legacy_text_csv
Parse legacy text-generation CSV data from a text stream into structured rows.
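As a rough illustration of stream-based parsing, the sketch below reads rows with the standard library's csv.DictReader and collects problems separately from valid rows. The helper name read_rows_sketch, the dict-based rows, the "missing name" check, and the issue message format are all hypothetical; the real parse_legacy_text_csv returns TextCSVRow and TextCSVIssue objects whose fields are not shown here.

```python
import csv
import io

# Column order copied from TEXT_CSV_FIELDNAMES above.
FIELDNAMES = [
    "name", "parameters_bn", "display_name", "url", "baseline",
    "description", "style", "tags", "instruct_format", "settings",
]


def read_rows_sketch(stream):
    """Hypothetical stand-in: read CSV rows as dicts and collect issues."""
    reader = csv.DictReader(stream)
    rows, issues = [], []
    # The header is row 1, so the first data row is reported as row 2.
    for index, raw in enumerate(reader, start=2):
        name = (raw.get("name") or "").strip()
        if not name:
            # Assumption: a row without a name is reported and skipped.
            issues.append(f"row {index}: missing name")
            continue
        rows.append({field: (raw.get(field) or "").strip() for field in FIELDNAMES})
    return rows, issues


sample = io.StringIO(
    ",".join(FIELDNAMES) + "\n"
    + "Org/Model-7B,7,,,,,chat,,,\n"   # valid row with style "chat"
    + ",3" + "," * 8 + "\n"            # row with an empty name
)
rows, issues = read_rows_sketch(sample)
```

Any text stream works as input, so the same approach applies to an open file handle.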
parse_legacy_text_csv_file
Parse legacy text-generation CSV data from a file path into structured rows.
_settings_value_types_valid
Validate that settings matches the supported flat structure.
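The exact rules of the "flat structure" are not spelled out here. One plausible reading, shown below purely as an assumption, is that settings is JSON text holding an object whose values are all scalars (no nested objects or lists). The function name settings_is_flat_sketch is hypothetical and the real validator may accept a different set of value types.

```python
import json


def settings_is_flat_sketch(settings_json: str) -> bool:
    """Return True if the JSON text is an object with only scalar values.

    Assumption: "flat" means no nested dicts or lists; this is not
    confirmed by the module documentation.
    """
    try:
        parsed = json.loads(settings_json)
    except json.JSONDecodeError:
        return False
    if not isinstance(parsed, dict):
        return False
    return all(
        value is None or isinstance(value, (str, int, float, bool))
        for value in parsed.values()
    )
```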
csv_rows_to_legacy_dict
csv_rows_to_legacy_dict(
    rows: list[TextCSVRow],
    *,
    with_backend_prefixes: bool = True,
) -> dict[str, Any]
Convert parsed CSV rows to legacy dict format, replicating convert.py exactly.
This is the single canonical implementation of the CSV→legacy-dict conversion. Field ordering, defaults merging, empty-value filtering, tag generation, and backend prefix duplication all match the upstream convert.py algorithm.
Parameters:
- rows (list[TextCSVRow]) – Parsed CSV rows from parse_legacy_text_csv.
- with_backend_prefixes (bool, default: True) – If True, generate 3 entries per model (base, aphrodite/, koboldcpp/), matching the db.json format. If False, generate 1 entry per base model only.
Returns:
- dict[str, Any] – The converted legacy dict.
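To make the effect of with_backend_prefixes concrete, the sketch below expands a dict of base records into three entries per model. It is a simplified illustration, not the module's implementation: the helper name expand_with_backend_prefixes_sketch is hypothetical, and the assumption that each prefixed key is simply the prefix plus the full base name with an unchanged copy of the record may not match convert.py, which could shorten names or adjust fields.

```python
from typing import Any

# Prefixes named in the with_backend_prefixes description above.
BACKEND_PREFIXES = ("aphrodite/", "koboldcpp/")


def expand_with_backend_prefixes_sketch(
    base: dict[str, dict[str, Any]],
) -> dict[str, dict[str, Any]]:
    """Return the base entries plus one copy per backend prefix.

    Assumption: prefixed keys are prefix + full base name and the
    record itself is copied unchanged.
    """
    expanded: dict[str, dict[str, Any]] = {}
    for name, record in base.items():
        expanded[name] = dict(record)
        for prefix in BACKEND_PREFIXES:
            expanded[f"{prefix}{name}"] = dict(record)
    return expanded


expanded = expand_with_backend_prefixes_sketch(
    {"Org/Model-7B": {"parameters": 7_000_000_000}}
)
```

With with_backend_prefixes=False the equivalent result would be the base dict alone, one entry per model.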
_parameters_to_bn_str
Convert integer parameter count to minimal billions string for CSV.
Uses simplest representation: 3000000000 → "3", 560000000 → "0.56".
Parameters:
- parameters (int) – Integer parameter count.
Returns:
- str – Minimal string representation in billions.
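One way to obtain the minimal representation without floating-point rounding is integer division by one billion followed by trimming trailing zeros. The sketch below reproduces the two documented examples; the function name parameters_to_bn_str_sketch is hypothetical, non-negative input is assumed, and the real helper may be implemented differently.

```python
def parameters_to_bn_str_sketch(parameters: int) -> str:
    """Format a non-negative parameter count as a minimal billions string."""
    whole, remainder = divmod(parameters, 1_000_000_000)
    if remainder == 0:
        # Exact multiple of one billion: no decimal point needed.
        return str(whole)
    # Zero-pad the remainder to nine digits, then drop trailing zeros.
    fraction = f"{remainder:09d}".rstrip("0")
    return f"{whole}.{fraction}"
```

For example, parameters_to_bn_str_sketch(3000000000) gives "3" and parameters_to_bn_str_sketch(560000000) gives "0.56".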
legacy_record_to_csv_row
Reverse-convert a db.json-format record to a TextCSVRow.
Strips auto-generated tags (style + size bucket) and reverses the parameter conversion so the CSV row round-trips through convert.py.
Parameters:
- name (str) – The base model name (e.g., "Org/Model-7B").
- record (dict[str, Any]) – A single model record from the legacy dict (db.json format).
Returns:
- TextCSVRow – A TextCSVRow suitable for writing to CSV.
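The tag-stripping step can be pictured as removing a known set of auto-generated tags while keeping the remaining tags in their original order. In the sketch below the helper name strip_auto_tags_sketch is hypothetical, and the auto-generated tags are passed in explicitly because the exact size-bucket format (shown here as "7B") is an assumption rather than something documented above.

```python
def strip_auto_tags_sketch(tags: list[str], auto_tags: set[str]) -> list[str]:
    """Return tags with the auto-generated ones removed, order preserved."""
    return [tag for tag in tags if tag not in auto_tags]


# "chat" stands in for the style tag and "7B" for a size-bucket tag;
# both values are illustrative assumptions.
manual_tags = strip_auto_tags_sketch(["roleplay", "chat", "7B"], {"chat", "7B"})
```

Only the manually curated tags would then be written back to the CSV, so a later forward conversion can regenerate the automatic ones.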
write_legacy_text_csv
Write TextCSVRow list to a CSV file in upstream models.csv format.
Parameters:
- rows (list[TextCSVRow]) – The rows to write.
- csv_path (Path) – Path to write the CSV file.
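A minimal sketch of the write-back step, using csv.DictWriter with the column order from TEXT_CSV_FIELDNAMES, might look like the following. It uses plain dicts in place of TextCSVRow (whose fields are not shown here), and the helper name write_rows_sketch, the UTF-8 encoding, and the empty-string default for missing fields are assumptions.

```python
import csv
from pathlib import Path

# Column order copied from TEXT_CSV_FIELDNAMES above.
FIELDNAMES = [
    "name", "parameters_bn", "display_name", "url", "baseline",
    "description", "style", "tags", "instruct_format", "settings",
]


def write_rows_sketch(rows: list[dict[str, str]], csv_path: Path) -> None:
    """Write dict rows to csv_path with a header in FIELDNAMES order."""
    # newline="" lets the csv module control line endings itself.
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        for row in rows:
            # Assumption: fields missing from a row are written as empty strings.
            writer.writerow({field: row.get(field, "") for field in FIELDNAMES})
```

Fixing the column order in one shared list keeps the written header aligned with what the parser expects.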