helexa/helexa-bench.example.toml
rob thijssen f50f5531cf
feat(bench): read-only JSON API on bob + bench/ React visualisation app
Part A — helexa-bench read API:
- [api] config (enabled, listen :13132); WAL on the store so API reads
  never block the sweep writer.
- store read methods: summary, series (chronological per-build medians),
  runs (filtered), dimensions, run_count.
- api.rs: axum /api/health|dimensions|summary|series|runs, permissive
  CORS (UI is a separate origin). The `run` daemon binds the API
  alongside the sweep; new `serve` subcommand serves API-only.
- listener plumbing (bench gains a port): data/helexa-bench-firewalld.xml,
  spec install, deploy-bench /api/health probe + firewalld step, sudoers
  firewall-cmd grants, [api] in example + bob.toml.
- 5 API tests + serve smoke.

Part B — bench/ Vite + React-SWC-TS app (router, react-bootstrap,
recharts): Overview (summary table), Trends (decode tok/s & TTFT across
build SHAs), Runs (filterable explorer). Typed API client with
VITE_API_BASE + dev proxy to bob. npm build/typecheck clean. Hosted
separately from the API (per design); .gitignore excludes node_modules/dist.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 11:26:55 +03:00

57 lines
1.9 KiB
TOML

# helexa-bench — continuous, version-aware fleet benchmark harness.
#
# Hits each neuron directly, exercises warm models, and records every run
# with full build/version provenance into SQLite. Once a neuron build has
# `samples_per_version` results for a (model, scenario), later sweeps skip
# it until a new build SHA ships — so a steady fleet costs only cheap
# version polls.
#
# Env overrides: BENCH_-prefixed, `__` for nesting
# (e.g. BENCH_BENCH__SAMPLES_PER_VERSION=10).
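#
# Illustrative only: applying the same rule to other keys in this file
# would give names like the following (derived from the convention above,
# not verified against the loader):
#   BENCH_API__LISTEN=127.0.0.1:13132    ->  [api] listen
#   BENCH_SCENARIOS__MAX_TOKENS=128      ->  [scenarios] max_tokens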

[bench]
# Pause between full sweeps of all targets (seconds).
sweep_interval_secs = 1800
# Target measured samples per (target, build SHA, model, scenario).
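# Worked example with the values in this file, assuming each of the three
# targets serves exactly one warm model on one build SHA:
#   5 samples x 2 scenarios (chat:128, chat:4096) x 3 targets
#   = 30 measured runs before later sweeps fall back to version polls.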
samples_per_version = 5
# Pause between successive measured iterations against one model.
iteration_pause_secs = 2
# Per-request timeout (seconds); generous for cold lazy-loads.
request_timeout_secs = 600
# SQLite system-of-record.
db_path = "/var/lib/helexa-bench/bench.sqlite"

[scenarios]
# One chat-latency scenario is generated per size (chat:128, chat:4096).
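# Hypothetical example: by the same naming rule, adding 1024 to the list
# would be expected to yield a third scenario (chat:1024):
#   prompt_sizes = [128, 1024, 4096]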
prompt_sizes = [128, 4096]
max_tokens = 256

# Read-only JSON API (consumed by the bench UI + programmatic access),
# served alongside the sweep loop by `run` (or standalone via `serve`).
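# Sketch of a loopback-only variant, assuming `listen` accepts any
# host:port socket address (an assumption, not confirmed here):
#   [api]
#   enabled = true
#   listen = "127.0.0.1:13132"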
[api]
enabled = true
listen = "0.0.0.0:13132"

# One [[targets]] block per neuron on the fleet. `kind = "neuron"` (the
# default) gets build metadata via GET /version and warm-model discovery
# via GET /models.
[[targets]]
name = "beast"
endpoint = "http://beast.hanzalova.internal:13131"

[[targets]]
name = "benjy"
endpoint = "http://benjy.hanzalova.internal:13131"

[[targets]]
name = "quadbrat"
endpoint = "http://quadbrat.hanzalova.internal:13131"

# Future: compare against a non-neuron OpenAI-compatible engine. `kind =
# "openai"` skips neuron-only metadata; point `endpoint` at the /v1 base.
# [[targets]]
# name = "llamacpp-ref"
# kind = "openai"
# endpoint = "http://benjy.hanzalova.internal:8080/v1"
# label = "llama.cpp"