Part A (helexa-bench read API):
- [api] config (enabled, listen :13132); WAL on the store so API reads never block the sweep writer.
- store read methods: summary, series (chronological per-build medians), runs (filtered), dimensions, run_count.
- api.rs: axum /api/health|dimensions|summary|series|runs, permissive CORS (UI is a separate origin). The `run` daemon binds the API alongside the sweep; new `serve` subcommand serves API-only.
- listener plumbing (bench gains a port): data/helexa-bench-firewalld.xml, spec install, deploy-bench /api/health probe + firewalld step, sudoers firewall-cmd grants, [api] in example + bob.toml.
- 5 API tests + serve smoke.

Part B (bench/ Vite + React-SWC-TS app: router, react-bootstrap, recharts):
- Overview (summary table), Trends (decode tok/s & TTFT across build SHAs), Runs (filterable explorer).
- Typed API client with VITE_API_BASE + dev proxy to bob.
- npm build/typecheck clean.
- Hosted separately from the API (per design); .gitignore excludes node_modules/dist.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
57 lines
1.9 KiB
TOML
# helexa-bench: continuous, version-aware fleet benchmark harness.
#
# Hits each neuron directly, exercises warm models, and records every run
# with full build/version provenance into SQLite. Once a neuron build has
# `samples_per_version` results for a (model, scenario), later sweeps skip
# it until a new build SHA ships, so a steady fleet costs only cheap
# version polls.
#
# Env overrides: BENCH_-prefixed, `__` for nesting
# (e.g. BENCH_BENCH__SAMPLES_PER_VERSION=10).
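#
# Illustrative sketch (an assumption, not taken from the original file): if
# the same rule holds for every key, overrides for other settings would look
# like the lines below. The values are arbitrary examples.
#   BENCH_BENCH__SWEEP_INTERVAL_SECS=900    -> [bench] sweep_interval_secs
#   BENCH_API__LISTEN=127.0.0.1:13132       -> [api] listen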

[bench]
# Pause between full sweeps of all targets (seconds).
sweep_interval_secs = 1800
# Target measured samples per (target, build SHA, model, scenario).
samples_per_version = 5
# Pause between successive measured iterations against one model.
iteration_pause_secs = 2
# Per-request timeout (seconds); generous for cold lazy-loads.
request_timeout_secs = 600
# SQLite system-of-record.
db_path = "/var/lib/helexa-bench/bench.sqlite"

[scenarios]
# One chat-latency scenario is generated per size (chat:128, chat:4096).
prompt_sizes = [128, 4096]
max_tokens = 256

# Read-only JSON API (consumed by the bench UI + programmatic access),
# served alongside the sweep loop by `run` (or standalone via `serve`).
[api]
enabled = true
listen = "0.0.0.0:13132"
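# Sketch of a quick manual check, assuming the API exposes /api/health and
# /api/summary on the port above; `bench-host` is a hypothetical host name:
#   curl http://bench-host:13132/api/health
#   curl http://bench-host:13132/api/summary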

# One [[targets]] block per neuron on the fleet. `kind = "neuron"` (the
# default) gets build metadata via GET /version and warm-model discovery
# via GET /models.
[[targets]]
name = "beast"
endpoint = "http://beast.hanzalova.internal:13131"

[[targets]]
name = "benjy"
endpoint = "http://benjy.hanzalova.internal:13131"

[[targets]]
name = "quadbrat"
endpoint = "http://quadbrat.hanzalova.internal:13131"

# Future: compare against a non-neuron OpenAI-compatible engine. `kind =
# "openai"` skips neuron-only metadata; point `endpoint` at the /v1 base.
# [[targets]]
# name = "llamacpp-ref"
# kind = "openai"
# endpoint = "http://benjy.hanzalova.internal:8080/v1"
# label = "llama.cpp"