8 Commits

Author SHA1 Message Date
7f797b0265 ci: parallelise fmt/clippy/test and drop sccache install step
All checks were successful
CI / Format (push) Successful in 33s
CI / Clippy (push) Successful in 1m31s
CI / Test (push) Successful in 2m11s
CI / Build cortex SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-11 13:55:17 +03:00
5a0360c1d5 ci: use container runner labels for CI jobs
Some checks failed
CI / Format, lint, build, test (push) Successful in 4m20s
CI / Build cortex SRPM (push) Has been cancelled
CI / Build neuron SRPM (push) Has been cancelled
CI / Publish cortex to COPR (push) Has been cancelled
CI / Publish neuron to COPR (push) Has been cancelled
CI / Bump version in source (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-11 13:29:42 +03:00
472c0e8737 fix(rpm): ship firewalld service definitions with correct ports
Some checks failed
CI / Format, lint, build, test (push) Has been cancelled
CI / Build cortex SRPM (push) Has been cancelled
CI / Build neuron SRPM (push) Has been cancelled
CI / Publish cortex to COPR (push) Has been cancelled
CI / Publish neuron to COPR (push) Has been cancelled
CI / Bump version in source (push) Has been cancelled
cortex: opens 31313/tcp (API) and 31314/tcp (metrics)
neuron: opens 13131/tcp

Installs to /usr/lib/firewalld/services/ so firewall-cmd
--add-service=cortex / --add-service=helexa-neuron works
out of the box.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-11 12:52:20 +03:00
Gitea Actions
b9d8e30058 chore: bump version to 0.1.16 2026-04-16 15:04:21 +00:00
25f75fe552 chore: ignore local deploy script
All checks were successful
CI / Format, lint, build, test (push) Successful in 1m15s
CI / Build cortex SRPM (push) Successful in 43s
CI / Build neuron SRPM (push) Successful in 44s
CI / Publish cortex to COPR (push) Successful in 7m23s
CI / Publish neuron to COPR (push) Successful in 15m58s
CI / Bump version in source (push) Successful in 31s
2026-04-16 17:45:25 +03:00
3f94c50817 chore: move default ports out of common-collision ranges
Previous defaults collided with well-trodden infra services and with
the Linux ephemeral port range:

- cortex API     8000 — common dev-server default (Django, minio UI)
- cortex metrics 9100 — Prometheus node_exporter default
- neuron API     9090 — Cockpit default on Fedora, Prometheus self

Move to helexa-themed palindromic ports, all below Linux's
32768-60999 ephemeral range and not registered to any well-known
service:

- cortex API     31313
- cortex metrics 31314
- neuron API     13131

Updated places:
- cortex.example.toml, neuron.example.toml defaults
- default impls in cortex-core and neuron config
- cortex-cli --endpoint default for the status subcommand
- doc comments citing example URLs
- README.md and CLAUDE.md snippets

Consumers already on the old ports need a one-line edit in their
/etc/cortex/cortex.toml or /etc/neuron/neuron.toml to match;
firewall rules and prometheus scrape configs will also need
updating.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 17:45:25 +03:00
3e1fb60076 ci: drop actions/cache for cargo registry and target
The cache round-trip (download + unpack) was consistently taking
around 6 minutes, noticeably longer than the ~3 minute cold build
it was meant to accelerate. Net-negative on CI time — remove it.

sccache with the S3 backend still provides dep-level caching at a
much lower overhead, so we keep the majority of the cache benefit
without paying the actions/cache tarball cost.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 17:45:25 +03:00
Gitea Actions
9bf987888c chore: bump version to 0.1.14 2026-04-16 16:57:24 +03:00
16 changed files with 98 additions and 75 deletions

View File

@@ -18,51 +18,33 @@ env:
AWS_SECRET_ACCESS_KEY: ${{ secrets.SCCACHE_S3_SECRET_KEY }}
jobs:
check:
name: Format, lint, build, test
runs-on: fedora
fmt:
name: Format
runs-on: rust
steps:
- uses: actions/checkout@v4
- run: cargo fmt --check --all
- name: Cache cargo registry and target
uses: actions/cache@v4
with:
path: |
~/.cargo/bin
~/.cargo/registry/index
~/.cargo/registry/cache
~/.cargo/git/db
target
key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-cargo-
clippy:
name: Clippy
runs-on: rust
steps:
- uses: actions/checkout@v4
- run: cargo clippy --workspace -- -D warnings
- run: sccache --show-stats
- name: Ensure sccache with S3 support
env:
RUSTC_WRAPPER: ""
run: |
if sccache --version 2>/dev/null && sccache --show-stats 2>/dev/null; then
echo "sccache with S3 support already installed"
else
cargo install sccache --features s3 --locked
fi
- name: Check formatting
run: cargo fmt --check --all
- name: Clippy
run: cargo clippy --workspace -- -D warnings
- name: Test
run: cargo test --workspace
- name: Show sccache stats
run: sccache --show-stats
test:
name: Test
runs-on: rust
steps:
- uses: actions/checkout@v4
- run: cargo test --workspace
- run: sccache --show-stats
srpm-cortex:
name: Build cortex SRPM
runs-on: fedora
needs: check
runs-on: rpm
needs: [fmt, clippy, test]
if: startsWith(github.ref, 'refs/tags/v')
steps:
- uses: actions/checkout@v4
@@ -121,8 +103,8 @@ jobs:
srpm-neuron:
name: Build neuron SRPM
runs-on: fedora
needs: check
runs-on: rpm
needs: [fmt, clippy, test]
if: startsWith(github.ref, 'refs/tags/v')
steps:
- uses: actions/checkout@v4
@@ -181,7 +163,7 @@ jobs:
copr-cortex:
name: Publish cortex to COPR
runs-on: fedora
runs-on: fedora-43
needs: srpm-cortex
steps:
- name: Download SRPM
@@ -198,7 +180,7 @@ jobs:
copr-neuron:
name: Publish neuron to COPR
runs-on: fedora
runs-on: fedora-43
needs: srpm-neuron
steps:
- name: Download SRPM
@@ -215,7 +197,7 @@ jobs:
bump-version:
name: Bump version in source
runs-on: fedora
runs-on: rust
needs: [copr-cortex, copr-neuron]
steps:
- uses: actions/checkout@v4

1
.gitignore vendored
View File

@@ -5,3 +5,4 @@
.vscode/
cortex.toml
doc/plan/*
script/deploy.sh

View File

@@ -125,7 +125,8 @@ automatically. Clippy warnings must be resolved, not suppressed with
- One or more GPU nodes running mistral.rs on port 8080
- Optionally a metrics-only node (no GPU) for Prometheus/Grafana
- Each node runs `mistralrs serve` on port 8080
- Gateway listens on port 8000 (API) and 9100 (metrics)
- Gateway listens on port 31313 (API) and 31314 (metrics)
- neuron listens on port 13131 on each GPU host
- TLS terminated at gateway or via nginx; internal traffic is plaintext over WireGuard
## Conventions
@@ -380,7 +381,7 @@ processes (one process per loaded model, each on its own port).
## neuron API
neuron exposes an HTTP API on port 9090 that cortex polls and calls.
neuron exposes an HTTP API on port 13131 that cortex polls and calls.
```
GET /discovery
@@ -424,8 +425,8 @@ endpoint. cortex.toml shrinks to:
```toml
[gateway]
listen = "0.0.0.0:8000"
metrics_listen = "0.0.0.0:9100"
listen = "0.0.0.0:31313"
metrics_listen = "0.0.0.0:31314"
[eviction]
strategy = "lru"
@@ -433,15 +434,15 @@ defrag_after_cycles = 50
[[neurons]]
name = "beast"
endpoint = "http://beast.hanzalova.internal:9090"
endpoint = "http://beast.hanzalova.internal:13131"
[[neurons]]
name = "benjy"
endpoint = "http://benjy.kosherinata.internal:9090"
endpoint = "http://benjy.hanzalova.internal:13131"
[[neurons]]
name = "quadbrat"
endpoint = "http://quadbrat.hanzalova.internal:9090"
endpoint = "http://quadbrat.hanzalova.internal:13131"
```
On startup and periodically, cortex calls `GET /discovery` and
@@ -521,7 +522,7 @@ cortex/
│ │ └── metrics.rs # prometheus exporter (unchanged)
│ ├── neuron/ # node plane (replaces cortex-agent)
│ │ └── src/
│ │ ├── main.rs # binary entrypoint, axum server on :9090
│ │ ├── main.rs # binary entrypoint, axum server on :13131
│ │ ├── discovery.rs # nvidia-smi, device enumeration
│ │ ├── health.rs # runtime GPU polling
│ │ ├── api.rs # HTTP handlers for /discovery, /models, etc.

8
Cargo.lock generated
View File

@@ -351,7 +351,7 @@ checksum = "773648b94d0e5d620f64f280777445740e61fe701025087ec8b57f45c791888b"
[[package]]
name = "cortex-cli"
version = "0.1.12"
version = "0.1.16"
dependencies = [
"anyhow",
"clap",
@@ -366,7 +366,7 @@ dependencies = [
[[package]]
name = "cortex-core"
version = "0.1.12"
version = "0.1.16"
dependencies = [
"anyhow",
"async-trait",
@@ -381,7 +381,7 @@ dependencies = [
[[package]]
name = "cortex-gateway"
version = "0.1.12"
version = "0.1.16"
dependencies = [
"anyhow",
"axum",
@@ -1184,7 +1184,7 @@ dependencies = [
[[package]]
name = "neuron"
version = "0.1.12"
version = "0.1.16"
dependencies = [
"anyhow",
"async-trait",

View File

@@ -8,7 +8,7 @@ members = [
]
[workspace.package]
version = "0.1.12"
version = "0.1.16"
edition = "2024"
license = "GPL-3.0-or-later"
repository = "https://git.lair.cafe/helexa/cortex"

View File

@@ -88,8 +88,8 @@ WantedBy=multi-user.target
```toml
# cortex.toml
[gateway]
listen = "0.0.0.0:8000"
metrics_listen = "0.0.0.0:9100"
listen = "0.0.0.0:31313"
metrics_listen = "0.0.0.0:31314"
[eviction]
strategy = "lru" # lru | priority
@@ -143,7 +143,7 @@ cortex serve --config cortex.toml
cortex status
# list all models across nodes
curl http://localhost:8000/v1/models
curl http://localhost:31313/v1/models
```
## License

View File

@@ -3,11 +3,11 @@
# Copy to cortex.toml and adjust for your environment.
#
# Environment variable overrides use CORTEX_ prefix with __ separators:
# CORTEX_GATEWAY__LISTEN=0.0.0.0:9000
# CORTEX_GATEWAY__LISTEN=0.0.0.0:31313
[gateway]
listen = "0.0.0.0:8000"
metrics_listen = "0.0.0.0:9100"
listen = "0.0.0.0:31313"
metrics_listen = "0.0.0.0:31314"
[eviction]
strategy = "lru"

View File

@@ -1,5 +1,5 @@
Name: cortex
Version: 0.1.12
Version: 0.1.16
Release: 1%{?dist}
Summary: Inference gateway for multi-node GPU clusters
@@ -21,6 +21,7 @@ BuildRequires: systemd-rpm-macros
Requires(pre): shadow-utils
Requires: systemd
Requires: firewalld-filesystem
# systemd-rpm-macros ships a unit dep generator that parses User=/Group=
# from our .service file and emits Requires: user(cortex)/group(cortex).
@@ -56,6 +57,7 @@ cargo build --release -p cortex-cli
install -Dm755 target/release/cortex %{buildroot}%{_bindir}/cortex
install -Dm644 data/cortex.service %{buildroot}%{_unitdir}/cortex.service
install -Dm644 data/cortex-sysusers.conf %{buildroot}%{_sysusersdir}/cortex.conf
install -Dm644 data/cortex-firewalld.xml %{buildroot}%{_prefix}/lib/firewalld/services/cortex.xml
install -dm755 %{buildroot}%{_sysconfdir}/cortex
install -Dm644 cortex.example.toml %{buildroot}%{_sysconfdir}/cortex/cortex.toml
install -Dm644 models.example.toml %{buildroot}%{_sysconfdir}/cortex/models.toml
@@ -78,10 +80,21 @@ install -Dm644 models.example.toml %{buildroot}%{_sysconfdir}/cortex/models.toml
%{_bindir}/cortex
%{_unitdir}/cortex.service
%{_sysusersdir}/cortex.conf
%{_prefix}/lib/firewalld/services/cortex.xml
%dir %{_sysconfdir}/cortex
%config(noreplace) %{_sysconfdir}/cortex/cortex.toml
%config(noreplace) %{_sysconfdir}/cortex/models.toml
%changelog
* Thu Apr 16 2026 Gitea Actions <actions@git.lair.cafe> - 0.1.16-1
- chore: ignore local deploy script
- chore: move default ports out of common-collision ranges
- ci: drop actions/cache for cargo registry and target
* Thu Apr 16 2026 Gitea Actions <actions@git.lair.cafe> - 0.1.14-1
- ci: publish both packages to a single helexa/helexa COPR project
- fix(rpm): rename neuron package to helexa-neuron
- ci: commit generated %changelog entries back to main
* Wed Apr 15 2026 Rob Thijssen <grenade@rob.tn> - 0.1.0-1
- Initial package

View File

@@ -23,7 +23,7 @@ enum Commands {
/// Print the fleet status (models, nodes, health).
Status {
/// Gateway API endpoint to query.
#[arg(short, long, default_value = "http://localhost:8000")]
#[arg(short, long, default_value = "http://localhost:31313")]
endpoint: String,
},
}

View File

@@ -22,9 +22,9 @@ fn default_models_path() -> String {
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct GatewaySettings {
/// Address to listen on for API requests (e.g. "0.0.0.0:8000")
/// Address to listen on for API requests (e.g. "0.0.0.0:31313")
pub listen: String,
/// Address to listen on for Prometheus metrics (e.g. "0.0.0.0:9100")
/// Address to listen on for Prometheus metrics (e.g. "0.0.0.0:31314")
pub metrics_listen: String,
}
@@ -50,7 +50,7 @@ pub enum EvictionStrategy {
pub struct NeuronEndpoint {
/// Human-readable node name (e.g. "beast")
pub name: String,
/// Base URL of the neuron daemon (e.g. "http://beast.internal:9090")
/// Base URL of the neuron daemon (e.g. "http://beast.internal:13131")
pub endpoint: String,
}
@@ -70,8 +70,8 @@ impl Default for GatewayConfig {
fn default() -> Self {
Self {
gateway: GatewaySettings {
listen: "0.0.0.0:8000".into(),
metrics_listen: "0.0.0.0:9100".into(),
listen: "0.0.0.0:31313".into(),
metrics_listen: "0.0.0.0:31314".into(),
},
eviction: EvictionSettings {
strategy: EvictionStrategy::Lru,

View File

@@ -6,7 +6,7 @@ use std::collections::HashMap;
#[derive(Debug, Clone)]
pub struct NodeState {
pub name: String,
/// Base URL of the neuron daemon (e.g. "http://beast.internal:9090").
/// Base URL of the neuron daemon (e.g. "http://beast.internal:13131").
pub endpoint: String,
pub healthy: bool,
pub models: HashMap<String, ModelEntry>,

View File

@@ -17,7 +17,7 @@ pub struct NeuronConfig {
}
fn default_port() -> u16 {
9090
13131
}
impl NeuronConfig {
@@ -33,7 +33,7 @@ impl NeuronConfig {
impl Default for NeuronConfig {
fn default() -> Self {
Self {
port: 9090,
port: 13131,
harnesses: vec![],
}
}

View File

@@ -0,0 +1,7 @@
<?xml version="1.0" encoding="utf-8"?>
<service>
<short>cortex</short>
<description>Cortex — inference gateway for multi-node GPU clusters</description>
<port protocol="tcp" port="31313"/>
<port protocol="tcp" port="31314"/>
</service>

View File

@@ -0,0 +1,6 @@
<?xml version="1.0" encoding="utf-8"?>
<service>
<short>helexa-neuron</short>
<description>Neuron — per-node GPU discovery and harness daemon for cortex</description>
<port protocol="tcp" port="13131"/>
</service>

View File

@@ -1,5 +1,5 @@
Name: helexa-neuron
Version: 0.1.12
Version: 0.1.16
Release: 1%{?dist}
Summary: Per-node GPU discovery and harness management daemon for cortex
# Package name disambiguates from Fedora's existing "neuron" package
@@ -24,6 +24,7 @@ BuildRequires: systemd-rpm-macros
Requires(pre): shadow-utils
Requires: systemd
Requires: firewalld-filesystem
# systemd-rpm-macros ships a unit dep generator that parses User=/Group=
# from our .service file and emits Requires: user(neuron)/group(neuron).
@@ -58,6 +59,7 @@ cargo build --release -p neuron
install -Dm755 target/release/neuron %{buildroot}%{_bindir}/neuron
install -Dm644 data/neuron.service %{buildroot}%{_unitdir}/neuron.service
install -Dm644 data/neuron-sysusers.conf %{buildroot}%{_sysusersdir}/neuron.conf
install -Dm644 data/neuron-firewalld.xml %{buildroot}%{_prefix}/lib/firewalld/services/helexa-neuron.xml
install -dm755 %{buildroot}%{_sysconfdir}/neuron
install -Dm644 neuron.example.toml %{buildroot}%{_sysconfdir}/neuron/neuron.toml
@@ -79,9 +81,20 @@ install -Dm644 neuron.example.toml %{buildroot}%{_sysconfdir}/neuron/neuron.toml
%{_bindir}/neuron
%{_unitdir}/neuron.service
%{_sysusersdir}/neuron.conf
%{_prefix}/lib/firewalld/services/helexa-neuron.xml
%dir %{_sysconfdir}/neuron
%config(noreplace) %{_sysconfdir}/neuron/neuron.toml
%changelog
* Thu Apr 16 2026 Gitea Actions <actions@git.lair.cafe> - 0.1.16-1
- chore: ignore local deploy script
- chore: move default ports out of common-collision ranges
- ci: drop actions/cache for cargo registry and target
* Thu Apr 16 2026 Gitea Actions <actions@git.lair.cafe> - 0.1.14-1
- ci: publish both packages to a single helexa/helexa COPR project
- fix(rpm): rename neuron package to helexa-neuron
- ci: commit generated %changelog entries back to main
* Wed Apr 15 2026 Rob Thijssen <grenade@rob.tn> - 0.1.0-1
- Initial package

View File

@@ -3,9 +3,9 @@
# Copy to /etc/neuron/neuron.toml and adjust for your environment.
#
# Environment variable overrides use NEURON_ prefix with __ separators:
# NEURON_PORT=9090
# NEURON_PORT=13131
port = 9090
port = 13131
# -- Harnesses ---------------------------------------------------------------
# Each [[harnesses]] entry declares an inference engine managed by neuron.