Files
cortex/Cargo.toml
rob thijssen 60f5598542
Some checks failed
build-prerelease / Resolve version stamps (push) Successful in 29s
CI / CUDA type-check (push) Successful in 31s
CI / Format (push) Successful in 35s
CI / Test (push) Failing after 1m9s
CI / Clippy (push) Successful in 2m36s
CI / Build cortex SRPM (push) Has been skipped
CI / Publish cortex to COPR (push) Has been skipped
CI / Build neuron SRPM (push) Has been skipped
CI / Publish neuron to COPR (push) Has been skipped
CI / Bump version in source (push) Has been skipped
build-prerelease / Build neuron-blackwell (push) Successful in 6m10s
build-prerelease / Build neuron-ampere (push) Successful in 7m35s
build-prerelease / Build neuron-ada (push) Successful in 5m7s
build-prerelease / Package helexa-neuron-ampere RPM (push) Successful in 2m53s
build-prerelease / Package helexa-neuron-ada RPM (push) Successful in 3m14s
build-prerelease / Package helexa-neuron-blackwell RPM (push) Successful in 3m48s
build-prerelease / Build cortex binary (push) Successful in 4m33s
build-prerelease / Package cortex RPM (push) Successful in 1m21s
build-prerelease / Publish to rpm.lair.cafe (unstable) (push) Successful in 1m1s
build(neuron): bump cudarc fork to 63327a2 (idempotent abort + Comm Send+Sync)
The fork's new commit makes `Comm: Send + Sync` (asserting NCCL's
thread-safety invariant upstream) and makes `Comm::abort` idempotent via
an `aborted` flag (so abort-then-Drop can't double-free) — strictly
better than the previous Drop-no-panic workaround, and the `abort()`
signature is unchanged so the watchdog call site is unaffected.

Because `Comm` is now `Send + Sync`, `Arc<Comm>` and the `SendComm` /
`NcclState` wrappers auto-derive `Send`/`Sync`, which conflicts (E0119)
with neuron's manual `unsafe impl`s. Remove the four now-redundant impls
— the safety assertion lives upstream in cudarc where it belongs. The
conflict is in cuda-gated code, so only the CUDA type-check catches it
(non-cuda build + clippy + tests stay green).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 16:33:14 +03:00

73 lines
1.9 KiB
TOML

[workspace]
resolver = "2"
members = [
"crates/cortex-core",
"crates/cortex-gateway",
"crates/cortex-cli",
"crates/neuron",
"crates/helexa-acp",
]
[workspace.package]
version = "0.1.16"
edition = "2024"
license = "GPL-3.0-or-later"
repository = "https://git.lair.cafe/helexa/cortex"
[workspace.dependencies]
# async runtime
tokio = { version = "1", features = ["full"] }
# web framework
axum = { version = "0.8", features = ["macros"] }
tower = "0.5"
tower-http = { version = "0.6", features = ["cors", "trace", "timeout"] }
# serialization
serde = { version = "1", features = ["derive"] }
serde_json = "1"
toml = "0.8"
# http client (for proxying to neuron backends)
reqwest = { version = "0.12", features = ["json", "stream"] }
# observability
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }
metrics = "0.24"
metrics-exporter-prometheus = "0.16"
# time
chrono = { version = "0.4", features = ["serde"] }
# config
figment = { version = "0.10", features = ["toml", "env"] }
# error handling
anyhow = "1"
thiserror = "2"
# async traits
async-trait = "0.1"
# CLI
clap = { version = "4", features = ["derive"] }
# futures / streams (for SSE proxying)
futures = "0.3"
tokio-stream = "0.1"
eventsource-stream = "0.2"
# workspace crates
cortex-core = { path = "crates/cortex-core" }
cortex-gateway = { path = "crates/cortex-gateway" }
# Patched cudarc (affects neuron's 0.19.x only; candle's 0.17.x is
# untouched since the fork is 0.19.7 and doesn't satisfy a 0.17 req). Adds
# Comm::abort / get_async_error / raw comm() — needed for #17 Stage 2 TP
# hang-recovery (abort a wedged collective from another thread, then
# rebuild the comm). Pinned to a fork revision pending upstream review
# (grenade/cudarc @ nccl-comm-abort).
[patch.crates-io]
cudarc = { git = "https://github.com/grenade/cudarc", rev = "63327a256059f8252641ae46c6bb9eefe707f382" }