Joined on 2026-02-17
grenade pushed to main at helexa/cortex 2026-05-22 04:17:29 +00:00
aa88d37509 fix(gateway): full observability + stop leaking upstream bodies
grenade pushed to main at helexa/cortex 2026-05-22 04:10:44 +00:00
0f00f72b47 fix(router,handlers): strip trailing slash from rewritten URL + log upstream failures
grenade pushed to main at helexa/cortex 2026-05-22 03:23:52 +00:00
9b0ed0b57f fix(router): rewrite loopback inference URLs to use neuron's host
grenade pushed to main at helexa/cortex 2026-05-22 03:12:54 +00:00
dc2a803266 fix(rpm): migrate legacy helexa-cortex firewalld service to cortex
grenade pushed to main at helexa/cortex 2026-05-21 18:53:18 +00:00
e71181499e feat(stage-8e-3): quantize lm_head in TP Qwen3-Next
grenade pushed to main at helexa/cortex 2026-05-21 18:50:49 +00:00
ee663e5e99 fix(stage-8e-2e): bump quant prefill threshold to M > 64
grenade pushed to main at helexa/cortex 2026-05-21 18:15:36 +00:00
34f9b77d9d feat(stage-8e-2d): route quantized matmul by M (prefill vs decode)
grenade pushed to main at helexa/cortex 2026-05-21 17:05:23 +00:00
f084aaab8e fix(stage-8e-2c): cast bf16/f16 activations to f32 around QMatMul
grenade pushed to main at helexa/cortex 2026-05-21 16:17:17 +00:00
68a606a79c fix(stage-8e-2b): allow quant on the TP load path
grenade pushed to main at helexa/cortex 2026-05-21 15:03:41 +00:00
4aa71902d0 feat(stage-8e-2): plumb quant config from ModelSpec to TP load path
grenade pushed to main at helexa/cortex 2026-05-21 14:55:29 +00:00
bef159b21c feat(stage-8e-1): MaybeQuantLinear primitive + parallel-linear quant variants
grenade pushed to main at helexa/cortex 2026-05-21 14:49:38 +00:00
8d7b099b36 feat(stage-8d-7): direct safetensors fused-region loader
grenade pushed to main at helexa/cortex 2026-05-21 09:54:08 +00:00
89d98d1fb2 diag(stage-8d-6): per-layer VRAM logging in TP load path
grenade pushed to main at helexa/cortex 2026-05-21 08:52:41 +00:00
cc95fe28d9 feat(stage-8d-5b): wire fused_gdn_gating CUDA kernel
grenade pushed to main at helexa/cortex 2026-05-21 08:50:34 +00:00
09c945f81e feat(stage-8d-4): dispatch chunked_gated_delta_rule_recurrence at prefill
grenade pushed to main at helexa/cortex 2026-05-21 08:49:45 +00:00
05dc0bad18 feat(stage-8d-3): wire causal_conv1d_update/full CUDA kernels
grenade pushed to main at helexa/cortex 2026-05-21 08:44:18 +00:00
10c151efa5 feat(stage-8d-5): wire gated_delta_rule_recurrence kernel into tp_qwen3_5
grenade pushed to main at helexa/cortex 2026-05-21 08:39:35 +00:00
44ae927e38 feat(stage-8d-2): wire gated_delta_rule_recurrence kernel into qwen3_5
grenade pushed to main at helexa/cortex 2026-05-21 08:34:15 +00:00
1ebbe87651 feat(stage-8d-1): import mistralrs GDN CUDA kernels — build infra only
grenade pushed to main at helexa/cortex 2026-05-21 05:22:05 +00:00
70eb6af42b feat(tp): cancellation-safe inference + structured tracing