Responses API: implement previous_response_id chained conversations
#4
Scope cut from Step 2 (commit 957f704).

The `/v1/responses` handler currently rejects any request that sets `previous_response_id` with a 400.

See `crates/neuron/src/wire/openai_responses.rs::TranslateError::ChainedConversationNotSupported` and the matching test in `crates/neuron/tests/api.rs::test_responses_rejects_previous_response_id`.

## Why it was cut

Chained conversations require server-side persistence: when a client sends `previous_response_id: "resp_abc"`, the agent must look up that prior response's full output (including all `output_item.*` content and tool calls) and prepend it to the new request's input as conversational context. We don't have anywhere to store that today.

## What implementation looks like

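A rough sketch of the in-memory storage option described in this section, for discussion only. `ResponseStore`, its methods, and the field layout of `StoredResponse` are hypothetical names invented for illustration; the real `ResponseId` and `StoredResponse` types are not defined in this issue. Expiry here is checked lazily on read, with an explicit `evict_expired` that a periodic task could call; that is one assumed way to realise the "TTL evictor", not a settled design.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the real response id type.
pub type ResponseId = String;

/// Hypothetical stand-in: holds the serialised `ResponsesResponse` as JSON text.
#[derive(Clone, Debug, PartialEq)]
pub struct StoredResponse {
    pub body: String,
}

/// In-memory response store (assumed shape). Entries expire after `ttl`.
#[derive(Clone)]
pub struct ResponseStore {
    ttl: Duration,
    inner: Arc<RwLock<HashMap<ResponseId, (Instant, StoredResponse)>>>,
}

impl ResponseStore {
    pub fn new(ttl: Duration) -> Self {
        Self {
            ttl,
            inner: Arc::new(RwLock::new(HashMap::new())),
        }
    }

    /// Store a response under its id, stamping the insertion time.
    pub fn insert(&self, id: ResponseId, response: StoredResponse) {
        let mut map = self.inner.write().expect("response store lock poisoned");
        map.insert(id, (Instant::now(), response));
    }

    /// Return the response if it is present and has not yet expired.
    pub fn get(&self, id: &str) -> Option<StoredResponse> {
        let map = self.inner.read().expect("response store lock poisoned");
        map.get(id)
            .filter(|(stored_at, _)| stored_at.elapsed() < self.ttl)
            .map(|(_, response)| response.clone())
    }

    /// Remove expired entries and report how many were dropped.
    pub fn evict_expired(&self) -> usize {
        let ttl = self.ttl;
        let mut map = self.inner.write().expect("response store lock poisoned");
        let before = map.len();
        map.retain(|_, (stored_at, _)| stored_at.elapsed() < ttl);
        before - map.len()
    }
}
```

Because `get` filters on age, a stale entry is never served even if the evictor has not run yet; `evict_expired` only bounds memory use. Everything is lost on restart, which matches the "lossy across restarts" trade-off noted for this option.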
- In memory: `Arc<RwLock<HashMap<ResponseId, StoredResponse>>>` on `NeuronState`, with a TTL evictor. Simple, lossy across restarts.
- On disk: `$XDG_DATA_HOME/neuron/responses/<id>.json`, like helexa-acp's session store. Survives restarts; trivially auditable.
- After each `/v1/responses` request (streaming or not), serialise the assembled `ResponsesResponse` to the chosen store, keyed by `response.id`.
- When `request_to_chat` sees `previous_response_id`, fetch the prior response, walk its `output` items, and prepend them as `assistant` / `function_call` / `function_call_output` items to the new chat-completions message list.

## Acceptance

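To make the lookup-and-prepend step and the unknown-id versus known-id behaviour concrete, here is a minimal self-contained sketch. All types and the `build_messages` function are hypothetical simplifications: the real `request_to_chat` signature, output item types, and chat-completions message types are not shown in this issue, and a plain `HashMap` stands in for whichever store is chosen.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of a stored response's `output` items.
#[derive(Clone, Debug, PartialEq)]
pub enum OutputItem {
    AssistantMessage { text: String },
    FunctionCall { call_id: String, name: String, arguments: String },
    FunctionCallOutput { call_id: String, output: String },
}

/// Hypothetical, simplified chat-completions message.
#[derive(Clone, Debug, PartialEq)]
pub enum ChatMessage {
    User { content: String },
    Assistant { content: String },
    FunctionCall { call_id: String, name: String, arguments: String },
    FunctionCallOutput { call_id: String, output: String },
}

#[derive(Debug, PartialEq)]
pub enum ChainError {
    /// The HTTP layer would be expected to map this to a 404.
    UnknownPreviousResponse(String),
}

/// Prepend the prior response's output items (if any) to the new input.
pub fn build_messages(
    store: &HashMap<String, Vec<OutputItem>>,
    previous_response_id: Option<&str>,
    new_input: Vec<ChatMessage>,
) -> Result<Vec<ChatMessage>, ChainError> {
    let mut messages = Vec::new();
    if let Some(id) = previous_response_id {
        // Unknown ids are an error rather than being silently ignored.
        let prior = store
            .get(id)
            .ok_or_else(|| ChainError::UnknownPreviousResponse(id.to_string()))?;
        // Walk the prior output in order so tool calls keep their position.
        for item in prior {
            messages.push(match item {
                OutputItem::AssistantMessage { text } => ChatMessage::Assistant {
                    content: text.clone(),
                },
                OutputItem::FunctionCall { call_id, name, arguments } => {
                    ChatMessage::FunctionCall {
                        call_id: call_id.clone(),
                        name: name.clone(),
                        arguments: arguments.clone(),
                    }
                }
                OutputItem::FunctionCallOutput { call_id, output } => {
                    ChatMessage::FunctionCallOutput {
                        call_id: call_id.clone(),
                        output: output.clone(),
                    }
                }
            });
        }
    }
    // The new request's own input always comes after the prior context.
    messages.extend(new_input);
    Ok(messages)
}
```

Returning a typed error for the unknown-id case keeps the 404 decision in the handler rather than in the translation code, which seems to fit the existing `TranslateError` pattern, though that is an assumption about how the crate is organised.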
- `previous_response_id` set against an unknown id → 404 with a clear error.
- `previous_response_id` set against a known id → the model sees the full prior turn as context. Verify by sending a follow-up that depends on the prior assistant message (e.g. "what's my name?" after "my name is Alice").
- A test in `crates/neuron/tests/api.rs` exercising the round-trip.

## Tracking

- Blocks: full Responses API parity for clients that use OpenAI's stateful chaining (most production code).
- Test surface for helexa-acp's eventual openai-responses provider: that provider can either drive chaining client-side (manually feeding prior output back as input) or use this once it lands.