POST /v1/chat/stream
The primary endpoint. Send a message, get an AI response streamed back as Server-Sent Events. Pass a conversation id to keep the thread alive across requests. Available on all plans.
Endpoint
POST https://api.sentientone.ai/v1/chat/streamHeaders
- X-Api-KeyYour platform key. See Authentication.
- X-Agent-IdUUID of the agent to invoke.
- Content-TypeAlways
application/json.
Request body
- message (required)The user's message. Appended to the conversation history and sent to the LLM along with the agent's system prompt.
- conversation_id (optional)Pass an existing id to continue a thread. Omit it and a new conversation is created automatically.
- prompt (optional)Override the agent's configured system prompt for this request.
Example request
curl -N -X POST https://api.sentientone.ai/v1/chat/stream \
-H "Content-Type: application/json" \
-H "X-Api-Key: sk-so-YOUR_KEY" \
-H "X-Agent-Id: YOUR_AGENT_ID" \
-d '{
"message": "Get the order details for OrderId: ORD-1234"
}'The -N flag disables curl's output buffering so each event prints the moment it arrives.
Response (200 OK) — Server-Sent Events
The response is Content-Type: text/event-stream. Read it line-by-line; each event is a JSON payload on its own data: line. The full answer is the concatenation of every delta event's content.
data: {"type":"meta","conversation_id":"conv-uuid-...","trace_id":"trace-uuid-..."}
data: {"type":"sources","sources":[{"index":1,"id":"doc-...","title":"Order ORD-1234","url":"https://...","source_type":"order","score":0.91,"snippet":"shipped via FedEx..."}]}
data: {"type":"delta","content":"Order ORD-1234 "}
data: {"type":"delta","content":"is shipped via FedEx "}
data: {"type":"delta","content":"(FX-998877)."}
data: {"type":"done","conversation_id":"conv-uuid-...","trace_id":"trace-uuid-..."}- metaFirst event. Carries the
conversation_idandtrace_id— store the conversation id now to keep history across requests. - sourcesEmitted when retrieval ran. Each source has
index,id,title,url,source_type,score, andsnippet. - deltaIncremental token chunk. Concatenate every
contentvalue in order to build the full reply text. - doneFinal event. Carries the
conversation_idandtrace_id; the stream closes after it. - errorEmitted when something goes wrong. Carries an error
codeandmessage.
What happens server-side
- 1
Authenticate
X-Api-Keyresolves to your account; rate limits and plan checks fire. - 2
Load the agent
X-Agent-Idselects the agent config — system prompt, model, provider key, parameters. - 3
Resolve conversation
If you passedconversation_id, history is loaded. Otherwise a new conversation is created. - 4
Persist your message
The new user message is saved to the conversation history before any LLM work starts. - 5
Call the LLM
System prompt + all prior messages + new message are sent to the provider. - 6
Run MCP tools if requested
If the model asks for a tool, the platform runs it and feeds the result back (up to 8 rounds by default — configurable per agent). - 7
Stream and save
Tokens stream back asdeltaevents while they generate; the final reply is saved and the stream closes with adoneevent.
Multi-turn conversations
The first request returns a conversation_id. Pass it back on every subsequent request and the agent sees the full prior thread — same model context window as a single long conversation.
# 1) Start a thread
curl -N -X POST https://api.sentientone.ai/v1/chat/stream \
-H "X-Api-Key: sk-so-..." -H "X-Agent-Id: ..." \
-H "Content-Type: application/json" \
-d '{ "message": "Hi, my order ORD-1234 hasn't arrived." }'
# → data: {"type":"meta","conversation_id":"conv-abc123","trace_id":"..."}
# data: {"type":"delta","content":"..."} (repeated)
# data: {"type":"done","conversation_id":"conv-abc123","trace_id":"..."}
# 2) Continue it
curl -N -X POST https://api.sentientone.ai/v1/chat/stream \
-H "X-Api-Key: sk-so-..." -H "X-Agent-Id: ..." \
-H "Content-Type: application/json" \
-d '{ "conversation_id": "conv-abc123", "message": "What carrier is it on?" }'Working with the stream
delta content into the complete answer.