POST /v1/chat/stream

The primary endpoint. Send a message, get an AI response streamed back as Server-Sent Events. Pass a conversation_id to keep the thread alive across requests. Available on all plans.

Endpoint

http
POST https://api.sentientone.ai/v1/chat/stream

Headers

  • X-Api-Key: Your platform key. See Authentication.
  • X-Agent-Id: UUID of the agent to invoke.
  • Content-Type: Always application/json.

Request body

  • message (required): The user's message. Appended to the conversation history and sent to the LLM along with the agent's system prompt.
  • conversation_id (optional): Pass an existing id to continue a thread. Omit it and a new conversation is created automatically.
  • prompt (optional): Override the agent's configured system prompt for this request.

Example request

bash
curl -N -X POST https://api.sentientone.ai/v1/chat/stream \
  -H "Content-Type: application/json" \
  -H "X-Api-Key: sk-so-YOUR_KEY" \
  -H "X-Agent-Id: YOUR_AGENT_ID" \
  -d '{
    "message": "Get the order details for OrderId: ORD-1234"
  }'

The -N flag disables curl's output buffering so each event prints the moment it arrives.
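
The same request can be built from Python. The sketch below uses only the standard library; build_chat_request and open_stream are hypothetical helper names chosen for illustration (not part of any official SDK), and the key and agent id are placeholders. Only the URL, headers, and body fields come from this page.

```python
import json
import urllib.request

API_URL = "https://api.sentientone.ai/v1/chat/stream"


def build_chat_request(api_key, agent_id, message, conversation_id=None, prompt=None):
    """Build (but do not send) the POST request described above.

    Hypothetical helper: the name and signature are illustrative only.
    """
    body = {"message": message}
    # Optional fields are omitted entirely when not provided.
    if conversation_id is not None:
        body["conversation_id"] = conversation_id
    if prompt is not None:
        body["prompt"] = prompt
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Api-Key": api_key,
            "X-Agent-Id": agent_id,
        },
        method="POST",
    )


def open_stream(request, timeout=60):
    """Send the request; the returned response can be iterated line by line."""
    return urllib.request.urlopen(request, timeout=timeout)


# Usage (performs a real network call, so it is left commented out):
# req = build_chat_request("sk-so-YOUR_KEY", "YOUR_AGENT_ID",
#                          "Get the order details for OrderId: ORD-1234")
# with open_stream(req) as resp:
#     for raw_line in resp:
#         print(raw_line.decode("utf-8"), end="")
```

Keeping request construction separate from sending makes the body easy to inspect before anything goes over the network.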

Response (200 OK) — Server-Sent Events

The response is Content-Type: text/event-stream. Read it line-by-line; each event is a JSON payload on its own data: line. The full answer is the concatenation of every delta event's content.

text
data: {"type":"meta","conversation_id":"conv-uuid-...","trace_id":"trace-uuid-..."}

data: {"type":"sources","sources":[{"index":1,"id":"doc-...","title":"Order ORD-1234","url":"https://...","source_type":"order","score":0.91,"snippet":"shipped via FedEx..."}]}

data: {"type":"delta","content":"Order ORD-1234 "}

data: {"type":"delta","content":"is shipped via FedEx "}

data: {"type":"delta","content":"(FX-998877)."}

data: {"type":"done","conversation_id":"conv-uuid-...","trace_id":"trace-uuid-..."}

  • meta: First event. Carries the conversation_id and trace_id; store the conversation_id now to keep history across requests.
  • sources: Emitted when retrieval ran. Each source has index, id, title, url, source_type, score, and snippet.
  • delta: Incremental token chunk. Concatenate every content value in order to build the full reply text.
  • done: Final event. Carries the conversation_id and trace_id; the stream closes after it.
  • error: Emitted when something goes wrong. Carries an error code and message.
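
As a rough illustration of the event types above, the following Python sketch parses data: lines into events and joins the delta content into the full reply. The function names iter_events and read_reply are hypothetical, and because the exact field names of the error payload are not listed here, the sketch surfaces the whole error event rather than guessing at them.

```python
import json


def iter_events(lines):
    """Yield one dict per `data:` line; blank and non-data lines are skipped."""
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if not line.startswith("data:"):
            continue
        yield json.loads(line[len("data:"):].strip())


def read_reply(lines):
    """Consume a stream and return (text, meta, sources)."""
    parts, meta, sources = [], None, []
    for event in iter_events(lines):
        kind = event.get("type")
        if kind == "meta":
            meta = event
        elif kind == "sources":
            sources = event.get("sources", [])
        elif kind == "delta":
            parts.append(event["content"])
        elif kind == "error":
            # Assumption: error field names are unspecified, so pass the event through.
            raise RuntimeError(f"stream error: {event}")
        elif kind == "done":
            break
    return "".join(parts), meta, sources


# Sample stream; the ids are placeholders, the delta lines mirror the example above.
SAMPLE = [
    'data: {"type":"meta","conversation_id":"conv-abc123","trace_id":"trace-1"}',
    "",
    'data: {"type":"delta","content":"Order ORD-1234 "}',
    "",
    'data: {"type":"delta","content":"is shipped via FedEx "}',
    "",
    'data: {"type":"delta","content":"(FX-998877)."}',
    "",
    'data: {"type":"done","conversation_id":"conv-abc123","trace_id":"trace-1"}',
]

text, meta, sources = read_reply(SAMPLE)
print(text)
```

read_reply accepts any iterable of str or bytes lines, so the same function should work on an in-memory sample like this one or on an HTTP response object that yields lines.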

What happens server-side

  1. Authenticate
     X-Api-Key resolves to your account; rate limits and plan checks fire.
  2. Load the agent
     X-Agent-Id selects the agent config: system prompt, model, provider key, parameters.
  3. Resolve conversation
     If you passed conversation_id, history is loaded. Otherwise a new conversation is created.
  4. Persist your message
     The new user message is saved to the conversation history before any LLM work starts.
  5. Call the LLM
     System prompt + all prior messages + new message are sent to the provider.
  6. Run MCP tools if requested
     If the model asks for a tool, the platform runs it and feeds the result back (up to 8 rounds by default, configurable per agent).
  7. Stream and save
     Tokens stream back as delta events while they generate; the final reply is saved and the stream closes with a done event.

Multi-turn conversations

The first request returns a conversation_id. Pass it back on every subsequent request and the agent sees the full prior thread, subject to the same model context window as a single long conversation.

bash
# 1) Start a thread
curl -N -X POST https://api.sentientone.ai/v1/chat/stream \
  -H "X-Api-Key: sk-so-..." -H "X-Agent-Id: ..." \
  -H "Content-Type: application/json" \
  -d '{ "message": "Hi, my order ORD-1234 has not arrived." }'
# → data: {"type":"meta","conversation_id":"conv-abc123","trace_id":"..."}
#   data: {"type":"delta","content":"..."}  (repeated)
#   data: {"type":"done","conversation_id":"conv-abc123","trace_id":"..."}

# 2) Continue it
curl -N -X POST https://api.sentientone.ai/v1/chat/stream \
  -H "X-Api-Key: sk-so-..." -H "X-Agent-Id: ..." \
  -H "Content-Type: application/json" \
  -d '{ "conversation_id": "conv-abc123", "message": "What carrier is it on?" }'
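
On the client side, multi-turn handling comes down to remembering one id. A minimal Python sketch, assuming events have already been parsed into dicts; ChatThread is a hypothetical helper, not part of the API:

```python
class ChatThread:
    """Remembers the conversation_id between requests (hypothetical helper)."""

    def __init__(self):
        self.conversation_id = None

    def next_body(self, message):
        """Return the JSON body for the next request in this thread."""
        body = {"message": message}
        # The first request omits conversation_id so a new thread is created.
        if self.conversation_id is not None:
            body["conversation_id"] = self.conversation_id
        return body

    def observe(self, event):
        """Record the conversation_id carried by meta and done events."""
        if event.get("type") in ("meta", "done") and event.get("conversation_id"):
            self.conversation_id = event["conversation_id"]


thread = ChatThread()
first = thread.next_body("Hi, my order ORD-1234 has not arrived.")
# After sending `first`, feed each parsed event to observe():
thread.observe({"type": "meta", "conversation_id": "conv-abc123", "trace_id": "..."})
second = thread.next_body("What carrier is it on?")
print(second)
```

Capturing the id from the meta event, rather than waiting for done, means the thread can still be continued if the stream is interrupted partway through.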

Working with the stream

Every reply arrives as Server-Sent Events — there is no single-JSON mode. See Streaming for the full event reference and a client read loop, including how to concatenate delta content into the complete answer.