
Streaming Responses

Streaming lets your application display model output as it arrives. Supado uses OpenAI-compatible streaming behavior for supported endpoints and models.

Before you enable streaming, confirm that:

  • A non-streaming request works through Supado.
  • Your model and endpoint support streaming.
  • Your HTTP client can read Server-Sent Events incrementally.
  • No reverse proxy in front of your app buffers responses.
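
A minimal streaming request with the OpenAI Node.js SDK:

import OpenAI from "openai";

// Point the OpenAI SDK at Supado's OpenAI-compatible endpoint.
const client = new OpenAI({
  apiKey: process.env.SUPADO_API_KEY,
  baseURL: "https://supado.com/v1",
});

// stream: true returns an async iterable of chunks instead of one response.
const stream = await client.chat.completions.create({
  model: "gpt-4o-mini",
  stream: true,
  messages: [
    { role: "user", content: "Write a five-word deployment checklist." },
  ],
});

// Print each text delta as soon as it arrives.
for await (const event of stream) {
  const token = event.choices[0]?.delta?.content;
  if (token) {
    process.stdout.write(token);
  }
}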

The same request with curl:

curl -N https://supado.com/v1/chat/completions \
  -H "Authorization: Bearer $SUPADO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "stream": true,
    "messages": [
      { "role": "user", "content": "Write a five-word deployment checklist." }
    ]
  }'

The -N flag disables curl’s output buffering so you can see chunks as they arrive.
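
What curl prints is the raw Server-Sent Events stream. The sketch below shows one way a client could decode that stream by hand. It assumes Supado follows the OpenAI convention of `data: {json}` lines ending with a `data: [DONE]` sentinel, which is inferred from the OpenAI-compatible behavior described above rather than confirmed for every endpoint. The names `parseSseLine` and `SseLineBuffer` are hypothetical helpers, not part of any SDK.

```typescript
// Result of parsing one line of the event stream.
type SseEvent =
  | { kind: "delta"; text: string }
  | { kind: "done" }
  | { kind: "skip" };

// Parse a single line, assuming OpenAI-style "data: ..." framing.
function parseSseLine(line: string): SseEvent {
  const trimmed = line.trim();
  // Blank lines separate events; lines starting with ":" are SSE comments.
  if (trimmed === "" || trimmed.startsWith(":")) return { kind: "skip" };
  if (!trimmed.startsWith("data:")) return { kind: "skip" };

  const payload = trimmed.slice("data:".length).trim();
  // Assumed end-of-stream sentinel (OpenAI convention).
  if (payload === "[DONE]") return { kind: "done" };

  try {
    const chunk = JSON.parse(payload);
    const text = chunk?.choices?.[0]?.delta?.content;
    return typeof text === "string" && text.length > 0
      ? { kind: "delta", text }
      : { kind: "skip" };
  } catch {
    // Malformed JSON is ignored here; a real client may want to log it.
    return { kind: "skip" };
  }
}

// Network reads can split a line in two, so keep the unfinished tail
// and only parse lines that have been terminated by a newline.
class SseLineBuffer {
  private pending = "";

  push(data: string): SseEvent[] {
    this.pending += data;
    const lines = this.pending.split("\n");
    this.pending = lines.pop() ?? "";
    return lines.map(parseSseLine).filter((e) => e.kind !== "skip");
  }
}
```

Feed each decoded network read into `push` and render the `delta` events; a `done` event is the signal to stop waiting for more output. The OpenAI SDK already does this internally, so a hand-written parser is only needed with a plain HTTP client.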

A working stream shows tokens incrementally in the terminal or UI. Your client should handle normal completion and errors raised before or during streaming, and its timeouts should be long enough for the selected model and prompt.
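
One possible way to cover those three cases in a single loop is sketched below. It works on any async iterable of chunks, so it can wrap the `stream` object from the Node.js example above. The names `consumeStream` and `withIdleTimeout`, the `StreamChunk` shape, and the 30-second default are illustrative assumptions, not Supado or SDK features; choose a timeout that suits your model and prompt.

```typescript
// Minimal chunk shape: only the fields this sketch reads.
interface StreamChunk {
  choices: Array<{
    delta?: { content?: string | null };
    finish_reason?: string | null;
  }>;
}

interface StreamResult {
  text: string; // everything received, even if the stream failed midway
  finishReason: string | null; // null: the stream ended without a finish_reason
  error: Error | null;
}

// Reject if the given promise does not settle within idleMs.
function withIdleTimeout<T>(next: Promise<T>, idleMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`No chunk received for ${idleMs} ms`)),
      idleMs,
    );
    next.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function consumeStream(
  stream: AsyncIterable<StreamChunk>,
  onToken: (token: string) => void,
  idleMs = 30_000, // assumed default, tune per model and prompt
): Promise<StreamResult> {
  const result: StreamResult = { text: "", finishReason: null, error: null };
  const iterator = stream[Symbol.asyncIterator]();
  try {
    while (true) {
      const step = await withIdleTimeout(iterator.next(), idleMs);
      if (step.done) break; // normal completion
      const choice = step.value.choices[0];
      const token = choice?.delta?.content;
      if (token) {
        result.text += token;
        onToken(token);
      }
      if (choice?.finish_reason) result.finishReason = choice.finish_reason;
    }
  } catch (err) {
    // Covers both mid-stream failures and the idle timeout.
    result.error = err instanceof Error ? err : new Error(String(err));
    // Ask the source to clean up, but do not await it: a stalled
    // iterator may never answer.
    iterator.return?.().catch(() => {});
  }
  return result;
}
```

Errors raised before streaming starts (for example, a rejected request) surface from the `create` call itself, so that call still needs its own `try`/`catch`. Returning the partial text alongside the error lets the UI keep what was already shown instead of discarding it.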

If streaming looks like a normal delayed response, test with curl -N from the same network path. If curl streams but your app does not, inspect your client code. If curl also buffers, inspect the proxy path.
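
To make "looks like a normal delayed response" measurable inside your app, one rough heuristic is to compare when the first chunk arrived with when the last one did. The sketch below is an assumption-laden diagnostic, not a Supado feature: `recordArrivals` and `looksBuffered` are hypothetical helpers, and the 0.9 threshold is an arbitrary starting point.

```typescript
// Record when each chunk arrives, in ms since the request was sent.
// Capture startedAtMs with Date.now() *before* sending the request,
// otherwise time spent waiting behind a buffering proxy is not counted.
async function recordArrivals(
  stream: AsyncIterable<unknown>,
  startedAtMs: number,
): Promise<number[]> {
  const arrivals: number[] = [];
  for await (const _chunk of stream) {
    arrivals.push(Date.now() - startedAtMs);
  }
  return arrivals;
}

// Heuristic: if the first chunk only shows up when the response is
// almost over, something on the path probably buffered the stream.
function looksBuffered(arrivalsMs: number[], threshold = 0.9): boolean {
  // A single chunk is indistinguishable from a non-streamed body.
  if (arrivalsMs.length < 2) return true;
  const first = Math.min(...arrivalsMs);
  const last = Math.max(...arrivalsMs);
  if (last <= 0) return true;
  return first / last >= threshold;
}
```

For example, arrivals of `[120, 180, 260, 900]` suggest real streaming, while `[2950, 2951, 2953, 2960]` suggest the whole body was held back and released at once. Very short responses can trip the heuristic either way, so treat the result as a hint and confirm with curl -N.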

Start with one streaming endpoint and one model before expanding to every chat path in your product. If output arrives all at once, inspect proxy buffering first. If the UI hangs after the final token, inspect stream completion handling.