Streaming Responses

Streaming lets your application display model output as it arrives. Supado uses OpenAI-compatible streaming behavior for supported endpoints and models.

Before you start

Confirm each of the following before you enable streaming:

A non-streaming request works through Supado.
Your model and endpoint support streaming.
Your HTTP client can read Server-Sent Events incrementally.
No reverse proxy in front of your app buffers responses.

Example

TypeScript OpenAI SDK

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SUPADO_API_KEY,
  baseURL: "https://supado.com/v1",
});

const stream = await client.chat.completions.create({
  model: "gpt-4o-mini",
  stream: true,
  messages: [
    { role: "user", content: "Write a five-word deployment checklist." },
  ],
});

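// Each chunk carries an incremental delta. The content field may be
// absent or empty on some chunks, so check it before writing.
for await (const chunk of stream) {
  const token = chunk.choices[0]?.delta?.content;
  if (token) {
    process.stdout.write(token);
  }
}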

curl

curl -N https://supado.com/v1/chat/completions \
  -H "Authorization: Bearer $SUPADO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "stream": true,
    "messages": [
      { "role": "user", "content": "Write a five-word deployment checklist." }
    ]
  }'

The -N flag disables curl’s output buffering so you can see chunks as they arrive.

Streaming behavior

When streaming works, tokens appear incrementally in the terminal or UI. Your client should handle three cases: normal completion, errors that occur before or during the stream, and timeouts. Set timeouts long enough for the selected model and prompt.
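The three cases above can be sketched as a single consumer loop. This is a minimal illustration, not Supado's or the SDK's implementation: consumeStream, ChatChunk, StreamOutcome, and the idleTimeoutMs parameter are hypothetical names, and the chunk shape lists only the OpenAI-style fields the loop reads.

```typescript
// Minimal shape of an OpenAI-style chat completion chunk (assumed; only
// the fields this sketch reads).
interface ChatChunk {
  choices: Array<{
    delta?: { content?: string | null };
    finish_reason?: string | null;
  }>;
}

type StreamOutcome =
  | { status: "completed"; text: string; finishReason: string | null }
  | { status: "failed"; text: string; error: unknown }
  | { status: "timed_out"; text: string };

// Hypothetical helper: reads chunks until the stream ends, throws, or
// stays silent for longer than idleTimeoutMs between chunks.
async function consumeStream(
  stream: AsyncIterable<ChatChunk>,
  onToken: (token: string) => void,
  idleTimeoutMs: number,
): Promise<StreamOutcome> {
  const iterator = stream[Symbol.asyncIterator]();
  let text = "";
  let finishReason: string | null = null;

  while (true) {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const idle = new Promise<"idle">((resolve) => {
      timer = setTimeout(() => resolve("idle"), idleTimeoutMs);
    });
    try {
      const next = await Promise.race([iterator.next(), idle]);
      if (next === "idle") {
        // Ask the source to stop, but do not wait for it: a stalled
        // source may never settle.
        iterator.return?.()?.catch(() => {});
        return { status: "timed_out", text };
      }
      if (next.done) {
        return { status: "completed", text, finishReason };
      }
      const choice = next.value.choices[0];
      const token = choice?.delta?.content;
      if (token) {
        text += token;
        onToken(token);
      }
      if (choice?.finish_reason) {
        finishReason = choice.finish_reason;
      }
    } catch (error) {
      // Covers errors raised mid-stream; text holds what arrived so far.
      return { status: "failed", text, error };
    } finally {
      clearTimeout(timer);
    }
  }
}
```

Errors raised before the first chunk (for example, a rejected request) surface when the stream is created, so that call needs its own try/catch. The stream returned by the SDK is an async iterable and should be usable as the first argument here. The idle timeout only stops local reading; cancelling the underlying HTTP request is a separate step that depends on your client.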

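If the output arrives all at once after a delay, as a non-streaming response would, test with curl -N from the same network path. If curl streams but your app does not, inspect your client code. If curl's output also arrives all at once, something on the network path, such as a proxy, is buffering the response.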
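To check the same thing from inside your app, you can record when each chunk arrives. The sketch below is a hypothetical helper (measureArrival and ArrivalReport are illustrative names), and the 90% threshold is an arbitrary heuristic chosen for the example, not a Supado rule.

```typescript
interface ArrivalReport {
  chunkCount: number;
  firstChunkMs: number | null; // delay from start to the first chunk
  totalMs: number; // delay from start to the end of the stream
  looksBuffered: boolean;
}

// Hypothetical diagnostic: drains a stream and reports chunk timing.
// The clock is injectable so the function can be tested deterministically.
async function measureArrival(
  stream: AsyncIterable<unknown>,
  now: () => number = Date.now,
): Promise<ArrivalReport> {
  const start = now();
  const arrivals: number[] = [];
  for await (const _chunk of stream) {
    arrivals.push(now() - start);
  }
  const totalMs = now() - start;
  const firstChunkMs = arrivals[0] ?? null;

  // Heuristic (assumption): if several chunks arrive and the first one
  // shows up only in the last 10% of the total time, the response was
  // probably held back and released in one burst.
  const looksBuffered =
    arrivals.length > 1 &&
    firstChunkMs !== null &&
    totalMs > 0 &&
    firstChunkMs / totalMs > 0.9;

  return { chunkCount: arrivals.length, firstChunkMs, totalMs, looksBuffered };
}
```

A healthy stream usually shows a first chunk well before the total time. Very short responses can trip the heuristic, so treat the flag as a hint to run the curl comparison, not as proof.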

Start with one streaming endpoint and one model before expanding to every chat path in your product. If output arrives all at once, inspect proxy buffering first. If the UI hangs after the final token, inspect how your client detects the end of the stream.
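If you read the Server-Sent Events yourself instead of using an SDK, end-of-stream detection is a common place for a hang. OpenAI-style streams end with a data: [DONE] line, and this sketch assumes Supado follows that convention. createSseReader is a hypothetical helper, and it handles only the single-line data: events that OpenAI-style streams send, not the full SSE format.

```typescript
// Hypothetical incremental reader for OpenAI-style SSE text.
// Feed it raw text as it arrives; it calls onData for each JSON payload
// and stops at the "[DONE]" sentinel.
function createSseReader(onData: (payload: unknown) => void) {
  let buffer = "";
  let done = false;

  return {
    push(text: string): void {
      if (done) return;
      buffer += text;
      // Process only complete lines; a partial line stays in the buffer
      // until the rest of it arrives.
      for (let nl = buffer.indexOf("\n"); nl !== -1; nl = buffer.indexOf("\n")) {
        const line = buffer.slice(0, nl).replace(/\r$/, "");
        buffer = buffer.slice(nl + 1);
        if (!line.startsWith("data:")) continue; // blank lines, comments, other fields
        const data = line.slice(5).trim();
        if (data === "[DONE]") {
          done = true; // the caller can now stop reading and update the UI
          return;
        }
        onData(JSON.parse(data));
      }
    },
    isDone(): boolean {
      return done;
    },
  };
}
```

A caller would push each decoded network chunk into the reader and check isDone() after every push. Treating the end of the HTTP body as completion too is a reasonable fallback, since a connection can close without the sentinel.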