The Anannas API supports streaming responses from any model, including OpenAI and Anthropic models. Streaming is useful for chat interfaces, where the UI can update in real time as the model generates output. To enable it, set the stream parameter to true in your request. Instead of waiting for the full completion, the API then sends the response in chunks as they are generated.

Example Requests

OpenAI:
{
  "model": "openai/gpt-3.5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "Write a short story about a robot learning to paint."
    }
  ],
  "stream": true,
  "max_tokens": 500
}
Anthropic:
{
  "model": "anthropic/claude-3-5-sonnet-20241022",
  "messages": [
    {
      "role": "user",
      "content": "Explain quantum computing to a 10-year-old."
    }
  ],
  "stream": true,
  "max_tokens": 300
}

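As a rough illustration, the request bodies above could be sent from Python along the following lines. This is a minimal sketch using only the standard library. The endpoint URL, the bearer-token Authorization header, and the helper name build_stream_request are assumptions made for the example and are not taken from the Anannas documentation.

```python
import json
import urllib.request

# Assumption: placeholder endpoint. Substitute the real Anannas chat completions URL.
API_URL = "https://api.anannas.example/v1/chat/completions"


def build_stream_request(
    api_key: str, model: str, prompt: str, max_tokens: int = 500
) -> urllib.request.Request:
    """Build a POST request whose JSON body sets "stream": true (hypothetical helper)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # serialized as JSON true; this is what enables streaming
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: bearer-token auth
            "Content-Type": "application/json",
            "Accept": "text/event-stream",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen() yields a response that can be read line by line as chunks arrive, rather than as one complete body.
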
Additional Information

  • Anannas uses SSE (Server-Sent Events) for streaming.
  • You may occasionally see comment lines in the stream, such as:
: ANANNAS
Lines that begin with a colon are comments and can be ignored, per the SSE spec.
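To make the comment handling concrete, here is a minimal sketch of a line-based SSE reader in Python. It skips blank lines and comment lines (those starting with a colon) and decodes each data: line as JSON. Two things are assumptions rather than documented Anannas behavior: that the stream ends with an OpenAI-style data: [DONE] sentinel, and that chunks carry text under choices[0].delta.content. The function names are hypothetical. The sketch also treats each data: line as one complete event, which is a simplification of the full SSE spec.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if not line or line.startswith(":"):
            continue  # blank separator or SSE comment such as ": ANANNAS"
        if not line.startswith("data:"):
            continue  # other SSE fields (event:, id:, retry:) are not used here
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumption: OpenAI-style end-of-stream sentinel
            break
        yield json.loads(payload)


def extract_text(chunk: dict) -> str:
    """Return the text delta from a chunk, assuming an OpenAI-compatible shape."""
    choices = chunk.get("choices") or [{}]
    delta = choices[0].get("delta") or {}
    return delta.get("content") or ""
```

Any iterable of byte lines works as input, including the response object returned by urllib.request.urlopen(), so a UI could append extract_text(chunk) to the display for each chunk as it arrives.
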

Cancelling Streams

You can cancel a streaming request by closing the connection:
  • In Python, close the request stream.
  • In TypeScript, call AbortController.abort().
Closing the connection stops billing immediately for supported providers.
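For the Python case, cancellation can be sketched as follows. The helper reads lines from a streaming response, stops as soon as a caller-supplied predicate returns true, and always closes the response so the connection is released. The names read_with_cancel and should_cancel are hypothetical. The response only needs to be iterable over lines and expose close(), which the object returned by urllib.request.urlopen() does.

```python
from typing import Callable, List


def read_with_cancel(response, should_cancel: Callable[[bytes], bool]) -> List[bytes]:
    """Collect lines from `response` until `should_cancel(line)` is true, then close it."""
    received: List[bytes] = []
    try:
        for line in response:
            received.append(line)
            if should_cancel(line):
                break  # stop reading; the rest of the stream is abandoned
    finally:
        # Closing the response closes the underlying connection, which is
        # what signals cancellation to the server.
        response.close()
    return received
```

In a real client the predicate would typically check a flag set by the UI, for example when the user presses a stop button, rather than inspect the line contents.
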