Overview
This document describes all parameters available for the /v1/chat/completions endpoint. Parameters are validated server-side; unsupported parameters for a given provider are ignored.
Required Parameters
model
Type: string
Required: Yes
Description: Model identifier in the format provider/model-name.
Examples:
- openai/gpt-5-mini
- anthropic/claude-3-sonnet
- openai/gpt-3.5-turbo
messages
Type: Message[]
Required: Yes
Minimum: 1 message
Description: Array of message objects with role and content.
Message roles:
- system: System instructions (typically first message)
- user: User input
- assistant: Model responses (for conversation history)
- tool: Tool execution results
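As a sketch, a minimal valid request body combines just the two required parameters; the model identifier and message content below are illustrative:

```python
import json

# Minimal request body for POST /v1/chat/completions.
# Only the two required parameters are set; values are illustrative.
payload = {
    "model": "openai/gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "What does top_p do?"},
    ],
}

body = json.dumps(payload)  # the JSON string sent as the request body
```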
Sampling Parameters
temperature
Type: number
Range: 0.0 to 2.0
Default: 1.0
Description: Controls randomness. Lower values make output more deterministic.
- 0.0: Most deterministic
- 1.0: Balanced
- 2.0: Most random
top_p
Type: number
Range: 0.0 to 1.0
Default: 1.0
Description: Nucleus sampling - restricts sampling to the smallest set of tokens whose cumulative probability mass reaches top_p.
max_tokens
Type: integer
Minimum: 1
Description: Maximum number of tokens to generate. Model-specific limits apply.
max_completion_tokens
Type: integer
Description: Alternative to max_tokens (provider-specific).
stop
Type: string | string[]
Description: Stop sequences that halt generation. Can be a single string or an array.
seed
Type: integer
Description: Random seed for deterministic outputs. Only supported by some models.
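Putting the sampling parameters together, a request body might set them as follows. All values here are illustrative, not recommendations:

```python
# Sampling parameters sketch; all values are illustrative.
payload = {
    "model": "openai/gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Write a haiku about rain."}],
    "temperature": 0.7,  # below 1.0: more deterministic output
    "top_p": 0.9,        # nucleus sampling: top 90% probability mass
    "max_tokens": 64,    # cap on generated tokens
    "stop": ["\n\n"],    # halt generation at a blank line
    "seed": 42,          # best-effort determinism where supported
}
```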
Penalties
frequency_penalty
Type: number
Range: -2.0 to 2.0
Description: Reduces the likelihood of repeating tokens. Positive values decrease repetition.
presence_penalty
Type: number
Range: -2.0 to 2.0
Description: Penalizes tokens that have already appeared. Positive values encourage the model to discuss new topics.
repetition_penalty
Type: number
Range: (0, 2]
Description: Provider-specific repetition control.
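A short sketch of the penalty parameters in a request body (values illustrative):

```python
# Penalty parameters sketch; positive values reduce repetition
# (frequency_penalty) and encourage new topics (presence_penalty).
payload = {
    "model": "openai/gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Brainstorm ten product names."}],
    "frequency_penalty": 0.5,  # within -2.0 to 2.0
    "presence_penalty": 0.6,   # within -2.0 to 2.0
}
```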
Advanced Sampling
top_k
Type: integer
Minimum: 0
Description: Limits sampling to the top K tokens by probability.
Not allowed with reasoning: When reasoning is enabled, top_k is not supported and will result in an error.
top_a
Type: number
Range: [0, 1]
Description: Provider-specific sampling parameter.
min_p
Type: number
Range: [0, 1]
Description: Minimum probability threshold for token selection.
logit_bias
Type: { [token_id: number]: number }
Description: Bias specific tokens by ID. Values typically range from -100 to 100.
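The advanced sampling parameters can be sketched together in one request body. The token ID below is illustrative and model-specific; note that in JSON, logit_bias keys are serialized as strings:

```python
# Advanced sampling sketch; token ID 50256 is illustrative and model-specific.
payload = {
    "model": "openai/gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Continue the story."}],
    "top_k": 40,                    # only the 40 most likely tokens are sampled
    "min_p": 0.05,                  # drop tokens below this probability threshold
    "logit_bias": {"50256": -100},  # -100 effectively bans the token
}
```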
Tool Calling
tools
Type: Tool[]
Description: Array of function definitions for tool calling.
tool_choice
Type: "none" | "auto" | "required" | { type: "function", function: { name: string } }
Default: "auto"
Description: Controls tool usage behavior.
- "none": No tools called
- "auto": Model decides
- "required": Model must call a tool
- Object: Force a specific function
Restrictions with reasoning: When reasoning is enabled, only "auto" or "none" are allowed. Using "required" or forcing a specific tool will result in an error. This restriction enables interleaved thinking, which allows reasoning between tool calls.
parallel_tool_calls
Type: boolean
Default: true
Description: Allow multiple tool calls in a single response.
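The tool-calling parameters can be sketched as follows; the get_weather function and its schema are hypothetical examples, not part of this API:

```python
# Tool-calling sketch: one hypothetical function definition plus tool_choice.
payload = {
    "model": "openai/gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical function
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",        # let the model decide
    "parallel_tool_calls": True,  # allow several calls in one response
}
```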
Structured Outputs
response_format
Type: { type: "json_object" } | { type: "json_schema", json_schema: object }
Description: Enforce JSON output format.
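A sketch of a json_schema response_format; the nested schema shape follows common OpenAI-compatible conventions and is illustrative:

```python
# Structured-output sketch: enforce a JSON object with two required fields.
payload = {
    "model": "openai/gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Extract the city and country."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "location",  # illustrative schema name
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
            },
        },
    },
}
```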
Streaming
stream
Type: boolean
Default: false
Description: Enable Server-Sent Events streaming. See the Streaming documentation.
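When stream is enabled, responses arrive as Server-Sent Events lines prefixed with "data: " and terminated by "data: [DONE]". A client-side parsing sketch, using hard-coded sample chunks in place of a real stream:

```python
import json

# SSE parsing sketch; the sample chunk payloads are illustrative.
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]

text = ""
for line in sample_lines:
    if not line.startswith("data: "):
        continue  # ignore comments and blank keep-alive lines
    data = line[len("data: "):]
    if data == "[DONE]":
        break  # end-of-stream sentinel
    chunk = json.loads(data)
    text += chunk["choices"][0]["delta"].get("content", "")
# text now holds the accumulated completion
```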
Reasoning
reasoning
Type: object
Description: Configure reasoning for models that support it.
Check Reasoning Support
For models that support reasoning and their configuration options, visit anannas.ai/models.
Interleaved Thinking with Tools
When using reasoning with tools on Claude 4 models (Sonnet 4.5, Opus 4.5, Haiku 4.5), interleaved thinking is automatically enabled. This allows the model to reason between tool calls. With interleaved thinking, max_tokens in the reasoning config can exceed the request's max_tokens parameter, as it represents the total budget across all thinking blocks within one assistant turn.
Interleaved thinking requires tool_choice: "auto" (or no tool_choice specified). Using tool_choice: "required" or forcing a specific tool will result in an error when reasoning is enabled.
thinking_config
Type: object
Description: External API alias for reasoning. Maps to reasoning internally.
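A sketch of a reasoning-enabled request, assuming the reasoning object accepts a max_tokens budget as described above; the model identifier and all values are illustrative, and accepted fields vary by model:

```python
# Reasoning sketch; with interleaved thinking, the reasoning budget may
# exceed the request's max_tokens (total budget per assistant turn).
payload = {
    "model": "anthropic/claude-sonnet-4.5",  # illustrative model identifier
    "messages": [{"role": "user", "content": "Plan a three-step refactor."}],
    "max_tokens": 1024,
    "reasoning": {"max_tokens": 2048},  # thinking budget across the turn
    "tool_choice": "auto",  # "required"/forced tools error when reasoning is on
}
```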
Multimodal
modalities
Type: string[]
Description: Requested output modalities: ["text", "audio", "image"]
audio
Type: object
Description: Audio output configuration.
Prompt Caching
prompt_cache_key
Type: string
Description: Cache key for OpenAI prompt caching. See Overview.
Routing
models
Type: string[]
Description: Fallback model list for routing.
route
Type: "fallback"
Description: Enable smart fallback routing.
provider
Type: object
Description: Provider selection preferences.
Check Provider Pricing
For current pricing to set max_price limits, visit anannas.ai/models.
fallbacks
Type: Array<string | object>
Description: Explicit fallback chain.
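The routing parameters can be sketched as a primary model plus a fallback list; all model identifiers below are illustrative:

```python
# Routing sketch: primary model with smart fallback routing enabled.
payload = {
    "model": "openai/gpt-5-mini",
    "models": ["anthropic/claude-3-sonnet", "openai/gpt-3.5-turbo"],  # fallbacks
    "route": "fallback",  # enable smart fallback routing
    "messages": [{"role": "user", "content": "Hello"}],
}
```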
User Tracking
user
Type: string
Description: Stable identifier for end users (abuse prevention).
Metadata
metadata
Type: { [key: string]: string }
Description: Custom metadata for request tracking.
Provider-Specific Parameters
Check Parameter Support
For detailed parameter support by model and provider, visit anannas.ai/models.
- Mistral: safe_prompt
- Hyperbolic: raw_mode
- Grok: search_parameters, deferred
- Anthropic: cache_control in message content
Parameter Validation
- Invalid parameter values return 400 Bad Request
- Unsupported parameters are silently ignored
- Provider-specific parameters are passed through when supported
See Also
- API Overview - Request/response schemas
- Streaming - Streaming implementation
- Models - Model capabilities and parameter support