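This is the complete REST API reference for Swfte Connect. Gateway endpoints (chat completions, embeddings, and models) are served under https://connect.swfte.com/v2/gateway/; key-management and billing endpoints use the /api/ paths listed in their own sections. Every request is authenticated with your API key.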
Authentication
All requests require an API key passed in the Authorization header:
curl https://connect.swfte.com/v2/gateway/chat/completions \
  -H "Authorization: Bearer sk-swfte-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai:gpt-5", "messages": [{"role": "user", "content": "Hello"}]}'
API keys are scoped to a workspace and can be restricted to specific endpoints. Create and manage keys at connect.swfte.com or through the endpoints described in the API Keys section.
Key Format
Keys follow the pattern sk-swfte-{random}. They are 48 characters long and case-sensitive.
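A quick client-side check can catch truncated or mistyped keys before a request is sent. The sketch below is illustrative only: it assumes the 48-character length includes the sk-swfte- prefix and that the random part uses letters, digits, hyphens, and underscores, neither of which this reference states. The helper name looks_like_swfte_key is hypothetical.

```python
import re

# Assumptions (not confirmed by the reference): the 48 characters include the
# "sk-swfte-" prefix, and the random part is limited to [A-Za-z0-9_-].
KEY_PATTERN = re.compile(r"sk-swfte-[A-Za-z0-9_-]+")
KEY_LENGTH = 48


def looks_like_swfte_key(key: str) -> bool:
    """Return True if the string has the documented prefix and length.

    The comparison is case-sensitive, as keys are documented to be.
    """
    return len(key) == KEY_LENGTH and KEY_PATTERN.fullmatch(key) is not None
```

A passing check only means the key is well-formed; the gateway still decides whether it is valid.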
Key Scopes
| Scope | Endpoints Allowed |
|---|---|
| all | All endpoints |
| chat | /v2/gateway/chat/completions only |
| agents | Agent-related endpoints only |
| embeddings | /v2/gateway/embeddings only |
Base URL
https://connect.swfte.com/v2/gateway
For self-hosted deployments, replace with your gateway URL.
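As a minimal sketch of how the base URL and the Authorization header fit together, the helper below builds a POST request with Python's standard library. The function name build_request and the SWFTE_BASE_URL environment variable are hypothetical names chosen for this example, not part of the Swfte API.

```python
import json
import os
import urllib.request
from typing import Optional

DEFAULT_BASE_URL = "https://connect.swfte.com/v2/gateway"


def build_request(path: str, payload: dict, api_key: str,
                  base_url: Optional[str] = None) -> urllib.request.Request:
    """Build an authenticated JSON POST request for a gateway path."""
    # SWFTE_BASE_URL is a hypothetical variable name used only in this sketch;
    # self-hosted deployments would point it at their own gateway URL.
    root = (base_url or os.environ.get("SWFTE_BASE_URL") or DEFAULT_BASE_URL).rstrip("/")
    return urllib.request.Request(
        url=f"{root}/{path.lstrip('/')}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen(request); nothing is transmitted until that call.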
Chat Completions
Create Chat Completion
POST /v2/gateway/chat/completions
Request Body:
{
  "model": "openai:gpt-5",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "max_tokens": 256,
  "temperature": 0.7,
  "top_p": 1.0,
  "n": 1,
  "stream": false,
  "stop": null,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "tools": null,
  "tool_choice": null,
  "swfte_options": {
    "fallback_models": ["anthropic:claude-sonnet-4"],
    "routing": "lowest_latency",
    "timeout_ms": 30000
  }
}
Parameters:
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier in provider:model format |
| messages | array | Yes | Array of message objects with role and content |
| max_tokens | integer | No | Maximum tokens to generate (default: model-specific) |
| temperature | number | No | Sampling temperature 0-2 (default: 1.0) |
| top_p | number | No | Nucleus sampling parameter (default: 1.0) |
| n | integer | No | Number of completions to generate (default: 1) |
| stream | boolean | No | Stream response chunks (default: false) |
| stop | string/array | No | Stop sequences |
| tools | array | No | Tool/function definitions |
| tool_choice | string/object | No | Tool selection strategy |
| swfte_options | object | No | Swfte-specific routing and control options |
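The constraints in the table above can be checked on the client before a request is sent. The following is a rough sketch, not an official validator: build_chat_payload is a hypothetical helper, and it enforces only what the table states (the provider:model format, role and content on each message, and the 0-2 temperature range). Options set to None are dropped so the gateway's defaults apply.

```python
from typing import Any, Dict, List

# Optional fields taken from the request body shown in this reference.
OPTIONAL_FIELDS = {
    "max_tokens", "temperature", "top_p", "n", "stream", "stop",
    "presence_penalty", "frequency_penalty", "tools", "tool_choice",
    "swfte_options",
}


def build_chat_payload(model: str, messages: List[Dict[str, Any]],
                       **options: Any) -> Dict[str, Any]:
    """Validate the documented constraints and return a request body."""
    provider, sep, name = model.partition(":")
    if not (provider and sep and name):
        raise ValueError("model must use the provider:model format")
    if not messages or any("role" not in m or "content" not in m for m in messages):
        raise ValueError("each message needs a role and content")
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload: Dict[str, Any] = {"model": model, "messages": messages}
    # Leave out unset options so the gateway falls back to its defaults.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```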
Response:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1711648000,
  "model": "gpt-5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  },
  "swfte_metadata": {
    "cost": 0.000495,
    "latency_ms": 234,
    "provider": "openai",
    "cached": false,
    "request_id": "req_xyz789"
  }
}
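To show how the fields of this response relate, the sketch below pulls the assistant text, token usage, and gateway metadata into one small record. ChatResult and parse_chat_response are hypothetical names; the code assumes a single completion (n = 1) and treats swfte_metadata as optional, which the reference does not confirm.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatResult:
    text: str
    finish_reason: str
    total_tokens: int
    cost: Optional[float]
    provider: Optional[str]
    request_id: Optional[str]


def parse_chat_response(body: str) -> ChatResult:
    """Extract the first choice and the gateway metadata from a response body."""
    data = json.loads(body)
    choice = data["choices"][0]  # assumes n == 1
    # Treating swfte_metadata as optional is a defensive assumption.
    meta = data.get("swfte_metadata", {})
    return ChatResult(
        text=choice["message"]["content"],
        finish_reason=choice["finish_reason"],
        total_tokens=data["usage"]["total_tokens"],
        cost=meta.get("cost"),
        provider=meta.get("provider"),
        request_id=meta.get("request_id"),
    )
```

Logging request_id alongside any failure is likely to be useful when reporting a problem, since it identifies the request on the gateway side.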
Streaming
When stream: true, the response is delivered as Server-Sent Events (SSE):
curl https://connect.swfte.com/v2/gateway/chat/completions \
  -H "Authorization: Bearer sk-swfte-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "openai:gpt-5", "messages": [{"role": "user", "content": "Hello"}], "stream": true}'
Each chunk follows this format:
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"The"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" capital"},"finish_reason":null}]}
data: [DONE]
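The chunk format above can be consumed line by line. This is a minimal sketch that assumes one completion per request (n = 1) and that every event fits on a single data: line, as in the example; iter_deltas is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and any non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


if __name__ == "__main__":
    sample = [
        'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk",'
        '"choices":[{"index":0,"delta":{"content":"The"},"finish_reason":null}]}',
        "",
        "data: [DONE]",
    ]
    print("".join(iter_deltas(sample)))
```

When reading from an HTTP response object, the byte lines need decoding first, for example (line.decode("utf-8") for line in response).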
Embeddings
Create Embedding
POST /v2/gateway/embeddings
Request Body:
{
  "model": "openai:text-embedding-3-large",
  "input": ["First text to embed", "Second text to embed"],
  "encoding_format": "float",
  "dimensions": 1536
}
Parameters:
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Embedding model identifier |
| input | string/array | Yes | Text(s) to embed |
| encoding_format | string | No | float or base64 (default: float) |
| dimensions | integer | No | Output dimension (model-dependent) |
Response:
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023, -0.0091, 0.0154, ...],
      "index": 0
    }
  ],
  "model": "text-embedding-3-large",
  "usage": {
    "prompt_tokens": 12,
    "total_tokens": 12
  }
}
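A common next step is to compare the returned vectors. The sketch below assumes encoding_format is float and that each item's index refers to the position of the matching input, which the response shape suggests but the reference does not spell out. Both function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def vectors_in_input_order(response: Dict) -> List[List[float]]:
    """Return the embeddings sorted by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```

With two inputs, cosine_similarity(*vectors_in_input_order(response)) gives a single score between -1 and 1.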
Models
List Available Models
GET /v2/gateway/models
Returns all models available through your connected providers.
Response:
{
  "object": "list",
  "data": [
    {
      "id": "openai:gpt-5",
      "object": "model",
      "provider": "openai",
      "name": "GPT-5",
      "type": "TEXT_GENERATION",
      "context_window": 128000,
      "max_output_tokens": 16384,
      "pricing": {
        "input_per_1m_tokens": 5.00,
        "output_per_1m_tokens": 15.00
      }
    }
  ]
}
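The pricing fields make it possible to estimate the cost of a request before sending it. The sketch below applies the listed per-million-token prices directly; this is an estimate under that assumption, and the cost the gateway reports in swfte_metadata may be calculated differently. The function names are hypothetical.

```python
from typing import Dict, Iterable, Optional


def estimate_cost(model: Dict, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate cost from the listed per-1M-token prices."""
    pricing = model["pricing"]
    return (prompt_tokens * pricing["input_per_1m_tokens"]
            + completion_tokens * pricing["output_per_1m_tokens"]) / 1_000_000


def cheapest_model(models: Iterable[Dict], prompt_tokens: int, completion_tokens: int,
                   model_type: str = "TEXT_GENERATION") -> Optional[Dict]:
    """Pick the lowest-cost model of the given type for an expected token mix."""
    candidates = [m for m in models if m.get("type") == model_type and "pricing" in m]
    if not candidates:
        return None
    return min(candidates, key=lambda m: estimate_cost(m, prompt_tokens, completion_tokens))
```

Passing the data array from the response as models is enough; entries of another type or without pricing are ignored.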
API Keys
List Keys
GET /api/dashboard/api-keys
Response:
{
  "keys": [
    {
      "id": "key_abc123",
      "name": "production-backend",
      "keyPrefix": "sk-swfte-abc...",
      "enabled": true,
      "totalRequests": 15234,
      "totalTokens": 4521000,
      "totalCost": 67.83,
      "createdAt": "2026-01-15T10:00:00Z",
      "lastUsedAt": "2026-03-28T14:30:00Z",
      "status": "healthy"
    }
  ],
  "count": 1,
  "activeCount": 1,
  "totalRequests": 15234
}
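One use for this listing is spotting enabled keys that have gone unused and may be candidates for deletion. The following sketch assumes timestamps are ISO 8601 in UTC, as in the example, and that lastUsedAt may be missing or null for a key that has never been used, which the reference does not say. The function name and the 30-day threshold are illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def parse_timestamp(value: str) -> datetime:
    """Parse timestamps such as 2026-03-28T14:30:00Z into aware datetimes."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def stale_key_ids(listing: Dict, now: datetime, max_idle_days: int = 30) -> List[str]:
    """Return ids of enabled keys not used within max_idle_days."""
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for key in listing["keys"]:
        if not key.get("enabled"):
            continue
        # Assumption: fall back to createdAt when a key has never been used.
        last_seen = parse_timestamp(key.get("lastUsedAt") or key["createdAt"])
        if last_seen < cutoff:
            stale.append(key["id"])
    return stale
```

Calling stale_key_ids(listing, datetime.now(timezone.utc)) evaluates the listing against the current time.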
Create Key
POST /api/api-keys
Request Body:
{
  "name": "my-new-key",
  "scope": "all",
  "rateLimit": {
    "requestsPerMinute": 60
  }
}
Delete Key
DELETE /api/api-keys/{keyId}
Billing
Get Credit Balance
GET /api/billing/credits
Response:
{
  "balance": 142.50,
  "currency": "USD",
  "low_balance_warning": false,
  "auto_reload_enabled": true,
  "free_tier": {
    "has_free_tier": true,
    "limit": 10.00,
    "remaining": 3.50
  }
}
Get Usage Summary
GET /api/billing/summary
Returns aggregate usage metrics for the current billing period.
Swfte Options
The swfte_options object in chat completion requests controls gateway behavior:
| Field | Type | Description |
|---|---|---|
| fallback_models | string[] | Ordered list of fallback models |
| routing | string | lowest_latency, lowest_cost, or weighted |
| eligible_providers | string[] | Restrict routing to specific providers |
| weights | object | Provider weights for weighted routing |
| timeout_ms | integer | Request timeout in milliseconds |
| max_retries | integer | Maximum retry attempts |
| retry_delay_ms | integer | Delay between retries |
| cache | boolean | Enable response caching |
| cache_ttl_s | integer | Cache time-to-live in seconds |
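A small builder can keep swfte_options limited to the fields in the table above. This sketch is an illustration rather than gateway behavior: build_swfte_options is a hypothetical helper, and rejecting weights unless routing is weighted is an assumption drawn from the table's wording.

```python
from typing import Any, Dict, Mapping, Optional, Sequence

ROUTING_STRATEGIES = {"lowest_latency", "lowest_cost", "weighted"}
OTHER_FIELDS = {
    "eligible_providers", "timeout_ms", "max_retries",
    "retry_delay_ms", "cache", "cache_ttl_s",
}


def build_swfte_options(routing: Optional[str] = None,
                        fallback_models: Sequence[str] = (),
                        weights: Optional[Mapping[str, float]] = None,
                        **extra: Any) -> Dict[str, Any]:
    """Assemble a swfte_options object from the documented fields."""
    unknown = set(extra) - OTHER_FIELDS
    if unknown:
        raise ValueError(f"unknown swfte_options fields: {sorted(unknown)}")
    if routing is not None and routing not in ROUTING_STRATEGIES:
        raise ValueError(f"routing must be one of {sorted(ROUTING_STRATEGIES)}")
    # Assumption: weights are only meaningful together with weighted routing.
    if weights is not None and routing != "weighted":
        raise ValueError("weights require routing='weighted'")

    options: Dict[str, Any] = dict(extra)
    if routing is not None:
        options["routing"] = routing
    if fallback_models:
        options["fallback_models"] = list(fallback_models)  # order is preserved
    if weights is not None:
        options["weights"] = dict(weights)
    return options
```

The result can be passed as the swfte_options value of a chat completion request body.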
Error Codes
| Status | Code | Description |
|---|---|---|
| 400 | invalid_request | Malformed request body or missing required fields |
| 401 | invalid_api_key | API key is missing, invalid, or expired |
| 402 | insufficient_credits | Credit balance is zero |
| 403 | insufficient_scope | API key does not have permission for this endpoint |
| 404 | model_not_found | Requested model is not available |
| 429 | rate_limit_exceeded | Rate limit hit; check Retry-After header |
| 500 | internal_error | Gateway internal error |
| 502 | provider_error | Upstream provider returned an error |
| 503 | provider_unavailable | All providers in the routing chain are unavailable |
| 504 | timeout | Request exceeded the configured timeout |
Error Response Format:
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded for key key_abc123. Limit: 60 req/min.",
    "type": "rate_limit_error",
    "retry_after": 12
  }
}
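Clients that do not use an SDK need to decide which errors are worth retrying. The sketch below is one possible policy, not the gateway's or the SDKs' documented behavior: treating 429, 502, 503, and 504 as retryable is an assumption, retry_after is assumed to be in seconds, and exponential backoff is used when no hint is present. The function name retry_delay is hypothetical.

```python
import json
from typing import Optional

# Assumption: these statuses are transient; 4xx client errors are not retried.
RETRYABLE_STATUSES = {429, 502, 503, 504}


def retry_delay(status: int, body: str, attempt: int,
                base_delay: float = 0.5) -> Optional[float]:
    """Return seconds to wait before retrying, or None if not retryable.

    attempt is zero-based: 0 for the first retry, 1 for the second, and so on.
    """
    if status not in RETRYABLE_STATUSES:
        return None
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = {}
    error = parsed.get("error") if isinstance(parsed, dict) else None
    hinted = error.get("retry_after") if isinstance(error, dict) else None
    if isinstance(hinted, (int, float)) and not isinstance(hinted, bool):
        return float(hinted)  # prefer the server's hint, assumed to be seconds
    return base_delay * (2 ** attempt)  # exponential backoff as a fallback
```

A caller would sleep for the returned duration and stop after a fixed number of attempts.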
Rate Limits
Default rate limits per API key:
| Metric | Free Tier | Pro | Enterprise |
|---|---|---|---|
| Requests per minute | 20 | 120 | Custom |
| Tokens per minute | 40,000 | 500,000 | Custom |
| Concurrent requests | 5 | 25 | Custom |
Rate limit headers are included in every response:
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 118
X-RateLimit-Reset: 1711648060
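These headers let a client pause before it hits the limit instead of reacting to a 429. The sketch assumes X-RateLimit-Reset is a Unix timestamp in seconds, which the example value suggests but the reference does not state, and that header names are looked up exactly as written above. The function name is hypothetical.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Return how long to wait before the next request; 0.0 if none is needed."""
    remaining = headers.get("X-RateLimit-Remaining")
    reset = headers.get("X-RateLimit-Reset")
    if remaining is None or reset is None:
        return 0.0  # headers absent: nothing to base a wait on
    if int(remaining) > 0:
        return 0.0
    # Assumption: the reset value is a Unix timestamp in seconds.
    current = time.time() if now is None else now
    return max(0.0, float(reset) - current)
```

Header objects from Python's HTTP libraries are usually case-insensitive; a plain dict, as used in tests, is not.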
SDKs
Official SDKs handle authentication, retries, streaming, and error parsing:
- Python: pip install swfte-sdk -- SDK Guide
- JavaScript/TypeScript: npm install swfte-sdk -- SDK Guide
- Java: com.swfte:swfte-sdk:1.0.0 -- SDK Guide
Next Steps
- Getting Started -- First-time setup
- SDK Guide -- Full SDK reference
- Multi-Provider Routing -- Routing strategies
- Cost Optimization -- Budget controls