This is the complete REST API reference for Swfte Connect. Gateway endpoints are served under https://connect.swfte.com/v2/gateway/ and authenticate with your API key; the key-management and billing endpoints documented below use /api/ paths.


Authentication

All requests require an API key passed in the Authorization header:

curl https://connect.swfte.com/v2/gateway/chat/completions \
  -H "Authorization: Bearer sk-swfte-your-key-here" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai:gpt-5", "messages": [{"role": "user", "content": "Hello"}]}'

API keys are scoped to a workspace and can be restricted to specific endpoints. Create and manage keys at connect.swfte.com or via the /api/api-keys endpoints described under API Keys.
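
As a rough Python equivalent of the curl call above, the following sketch builds the same authenticated request using only the standard library. The helper name build_chat_request is hypothetical, and the request is constructed but not sent, so the snippet runs without a real key.

```python
import json
import urllib.request

GATEWAY_URL = "https://connect.swfte.com/v2/gateway/chat/completions"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated chat completion request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The key goes in the Authorization header as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("sk-swfte-your-key-here", "openai:gpt-5", "Hello")
# To send it for real: urllib.request.urlopen(req, timeout=30)
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before any network call is made.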

Key Format

Keys follow the pattern sk-swfte-{random}. They are 48 characters long and case-sensitive.

Key Scopes

| Scope | Endpoints Allowed |
|---|---|
| all | All endpoints |
| chat | /v2/gateway/chat/completions only |
| agents | Agent-related endpoints only |
| embeddings | /v2/gateway/embeddings only |

Base URL

https://connect.swfte.com/v2/gateway

For self-hosted deployments, replace with your gateway URL.
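
One way to keep client code portable between the hosted and self-hosted gateways is to make the base URL a parameter. The sketch below is a minimal illustration; the function name endpoint_url and the self-hosted address are hypothetical.

```python
def endpoint_url(base_url: str, path: str) -> str:
    """Join a gateway base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


DEFAULT_BASE = "https://connect.swfte.com/v2/gateway"
# Hypothetical self-hosted gateway address, for illustration only.
SELF_HOSTED_BASE = "https://gateway.example.internal/"

print(endpoint_url(DEFAULT_BASE, "/chat/completions"))
print(endpoint_url(SELF_HOSTED_BASE, "embeddings"))
```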


Chat Completions

Create Chat Completion

POST /v2/gateway/chat/completions

Request Body:

{
  "model": "openai:gpt-5",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "max_tokens": 256,
  "temperature": 0.7,
  "top_p": 1.0,
  "n": 1,
  "stream": false,
  "stop": null,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "tools": null,
  "tool_choice": null,
  "swfte_options": {
    "fallback_models": ["anthropic:claude-sonnet-4"],
    "routing": "lowest_latency",
    "timeout_ms": 30000
  }
}

Parameters:

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model identifier in provider:model format |
| messages | array | Yes | Array of message objects with role and content |
| max_tokens | integer | No | Maximum tokens to generate (default: model-specific) |
| temperature | number | No | Sampling temperature 0-2 (default: 1.0) |
| top_p | number | No | Nucleus sampling parameter (default: 1.0) |
| n | integer | No | Number of completions to generate (default: 1) |
| stream | boolean | No | Stream response chunks (default: false) |
| stop | string/array | No | Stop sequences |
| tools | array | No | Tool/function definitions |
| tool_choice | string/object | No | Tool selection strategy |
| swfte_options | object | No | Swfte-specific routing and control options |
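
The constraints in the parameter list can be checked on the client before a request is sent. The sketch below is one possible approach, not part of any official SDK: build_chat_payload is a hypothetical helper, and the rule that messages must be non-empty is an assumption.

```python
from typing import Any, Dict, List


def build_chat_payload(model: str, messages: List[Dict[str, str]], **options: Any) -> Dict[str, Any]:
    """Validate the required fields and merge optional ones into a request body."""
    provider, sep, name = model.partition(":")
    if not (provider and sep and name):
        raise ValueError("model must use the provider:model format, e.g. openai:gpt-5")
    if not messages:  # assumption: at least one message is required
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs a role and content")
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    # Drop options set to None so the gateway applies its own defaults.
    payload: Dict[str, Any] = {"model": model, "messages": messages}
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload


payload = build_chat_payload(
    "openai:gpt-5",
    [{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=256,
    temperature=0.7,
    stop=None,
)
```

Failing fast on the client avoids spending a round trip on a request that would come back as 400 invalid_request.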

Response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1711648000,
  "model": "gpt-5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  },
  "swfte_metadata": {
    "cost": 0.000495,
    "latency_ms": 234,
    "provider": "openai",
    "cached": false,
    "request_id": "req_xyz789"
  }
}
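
To show how the response fields fit together, the sketch below pulls the generated text, token usage, and gateway metadata out of an abbreviated copy of the sample response. The helper summarize_completion is hypothetical, and treating swfte_metadata as optional is an assumption.

```python
import json

# Abbreviated copy of the sample response shown above.
SAMPLE = json.loads("""
{
  "id": "chatcmpl-abc123",
  "model": "gpt-5",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "The capital of France is Paris."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 8, "total_tokens": 33},
  "swfte_metadata": {"cost": 0.000495, "latency_ms": 234, "provider": "openai",
                     "cached": false, "request_id": "req_xyz789"}
}
""")


def summarize_completion(response: dict) -> dict:
    """Extract the first choice plus usage and gateway metadata."""
    choice = response["choices"][0]
    meta = response.get("swfte_metadata", {})  # assumption: may be absent
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": response["usage"]["total_tokens"],
        "provider": meta.get("provider"),
        "cost": meta.get("cost"),
        "request_id": meta.get("request_id"),
    }


summary = summarize_completion(SAMPLE)
print(summary["text"])
```

Logging request_id alongside each call is a common way to make individual requests traceable later.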

Streaming

When stream: true, the response is delivered as Server-Sent Events (SSE):

curl https://connect.swfte.com/v2/gateway/chat/completions \
  -H "Authorization: Bearer sk-swfte-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "openai:gpt-5", "messages": [{"role": "user", "content": "Hello"}], "stream": true}'

Each chunk follows this format:

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"The"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" capital"},"finish_reason":null}]}

data: [DONE]
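
A client consuming this stream needs to strip the data: prefix, stop at the [DONE] sentinel, and concatenate the delta fragments. The following is a minimal parser sketch over already-received lines; the function names are hypothetical, and it assumes n is 1 so that all deltas belong to a single completion.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed JSON chunks from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the delta.content fragments of every chunk."""
    parts = []
    for chunk in iter_sse_chunks(lines):
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


sample_lines = [
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"The"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" capital"},"finish_reason":null}]}',
    "",
    "data: [DONE]",
]
print(collect_text(sample_lines))  # prints "The capital"
```

Because iter_sse_chunks accepts any iterable of strings, the same code can be fed decoded lines from a live HTTP response.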

Embeddings

Create Embedding

POST /v2/gateway/embeddings

Request Body:

{
  "model": "openai:text-embedding-3-large",
  "input": ["First text to embed", "Second text to embed"],
  "encoding_format": "float",
  "dimensions": 1536
}

Parameters:

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Embedding model identifier |
| input | string/array | Yes | Text(s) to embed |
| encoding_format | string | No | float or base64 (default: float) |
| dimensions | integer | No | Output dimension (model-dependent) |

Response:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0023, -0.0091, 0.0154, ...],
      "index": 0
    }
  ],
  "model": "text-embedding-3-large",
  "usage": {
    "prompt_tokens": 12,
    "total_tokens": 12
  }
}
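
A typical next step is to compare the returned vectors. The sketch below assumes encoding_format is float and that each item's index matches the position of the corresponding text in input; the three-dimensional vectors are made up for illustration and are not real model output.

```python
import math
from typing import List


def vectors_in_input_order(response: dict) -> List[List[float]]:
    """Return embedding vectors sorted by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy response with made-up vectors, deliberately listed out of order.
toy_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "embedding": [0.0, 1.0, 0.0], "index": 1},
        {"object": "embedding", "embedding": [1.0, 0.0, 0.0], "index": 0},
    ],
}

first, second = vectors_in_input_order(toy_response)
print(cosine_similarity(first, second))
```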

Models

List Available Models

GET /v2/gateway/models

Returns all models available through your connected providers.

Response:

{
  "object": "list",
  "data": [
    {
      "id": "openai:gpt-5",
      "object": "model",
      "provider": "openai",
      "name": "GPT-5",
      "type": "TEXT_GENERATION",
      "context_window": 128000,
      "max_output_tokens": 16384,
      "pricing": {
        "input_per_1m_tokens": 5.00,
        "output_per_1m_tokens": 15.00
      }
    }
  ]
}
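
The pricing object can be used for rough client-side cost estimates. The sketch below assumes cost scales linearly with token counts at the listed per-million rates; the cost the gateway actually reports in swfte_metadata may differ, for example because of caching or other adjustments not described here.

```python
def estimate_cost(pricing: dict, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost from per-1M-token input and output prices."""
    input_cost = prompt_tokens * pricing["input_per_1m_tokens"] / 1_000_000
    output_cost = completion_tokens * pricing["output_per_1m_tokens"] / 1_000_000
    return input_cost + output_cost


# Pricing values copied from the sample model entry above.
pricing = {"input_per_1m_tokens": 5.00, "output_per_1m_tokens": 15.00}

# 1,000 input tokens and 500 output tokens:
# 1000 * 5 / 1e6 + 500 * 15 / 1e6 = 0.005 + 0.0075
estimate = estimate_cost(pricing, prompt_tokens=1000, completion_tokens=500)
print(f"{estimate:.4f}")  # prints 0.0125
```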

API Keys

List Keys

GET /api/dashboard/api-keys

Response:

{
  "keys": [
    {
      "id": "key_abc123",
      "name": "production-backend",
      "keyPrefix": "sk-swfte-abc...",
      "enabled": true,
      "totalRequests": 15234,
      "totalTokens": 4521000,
      "totalCost": 67.83,
      "createdAt": "2026-01-15T10:00:00Z",
      "lastUsedAt": "2026-03-28T14:30:00Z",
      "status": "healthy"
    }
  ],
  "count": 1,
  "activeCount": 1,
  "totalRequests": 15234
}

Create Key

POST /api/api-keys

Request Body:

{
  "name": "my-new-key",
  "scope": "all",
  "rateLimit": {
    "requestsPerMinute": 60
  }
}

Delete Key

DELETE /api/api-keys/{keyId}

Billing

Get Credit Balance

GET /api/billing/credits

Response:

{
  "balance": 142.50,
  "currency": "USD",
  "low_balance_warning": false,
  "auto_reload_enabled": true,
  "free_tier": {
    "has_free_tier": true,
    "limit": 10.00,
    "remaining": 3.50
  }
}

Get Usage Summary

GET /api/billing/summary

Returns aggregate usage metrics for the current billing period.


Swfte Options

The swfte_options object in chat completion requests controls gateway behavior:

| Field | Type | Description |
|---|---|---|
| fallback_models | string[] | Ordered list of fallback models |
| routing | string | lowest_latency, lowest_cost, or weighted |
| eligible_providers | string[] | Restrict routing to specific providers |
| weights | object | Provider weights for weighted routing |
| timeout_ms | integer | Request timeout in milliseconds |
| max_retries | integer | Maximum retry attempts |
| retry_delay_ms | integer | Delay between retries |
| cache | boolean | Enable response caching |
| cache_ttl_s | integer | Cache time-to-live in seconds |

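The sketch below shows one way to assemble a swfte_options object and catch obvious mistakes early. It rests on two assumptions not confirmed by this reference: that weights maps provider names to numeric weights, and that weights must be supplied when routing is weighted. The helper name build_swfte_options is hypothetical.

```python
from typing import Any, Dict, List, Optional

ROUTING_STRATEGIES = {"lowest_latency", "lowest_cost", "weighted"}


def build_swfte_options(
    routing: str,
    *,
    fallback_models: Optional[List[str]] = None,
    weights: Optional[Dict[str, float]] = None,
    timeout_ms: Optional[int] = None,
    max_retries: Optional[int] = None,
    cache: Optional[bool] = None,
    cache_ttl_s: Optional[int] = None,
) -> Dict[str, Any]:
    """Return a swfte_options dict containing only the fields that were set."""
    if routing not in ROUTING_STRATEGIES:
        raise ValueError(f"routing must be one of {sorted(ROUTING_STRATEGIES)}")
    if routing == "weighted" and not weights:
        # Assumption: weighted routing is meaningless without weights.
        raise ValueError("weighted routing needs a weights object")
    options = {
        "routing": routing,
        "fallback_models": fallback_models,
        "weights": weights,
        "timeout_ms": timeout_ms,
        "max_retries": max_retries,
        "cache": cache,
        "cache_ttl_s": cache_ttl_s,
    }
    return {key: value for key, value in options.items() if value is not None}


options = build_swfte_options(
    "weighted",
    weights={"openai": 0.7, "anthropic": 0.3},  # assumed shape: provider -> weight
    fallback_models=["anthropic:claude-sonnet-4"],
    timeout_ms=30000,
)
```

The resulting dict can be placed under the swfte_options key of a chat completion request body.
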
Error Codes

| Status | Code | Description |
|---|---|---|
| 400 | invalid_request | Malformed request body or missing required fields |
| 401 | invalid_api_key | API key is missing, invalid, or expired |
| 402 | insufficient_credits | Credit balance is zero |
| 403 | insufficient_scope | API key does not have permission for this endpoint |
| 404 | model_not_found | Requested model is not available |
| 429 | rate_limit_exceeded | Rate limit hit; check Retry-After header |
| 500 | internal_error | Gateway internal error |
| 502 | provider_error | Upstream provider returned an error |
| 503 | provider_unavailable | All providers in the routing chain are unavailable |
| 504 | timeout | Request exceeded the configured timeout |

Error Response Format:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded for key key_abc123. Limit: 60 req/min.",
    "type": "rate_limit_error",
    "retry_after": 12
  }
}
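
Client code usually needs to decide whether an error is worth retrying and how long to wait. The sketch below treats 429 and the 5xx statuses as retryable, which is a common convention but an assumption here rather than documented gateway behavior; the helper name retry_delay and the one-second default are also hypothetical.

```python
from typing import Optional

# Assumption: rate limits and server-side failures are transient; 4xx errors
# such as 400, 401, 402, 403, and 404 need a change to the request instead.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def retry_delay(status: int, body: dict, default_delay: float = 1.0) -> Optional[float]:
    """Return the seconds to wait before retrying, or None if not retryable."""
    if status not in RETRYABLE_STATUSES:
        return None
    retry_after = body.get("error", {}).get("retry_after")
    # Prefer the hint from the error body when the gateway provides one.
    return float(retry_after) if retry_after is not None else default_delay


rate_limited = {
    "error": {
        "code": "rate_limit_exceeded",
        "message": "Rate limit exceeded for key key_abc123. Limit: 60 req/min.",
        "type": "rate_limit_error",
        "retry_after": 12,
    }
}
print(retry_delay(429, rate_limited))  # prints 12.0
```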

Rate Limits

Default rate limits per API key:

| Metric | Free Tier | Pro | Enterprise |
|---|---|---|---|
| Requests per minute | 20 | 120 | Custom |
| Tokens per minute | 40,000 | 500,000 | Custom |
| Concurrent requests | 5 | 25 | Custom |

Rate limit headers are included in every response:

X-RateLimit-Limit: 120
X-RateLimit-Remaining: 118
X-RateLimit-Reset: 1711648060
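
These headers let a client throttle itself before it hits a 429. The sketch below assumes X-RateLimit-Reset is a Unix timestamp in seconds, which the sample value suggests but this reference does not state; parse_rate_limit is a hypothetical helper.

```python
import time
from typing import Dict, Optional


def parse_rate_limit(headers: Dict[str, str], now: Optional[float] = None) -> dict:
    """Summarize the X-RateLimit-* headers of a response."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    now = time.time() if now is None else now
    limit = int(lowered["x-ratelimit-limit"])
    remaining = int(lowered["x-ratelimit-remaining"])
    reset_at = int(lowered["x-ratelimit-reset"])  # assumed: Unix epoch seconds
    return {
        "limit": limit,
        "remaining": remaining,
        "seconds_until_reset": max(0.0, reset_at - now),
        "exhausted": remaining <= 0,
    }


sample_headers = {
    "X-RateLimit-Limit": "120",
    "X-RateLimit-Remaining": "118",
    "X-RateLimit-Reset": "1711648060",
}
# A fixed "now" keeps the example deterministic.
status = parse_rate_limit(sample_headers, now=1711648000)
print(status["seconds_until_reset"])  # prints 60.0
```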

SDKs

Official SDKs handle authentication, retries, streaming, and error parsing:

  • Python: pip install swfte-sdk (see the SDK Guide)
  • JavaScript/TypeScript: npm install swfte-sdk (see the SDK Guide)
  • Java: com.swfte:swfte-sdk:1.0.0 (see the SDK Guide)
