
The Swfte SDK provides a unified interface to every major AI provider. This guide covers the SDK surface area across Python, JavaScript, and Java, from basic chat completions to streaming, function calling, embeddings, and error handling. The examples shown here are in Python.


Installation

# Python (3.9+)
pip install swfte-sdk

Client Initialization

from swfte import SwfteClient

# From environment variable SWFTE_API_KEY
client = SwfteClient()

# Explicit key
client = SwfteClient(api_key="sk-swfte-...")

# Custom base URL (self-hosted gateway)
client = SwfteClient(
    api_key="sk-swfte-...",
    base_url="https://gateway.yourcompany.com"
)
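
In a deployed service it is often convenient to keep both settings out of the code. The following is a minimal sketch, not part of the SDK: it builds the keyword arguments for SwfteClient from the environment and fails early when the key is missing. SWFTE_API_KEY is the variable named above; SWFTE_BASE_URL and the helper name client_kwargs_from_env are hypothetical and used only for illustration.

```python
import os


def client_kwargs_from_env(env=None):
    """Build keyword arguments for SwfteClient from environment variables.

    SWFTE_API_KEY is the variable named in this guide. SWFTE_BASE_URL is a
    hypothetical name chosen for this sketch, not an SDK-defined variable.
    """
    env = os.environ if env is None else env

    api_key = env.get("SWFTE_API_KEY")
    if not api_key:
        raise RuntimeError("SWFTE_API_KEY is not set")

    kwargs = {"api_key": api_key}

    base_url = env.get("SWFTE_BASE_URL")  # assumption: illustrative name only
    if base_url:
        # Drop a trailing slash so the URL has one consistent form.
        kwargs["base_url"] = base_url.rstrip("/")
    return kwargs
```

The result can then be passed along as SwfteClient(**client_kwargs_from_env()).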

Chat Completions

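The chat.completions.create method is the core entry point for text generation. It accepts the standard sampling parameters shown below: max_tokens, temperature, top_p, and stop.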

response = client.chat.completions.create(
    model="anthropic:claude-sonnet-4",
    messages=[
        {"role": "system", "content": "You are a technical writer."},
        {"role": "user", "content": "Write a function docstring for a binary search."}
    ],
    max_tokens=512,
    temperature=0.3,
    top_p=0.9,
    stop=["\n\n\n"]
)

content = response.choices[0].message.content
usage = response.usage  # prompt_tokens, completion_tokens, total_tokens
model_used = response.model  # The actual model that served the request
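
For logging or tests it can help to flatten those fields into a plain dictionary. The sketch below relies only on the attribute names used above (choices, message.content, usage, model); the helper name summarize_response is hypothetical, and the stand-in object exists only so the example runs without a network call.

```python
from types import SimpleNamespace


def summarize_response(response):
    """Collect the fields read above into a plain dict."""
    usage = response.usage
    return {
        "content": response.choices[0].message.content,
        "model": response.model,
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "total_tokens": usage.total_tokens,
    }


# Stand-in with the same attribute layout, so the sketch runs offline.
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="Binary search..."))],
    usage=SimpleNamespace(prompt_tokens=20, completion_tokens=10, total_tokens=30),
    model="anthropic:claude-sonnet-4",
)
summary = summarize_response(fake_response)
```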

Streaming Responses

Pass stream=True to receive the response incrementally, which suits chat interfaces and long-form generation.

stream = client.chat.completions.create(
    model="openai:gpt-5",
    messages=[{"role": "user", "content": "Write a short story about a robot."}],
    max_tokens=1024,
    stream=True
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)

Function Calling (Tool Use)

Define tools that the model can invoke. Swfte normalizes the function calling interface across providers.

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="openai:gpt-5",
    messages=[{"role": "user", "content": "What's the weather in London?"}],
    tools=tools,
    tool_choice="auto"
)

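# With tool_choice="auto" the model may answer directly, so check before indexing
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    tool_call = tool_calls[0]
    print(f"Function: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")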
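
Once the model has requested a tool, the application still has to run it. The sketch below shows one way to dispatch a tool call to a local Python function. It assumes that function.arguments arrives as a JSON-encoded string, that the tool call carries an id attribute, and that results go back to the model as a message with role "tool"; all three follow the common OpenAI-style convention and are not confirmed by this guide. The get_weather body is a placeholder, and TOOL_REGISTRY and run_tool_call are hypothetical names.

```python
import json


def get_weather(location, unit="celsius"):
    """Placeholder implementation; a real one would query a weather service."""
    return {"location": location, "unit": unit, "temperature": None}


# Maps tool names from the tools list to local callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(tool_call, registry=TOOL_REGISTRY):
    """Execute one tool call and return a tool-role message with the result."""
    name = tool_call.function.name
    if name not in registry:
        raise KeyError(f"Model requested unknown tool: {name}")

    # Assumption: arguments are a JSON-encoded string, possibly empty.
    args = json.loads(tool_call.function.arguments or "{}")
    result = registry[name](**args)

    return {
        "role": "tool",
        "tool_call_id": getattr(tool_call, "id", None),  # assumption: id attribute
        "content": json.dumps(result),
    }
```

Keeping the registry explicit means the model can only trigger functions that were deliberately exposed.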

Embeddings

Generate vector embeddings for search, clustering, and RAG pipelines.

response = client.embeddings.create(
    model="openai:text-embedding-3-large",
    input=["Swfte Connect is an AI gateway", "It routes traffic across providers"]
)

for embedding in response.data:
    print(f"Dimension: {len(embedding.embedding)}")
    print(f"First 5 values: {embedding.embedding[:5]}")
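
For search and RAG, embeddings are typically compared by cosine similarity. A minimal pure-Python sketch follows; it takes plain lists of floats such as embedding.embedding above. The function names are hypothetical, and at scale a vector library or vector database would normally replace this.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs for doc_vecs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

With the response above, rank_by_similarity(query_vec, [e.embedding for e in response.data]) would order the input texts by closeness to a query embedding.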

Error Handling

The SDK raises typed exceptions you can catch and handle gracefully.

from swfte.exceptions import (
    SwfteAuthError,
    SwfteRateLimitError,
    SwfteProviderError,
    SwfteTimeoutError
)

try:
    response = client.chat.completions.create(
        model="openai:gpt-5",
        messages=[{"role": "user", "content": "Hello"}]
    )
except SwfteAuthError:
    print("Invalid API key -- check your credentials")
except SwfteRateLimitError as e:
    print(f"Rate limited -- retry after {e.retry_after}s")
except SwfteProviderError as e:
    print(f"Provider {e.provider} error: {e.message}")
except SwfteTimeoutError:
    print("Request timed out -- consider increasing timeout or using a faster model")
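
Rate limits and timeouts are usually transient, so a retry wrapper is a natural companion to these handlers. The sketch below is one possible approach, not an SDK feature. It waits for retry_after when the exception carries one (as SwfteRateLimitError does above) and otherwise falls back to exponential backoff. The name call_with_retry and its parameters are hypothetical.

```python
import time


def call_with_retry(fn, retry_on, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying when it raises one of the retry_on exceptions.

    retry_on is an exception class or a tuple of classes. The final failure
    is re-raised unchanged so callers can still handle it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == max_attempts:
                raise
            # Prefer the server's hint when present, else back off exponentially.
            delay = getattr(exc, "retry_after", None)
            if delay is None:
                delay = base_delay * 2 ** (attempt - 1)
            sleep(delay)
```

A call might look like call_with_retry(lambda: client.chat.completions.create(model="openai:gpt-5", messages=[{"role": "user", "content": "Hello"}]), retry_on=(SwfteRateLimitError, SwfteTimeoutError)). Authentication errors are deliberately left out of retry_on, since retrying with a bad key cannot succeed.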

Request Metadata

Every response includes metadata about routing and cost:

response = client.chat.completions.create(
    model="openai:gpt-5",
    messages=[{"role": "user", "content": "Hello"}]
)

print(response.usage.prompt_tokens)
print(response.usage.completion_tokens)
print(response.usage.total_tokens)
print(response.model)              # Actual model used
print(response.swfte_metadata)     # Gateway metadata (cost, latency, provider)

The swfte_metadata field includes:

  • cost -- Actual cost of the request in USD
  • latency_ms -- End-to-end latency in milliseconds
  • provider -- Which provider served the request
  • cached -- Whether a cached response was returned
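
These fields lend themselves to simple cost and latency tracking. The sketch below aggregates them across several responses. This guide does not say whether swfte_metadata is a dict or an object, so the helper accepts either; that flexibility, and the names _field and summarize_metadata, are assumptions made for illustration.

```python
def _field(meta, name, default=None):
    """Read a field from metadata that may be a dict or an object."""
    if isinstance(meta, dict):
        return meta.get(name, default)
    return getattr(meta, name, default)


def summarize_metadata(responses):
    """Aggregate cost, latency, and cache hits across several responses."""
    summary = {
        "requests": 0,
        "total_cost_usd": 0.0,
        "max_latency_ms": 0,
        "cached": 0,
        "by_provider": {},
    }
    for response in responses:
        meta = response.swfte_metadata
        summary["requests"] += 1
        summary["total_cost_usd"] += _field(meta, "cost", 0.0) or 0.0
        latency = _field(meta, "latency_ms", 0) or 0
        summary["max_latency_ms"] = max(summary["max_latency_ms"], latency)
        if _field(meta, "cached", False):
            summary["cached"] += 1
        provider = _field(meta, "provider", "unknown")
        summary["by_provider"][provider] = summary["by_provider"].get(provider, 0) + 1
    return summary
```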
