AI Vendor Lock-In, in Plain English
What lock-in actually means when your AI provider is a chat endpoint, not a database. Seven dimensions, one ugly case study, a vendor scorecard, and the Lock-in Tax Calculator — a 5-question rubric that estimates your annual lock-in cost in under a minute.
Lock-in is no longer about the model
For a long time the AI lock-in conversation was dominated by "did we pick the best model?" As of May 2026 that question is effectively settled at the top: the four leading frontier models sit within 40 Elo of each other on LMArena. Capability parity makes switching cost the dominant cost, so lock-in is now a function of how your stack is shaped, not which model is "best." 67% of enterprise AI buyers now list provider independence as a top-three procurement criterion, up from under 20% two years ago.
Case study: the Builder.ai collapse
Builder.ai is the canonical 2020s lock-in failure. It sold a vertically integrated "build apps with AI" platform, and customers wrote their workflows, prompts, integrations, and data ingestion against Builder.ai's proprietary surface. When the company collapsed in 2025, those customers had no portable artifacts: the prompts were tied to Builder's templating layer, the agents to Builder's tool schema, the data to Builder's storage format. Every customer had to rebuild from scratch.
The lesson is not "avoid platforms." The lesson is that portable artifacts matter more than features. Prompts, tool definitions, and evals should be exportable in formats that survive their authoring tool. If a vendor cannot export your work in a portable format, you have full lock-in.
The seven lock-in dimensions
| Dimension | Sticky example | Loose example | Weight |
|---|---|---|---|
| Tool / function-calling schema: how portable are your agent tool definitions across providers? | OpenAI function calling, Anthropic tool_use | Generic JSON Schema with an adapter layer | 5/5 |
| Prompt structure: will your prompts re-translate cleanly, or are they vendor-specific? | Anthropic XML tagging, GPT system-role conventions | Plain-English instruction prompts in markdown | 4/5 |
| Tokenizer behaviour: does the same input text produce the same number of billed tokens? | Anthropic 4.7 tokenizer bills 35% more tokens than 4.6 on the same input | BPE-50K-family tokenizers (Llama, GPT-style) | 3/5 |
| Data egress and IAM coupling: the cost and friction of moving conversation logs and embeddings out. | Vertex IAM, Azure OpenAI in tenant VPCs, Bedrock IAM | S3 export, plain object storage | 5/5 |
| Eval harness re-run: the cost of re-validating prompts and accuracy on the new provider. | Hand-tuned prompts coupled to one model family | Eval suite parametrised by provider | 4/5 |
| Behavioural drift: does the new provider output the same shape, tone, and structure? | Downstream parsers tuned to one model | Schema-validated output with retry on parse failure | 4/5 |
| Compliance re-certification: SOC 2, DPA, regional data-residency, and regulated-industry audit costs. | Single-vendor DPA covers your whole AI estate | Vendor-agnostic data-handling controls | 3/5 |
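To make the heaviest dimension concrete, here is a minimal sketch of the adapter-layer pattern: tools are authored once as plain JSON Schema that you own, and thin translators emit each provider's wire format. The tool itself (`search_orders`) is a hypothetical example; the envelope shapes follow OpenAI's function-calling and Anthropic's tool_use formats.

```python
# Provider-neutral tool definition: plain JSON Schema, owned by you.
SEARCH_ORDERS = {
    "name": "search_orders",
    "description": "Look up customer orders by status and date range.",
    "parameters": {  # standard JSON Schema
        "type": "object",
        "properties": {
            "status": {"type": "string", "enum": ["open", "shipped", "returned"]},
            "since": {"type": "string", "format": "date"},
        },
        "required": ["status"],
    },
}

def to_openai(tool: dict) -> dict:
    """Wrap the neutral definition in OpenAI's function-calling envelope."""
    return {"type": "function", "function": tool}

def to_anthropic(tool: dict) -> dict:
    """Rename 'parameters' to 'input_schema' for Anthropic tool_use."""
    return {
        "name": tool["name"],
        "description": tool["description"],
        "input_schema": tool["parameters"],
    }
```

Switching providers then touches the two adapters, not the tool definitions themselves; without the adapter, every tool gets rewritten by hand, which is why this dimension carries a 5/5 weight.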
Vendor scorecard
Lock-in scores (0 = none, 35 = severe), May 2026:

| Vendor | Category | Score | Notes |
|---|---|---|---|
| OpenAI Direct | closed-frontier | 28/35 (severe) | Proprietary tool schema; sticky JSON output handling. |
| Anthropic Direct | closed-frontier | 26/35 (severe) | Distinct prompt style and tokenizer churn between Opus versions. |
| Azure OpenAI | gateway | 26/35 (severe) | OpenAI-only gateway; OpenAI lock-in plus Azure coupling. |
| Google Vertex | closed-frontier | 24/35 (material) | Strong compliance, sticky IAM and egress. |
| AWS Bedrock | gateway | 21/35 (material) | Multi-provider, but AWS IAM/VPC coupling is real. |
| Together AI | aggregator | 14/35 (manageable) | Open-weights aggregator with a shifting hosted catalogue. |
| OpenRouter | aggregator | 13/35 (manageable) | OpenAI-compatible API; switching is a config change. |
| Swfte Connect | gateway | 10/35 (manageable) | Provider-agnostic abstraction designed to drop sticky surfaces. |
| Self-hosted (DeepSeek, Gemma) | open-frontier | 7/35 (near-zero) | Apache 2.0; switching is a model-file swap. |
Full audit framework in our Model Exit-Cost Audit Framework writeup.
Mitigation patterns
| Pattern | Reduces | Cost |
|---|---|---|
| OpenAI-compatible gateway in front of all providers: the most impactful single move; drops 4 of the 7 dimensions (first sketch below). | Tool schema, prompt structure, behavioural drift | Low: one engineer-month |
| Provider-parametric eval harness: run all evals on every provider release to catch drift early. | Eval re-run, behavioural drift | Medium: one engineer-quarter |
| Schema-validated structured output: Pydantic / Zod validation with retries on parse failure (second sketch below). | Behavioural drift | Low: protocol-level fix |
| Cloud-neutral data layer: object store as primary; vendor-specific endpoints as caches. | Data egress, IAM coupling | Medium: architectural |
| Vendor-agnostic compliance posture: DPAs and data-handling controls written for the workload, not the vendor. | Compliance re-cert | High: legal-first |
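Two of these patterns fit in a few lines each. First, the gateway move: because gateways and aggregators expose an OpenAI-compatible API, a provider switch reduces to a base-URL and model-name change. A minimal sketch using the `openai` Python client; the URLs, model names, and `ROUTES` config are illustrative, not real endpoints.

```python
import os
from openai import OpenAI

# Per-route provider config lives outside the application code, so a
# provider swap is a config change, not a rewrite. Values are illustrative.
ROUTES = {
    "primary":  {"base_url": "https://gateway.internal/v1", "model": "gpt-5"},
    "fallback": {"base_url": "https://openrouter.ai/api/v1", "model": "deepseek/deepseek-chat"},
}

def complete(prompt: str, route: str = "primary") -> str:
    cfg = ROUTES[route]
    client = OpenAI(base_url=cfg["base_url"], api_key=os.environ["LLM_API_KEY"])
    resp = client.chat.completions.create(
        model=cfg["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```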
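Second, schema-validated structured output: downstream code never parses free-form model text, it validates against a schema, and a parse failure triggers a bounded retry instead of a crash. A sketch with Pydantic, reusing the hypothetical `complete()` helper above; production code would also strip markdown fences or use a provider's native structured-output mode where available.

```python
from pydantic import BaseModel, ValidationError

class TicketTriage(BaseModel):
    severity: str  # e.g. "low" | "medium" | "high"
    summary: str

def triage(ticket_text: str, max_attempts: int = 3) -> TicketTriage:
    prompt = (
        "Return ONLY a JSON object with keys 'severity' and 'summary'.\n"
        f"Ticket:\n{ticket_text}"
    )
    for _ in range(max_attempts):
        raw = complete(prompt) or ""  # any provider behind the gateway
        try:
            # Validates shape regardless of which model produced the text;
            # this is what decouples parsers from one provider's output style.
            return TicketTriage.model_validate_json(raw)
        except ValidationError:
            continue  # bounded retry on parse failure
    raise RuntimeError(f"no schema-valid output after {max_attempts} attempts")
```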
The Lock-in Tax Calculator
Our framework: a five-question rubric that estimates your annual AI vendor lock-in tax. Score each question from 0 (the low-risk answer) up to its maximum weight (the high-risk answer), sum the answers, and multiply by roughly $25K to get an annual unbudgeted-cost figure for a mid-size enterprise running production AI workloads.
| Question | High-risk | Low-risk | Max |
|---|---|---|---|
| Are tool definitions written for one provider? | Yes, OpenAI-only function calling | No, generic JSON-schema with adapter | 5 |
| Are prompts hand-tuned per model family? | Yes, separate Anthropic vs OpenAI prompts | No, model-agnostic instruction format | 4 |
| Is data residency tied to one cloud IAM? | Yes, Vertex / Bedrock / Azure-only | No, neutral object store | 5 |
| Is the eval harness parameterised by provider? | No, manual provider-specific runs | Yes, one-flag provider switch | 4 |
| Are downstream parsers tuned to one output style? | Yes, format-coupled JSON parsers | No, schema-validated with retries | 4 |
Reading the score (out of 22):

- 0-5: low lock-in tax, ~$0-125K/year hidden cost
- 6-12: manageable, ~$150K-300K/year hidden cost
- 13-18: material, ~$325K-450K/year hidden cost (the typical band)
- 19-22: severe, ~$475K+/year hidden cost

Example: a typical enterprise scoring 18 pays roughly $450,000/year in deferred switching costs that show up as unplanned engineering work, eval reruns, and prompt re-tuning.
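The rubric is small enough to run as code. A sketch, assuming the question weights and the ~$25K-per-point multiplier above; the question keys and example answers are hypothetical.

```python
# Weights from the rubric; score each answer 0 (low-risk) .. weight (high-risk).
WEIGHTS = {
    "provider_specific_tools": 5,
    "per_model_prompts": 4,
    "cloud_iam_residency": 5,
    "no_parametric_evals": 4,
    "format_coupled_parsers": 4,
}
DOLLARS_PER_POINT = 25_000  # rough figure for a mid-size enterprise

def lockin_tax(answers: dict[str, int]) -> tuple[int, int]:
    """Return (score out of 22, estimated annual lock-in tax in dollars)."""
    score = sum(min(max(answers.get(q, 0), 0), w) for q, w in WEIGHTS.items())
    return score, score * DOLLARS_PER_POINT

# The "typical enterprise" from the bands above:
score, tax = lockin_tax({
    "provider_specific_tools": 5,
    "per_model_prompts": 3,
    "cloud_iam_residency": 4,
    "no_parametric_evals": 3,
    "format_coupled_parsers": 3,
})
print(score, tax)  # 18 450000
```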
What to do this quarter
- Run the Lock-in Tax Calculator on your own stack. Five questions, one number. Use it as the baseline you measure progress against quarter-on-quarter.
- Put an OpenAI-compatible gateway in front of every provider. Highest single-action lock-in reduction. One engineer-month drops 4 of 7 dimensions.
- Externalise prompts, tools, and evals as portable artifacts. Markdown for prompts, generic JSON-schema for tools, parameter-driven eval harness. Avoid vendor-templating layers.
- Measure tokenizer drift before, during, and after every model launch. The Anthropic 4.6 to 4.7 tokenizer change added 35% to effective cost without any list-price change. Treat this as a budget event; a measurement sketch follows this list.
- Maintain a tested fallback provider for every production route. If you cannot serve traffic on a second provider with one config change, you have lock-in regardless of contract.
- Audit data egress and IAM annually. Cloud IAM coupling silently grows. Once a year, attempt a clean export — it will tell you exactly what is sticky.
- Write DPAs and compliance controls to be vendor-agnostic. The cheapest compliance re-cert is the one you do not have to do because the controls were portable from day one.
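For the tokenizer-drift measurement above, you do not need the vendor's tokenizer locally: every OpenAI-compatible response reports billed prompt tokens in its `usage` field. A sketch that replays a frozen corpus against two model versions and compares the bill; the corpus, endpoint, and model names are illustrative stand-ins.

```python
import os
from openai import OpenAI

client = OpenAI(base_url="https://gateway.internal/v1",  # illustrative endpoint
                api_key=os.environ["LLM_API_KEY"])

# A frozen, representative sample of production prompts.
CORPUS = ["example production prompt one", "example production prompt two"]

def billed_prompt_tokens(model: str) -> int:
    """Total prompt tokens the provider bills for the frozen corpus."""
    total = 0
    for text in CORPUS:
        resp = client.chat.completions.create(
            model=model,
            max_tokens=1,  # we only care about prompt-side billing
            messages=[{"role": "user", "content": text}],
        )
        total += resp.usage.prompt_tokens
    return total

old = billed_prompt_tokens("claude-old")  # previous model version
new = billed_prompt_tokens("claude-new")  # newly launched version
print(f"tokenizer drift: {100 * (new - old) / old:+.1f}%")  # treat as a budget event
```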
Related reading
- AI Vendor Lock-in Leaderboard — full vendor-by-vendor scorecard
- Model Exit-Cost Audit Framework 2026 — operational deep-dive
- Avoid AI Vendor Lock-In Enterprise Guide
- Multi-Model AI Strategy
Teams running provider-agnostic AI infra typically front their providers through Swfte Connect for a single OpenAI-compatible endpoint and explicit per-route fallback, eliminating the four heaviest of the seven lock-in dimensions with a single integration.
Sources: vendor scorecard from Swfte Connect telemetry on representative enterprise stacks; Lock-in Tax dollar estimates from procurement-team interviews at mid-size SaaS firms running $500K-$5M/year of LLM spend, May 2026.