Bookmark This Page — A living reference for AI terminology, updated regularly as the field evolves. Use Ctrl+F / Cmd+F to search for specific terms.
A
Agentic AI
AI systems capable of autonomous decision-making, planning, and action execution. Unlike reactive AI that responds to prompts, agentic AI proactively pursues goals, uses tools, and adapts its approach based on results. See also: AI Agent.
AI Agent
Autonomous software that perceives its environment, reasons about what to do, and takes actions to achieve a goal without step-by-step human instructions. Agents operate in a loop — observe, plan, act, evaluate — and can use tools, call APIs, and coordinate with other agents. Learn more about AI agents →
AI Alignment
The practice of ensuring AI systems behave in ways consistent with human intentions, values, and goals. Alignment research focuses on making AI systems helpful, harmless, and honest.
AI Hallucination
When an AI model generates confident but factually incorrect or fabricated information. Hallucinations occur because language models predict plausible text rather than verified facts. Mitigated through grounding, retrieval-augmented generation, and output validation.
AI Orchestration
The coordination of multiple AI models, agents, or services to accomplish complex tasks. An orchestration layer routes requests, manages state, handles errors, and ensures components work together. Explore Swfte's orchestration platform →
Attention Mechanism
A neural network technique that lets models focus on relevant parts of the input when producing output. The foundation of the Transformer architecture. Self-attention allows each token to attend to every other token in the sequence.
AutoML (Automated Machine Learning)
Tools and techniques that automate the process of building machine learning models — including feature engineering, model selection, and hyperparameter tuning.
B
Batch Processing
Processing multiple inputs together rather than one at a time. In AI, batch inference processes many requests simultaneously for higher throughput at the cost of latency.
Benchmark
A standardized test used to evaluate AI model performance. Common benchmarks include MMLU (knowledge), HumanEval (coding), GSM8K (math), and MT-Bench (conversation).
BERT (Bidirectional Encoder Representations from Transformers)
A foundational language model from Google (2018) that processes text bidirectionally. While largely superseded by GPT-style models for generation, BERT-style models remain widely used for classification, search, and embedding tasks.
Bias (in AI)
Systematic errors in AI outputs that reflect skewed training data or flawed model design. Can manifest as unfair treatment of demographic groups, overrepresentation of certain viewpoints, or factual inaccuracies.
C
Chain-of-Thought (CoT)
A prompting technique where the AI model is instructed to show its reasoning step-by-step before giving a final answer. CoT significantly improves performance on complex reasoning tasks like math, logic, and multi-step problem-solving.
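In practice, CoT is often just a prompt template that asks the model to reason before answering. A minimal sketch (the template wording and `Answer:` convention are illustrative, not a standard):

```python
def with_chain_of_thought(question: str) -> str:
    """Wrap a question in a simple chain-of-thought prompt template."""
    return (
        "Answer the question below. Think step by step, showing your "
        "reasoning, then give the final answer on a line starting with "
        "'Answer:'.\n\n"
        f"Question: {question}"
    )

prompt = with_chain_of_thought(
    "A train travels 120 km in 2 hours. What is its average speed?"
)
print(prompt)
```

The wrapped prompt is then sent to the model in place of the bare question.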
Chatbot
A conversational AI interface that responds to user messages. Unlike AI agents, chatbots typically cannot take autonomous actions, use tools, or execute multi-step plans. See AI agents vs chatbots →
Classification
A machine learning task where the model assigns input to predefined categories. Examples: spam detection (spam/not spam), sentiment analysis (positive/negative/neutral), image recognition (cat/dog/bird).
Computer Vision
AI that processes and understands visual information — images, video, and visual data. Applications include object detection, image classification, facial recognition, and medical imaging analysis.
Constitutional AI
An approach to AI alignment where the model is trained with a set of principles (a "constitution") that guide its behavior. The model learns to evaluate and revise its own outputs against these principles.
Context Window
The maximum amount of text an LLM can process in a single interaction, measured in tokens. Larger context windows (100K-1M+ tokens) enable processing of longer documents and maintaining longer conversations. See also: Token.
Copilot
An AI assistant that works alongside a human, suggesting actions but not executing them autonomously. Positioned between chatbots (passive) and agents (autonomous). Examples: GitHub Copilot (code suggestions), Microsoft 365 Copilot.
D
Data Augmentation
Techniques to artificially expand training datasets by creating modified versions of existing data. In NLP: paraphrasing, back-translation, synonym replacement. In computer vision: rotation, cropping, color adjustment.
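Synonym replacement, the simplest NLP variant, can be sketched in a few lines. The synonym table here is a toy stand-in; real pipelines use thesauri or model-based paraphrasing:

```python
import random

# Toy synonym table (illustrative only).
SYNONYMS = {"quick": ["fast", "rapid"], "happy": ["glad", "pleased"]}

def synonym_replace(sentence: str, rng: random.Random) -> str:
    """Replace each word that has a known synonym with a random alternative."""
    words = sentence.split()
    out = [rng.choice(SYNONYMS[w]) if w in SYNONYMS else w for w in words]
    return " ".join(out)

rng = random.Random(0)  # seeded for reproducibility
print(synonym_replace("the quick dog looks happy", rng))
```

Each pass over the dataset yields a slightly different copy, expanding the effective training set.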
Deep Learning
A subset of machine learning using neural networks with multiple layers (deep architectures). Deep learning excels at learning complex patterns from large datasets and powers most modern AI capabilities including language models and computer vision.
Diffusion Model
A generative AI model that creates data (typically images) by learning to reverse a gradual noising process. Models like Stable Diffusion, DALL-E, and Midjourney use this approach to generate images from text prompts.
Distillation (Knowledge Distillation)
Training a smaller, faster model (the "student") to replicate the behavior of a larger model (the "teacher"). Distillation preserves most of the larger model's capabilities at a fraction of the compute cost.
E
E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness)
Google's framework for evaluating content quality. AI-generated content that demonstrates real expertise, authoritative sources, and trustworthy information tends to rank better in search results.
Embedding
A numerical vector representation of data (text, images, audio) that captures semantic meaning. Similar items have similar embeddings, enabling semantic search, clustering, and recommendation systems. See also: Vector Database.
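"Similar items have similar embeddings" is usually measured with cosine similarity. A sketch with toy 3-dimensional vectors (real embeddings typically have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: semantically related words sit close together.
king = [0.9, 0.8, 0.1]
queen = [0.85, 0.82, 0.12]
banana = [0.1, 0.05, 0.9]

print(cosine_similarity(king, queen) > cosine_similarity(king, banana))  # True
```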
Encoder-Decoder
A neural network architecture with two parts: an encoder that processes input into a compressed representation, and a decoder that generates output from that representation. Used in translation, summarization, and sequence-to-sequence tasks.
Evaluation (Evals)
The process of measuring AI model performance against specific criteria. Evals can be automated (benchmark scores) or human (quality ratings). Critical for comparing models, tracking improvements, and catching regressions.
F
Few-Shot Learning
Teaching an AI model to perform a task by providing a small number of examples (typically 2-10) in the prompt. The model generalizes from these examples to handle new inputs. Contrast with: Zero-Shot Learning.
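A few-shot prompt is just labeled examples concatenated ahead of the new input. A minimal builder (the `Input:`/`Label:` format is one common convention, not a requirement):

```python
def few_shot_prompt(examples, new_input):
    """Build a few-shot prompt: labeled examples followed by the new input."""
    lines = [f"Input: {x}\nLabel: {y}" for x, y in examples]
    lines.append(f"Input: {new_input}\nLabel:")  # model completes the label
    return "\n\n".join(lines)

examples = [("I loved it", "positive"), ("Terrible service", "negative")]
print(few_shot_prompt(examples, "Pretty good overall"))
```

The trailing `Label:` cues the model to continue the established pattern.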
Fine-Tuning
Adapting a pre-trained model to a specific task or domain by training it on additional data. Fine-tuning is more efficient than training from scratch because the model already understands general language patterns.
Foundation Model
A large AI model trained on broad data that can be adapted to many downstream tasks. GPT-4, Claude, Gemini, and Llama are foundation models. They serve as the "foundation" for chatbots, agents, and specialized applications.
Function Calling (Tool Use)
The ability of an LLM to generate structured output that invokes external functions or APIs. The model decides which tool to call and with what parameters, enabling AI agents to interact with external systems. See also: Tool Use.
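The core loop is: the model emits a structured call, and application code dispatches it. A sketch in which `get_weather` and the JSON shape are hypothetical; real provider APIs define their own schemas:

```python
import json

# Hypothetical tool; names and parameters are illustrative only.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

# Suppose the model emitted this structured output instead of free text:
model_output = '{"tool": "get_weather", "arguments": {"city": "Oslo"}}'

# The application parses the call and dispatches to the matching function.
call = json.loads(model_output)
result = TOOLS[call["tool"]](**call["arguments"])
print(result)
```

The tool's return value is typically fed back to the model so it can compose a final answer.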
G
Generative AI (GenAI)
AI that creates new content — text, images, audio, video, code — rather than analyzing existing data. Powered by foundation models trained on large datasets. Includes LLMs, diffusion models, and other generative architectures.
GPT (Generative Pre-trained Transformer)
A family of language models from OpenAI based on the Transformer architecture. "Generative" (creates text), "Pre-trained" (trained on large data before fine-tuning), "Transformer" (the underlying architecture). GPT-4, GPT-4o, and GPT-5 are successive versions.
Grounding
Connecting AI outputs to verified information sources (databases, documents, APIs) to reduce hallucination. A grounded response cites its sources and can be verified. See also: RAG.
Guardrails
Safety mechanisms that constrain AI behavior — input validation, output filtering, action restrictions, and escalation rules. Essential for production AI agents that take autonomous actions.
H
Hallucination
See AI Hallucination.
Human-in-the-Loop (HITL)
A system design where humans review, approve, or correct AI decisions at critical points. HITL ensures safety for high-stakes actions while allowing AI to handle routine work autonomously.
Hyperparameter
A configuration setting that controls the training process (learning rate, batch size, number of layers) rather than being learned from data. Hyperparameter tuning optimizes model performance.
I
Inference
The process of running a trained model on new input to generate predictions or outputs. Inference costs (compute, latency, energy) are a major factor in AI deployment economics.
Instruction Tuning
Training a language model to follow human instructions more reliably. Instruction-tuned models understand prompts like "summarize this document" or "write a Python function" better than base models.
Intent Recognition
Identifying what a user wants to accomplish from their natural language input. A critical component of conversational AI and AI agents that determines how to handle each request.
J
JSON Mode
A model output format where the LLM is constrained to produce valid JSON. Useful for structured data extraction, API responses, and tool calling where the output must be machine-parseable.
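Even in JSON mode, application code usually defends against formatting quirks such as a markdown code fence around the payload. A minimal parsing sketch (requires Python 3.9+ for `removeprefix`):

```python
import json

def parse_model_json(raw: str):
    """Parse model output as JSON, stripping a markdown code fence if present."""
    text = raw.strip()
    if text.startswith("```"):
        text = text.strip("`").removeprefix("json").strip()
    return json.loads(text)

print(parse_model_json('```json\n{"name": "Ada", "age": 36}\n```'))
```

A production version would also handle truncated output and retry with an error message.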
K
Knowledge Base
A structured collection of information that an AI system can search and reference. In AI agent architectures, the knowledge base provides factual grounding for responses and decisions.
Knowledge Graph
A data structure representing entities (people, places, concepts) and their relationships as a network of nodes and edges. Used to give AI systems structured world knowledge.
L
Large Language Model (LLM)
A neural network trained on massive text datasets that can understand and generate human language. LLMs are the "brain" of modern AI agents and chatbots. Examples: GPT-5, Claude Opus, Gemini 2.5, Llama 4.
Latency
The time delay between sending a request and receiving a response. In AI systems, latency includes network time, inference time, and any tool execution time. Critical for real-time applications.
LoRA (Low-Rank Adaptation)
An efficient fine-tuning technique that adapts a model by training small, low-rank matrices rather than modifying all model weights. LoRA reduces fine-tuning cost and storage while maintaining quality.
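The savings come from simple arithmetic: a full weight matrix has d_out × d_in trainable values, while LoRA trains only two low-rank matrices, B (d_out × r) and A (r × d_in). A quick illustration:

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one LoRA adapter: A (r x d_in) + B (d_out x r)."""
    return rank * d_in + d_out * rank

full = 4096 * 4096                               # full fine-tuning of one layer
lora = lora_trainable_params(4096, 4096, rank=8)  # LoRA with rank 8
print(full, lora, full // lora)  # 256x fewer trainable parameters
```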
LLM Routing
Automatically selecting the optimal language model for each request based on task complexity, cost, latency requirements, and model capabilities. Intelligent routing can substantially reduce costs while maintaining quality, since cheaper models handle the simple requests.
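A router can be as simple as a heuristic over the request. A sketch with hypothetical model names and thresholds (production routers typically use a classifier rather than keywords):

```python
def route(prompt: str) -> str:
    """Pick a model tier from a crude complexity heuristic (length and keywords)."""
    hard_keywords = ("prove", "derive", "multi-step", "analyze")
    if len(prompt) > 2000 or any(k in prompt.lower() for k in hard_keywords):
        return "large-reasoning-model"   # hypothetical expensive tier
    return "small-fast-model"            # hypothetical cheap tier

print(route("Translate 'hello' to French"))
print(route("Prove that the algorithm terminates"))
```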
M
MCP (Model Context Protocol)
An open protocol enabling AI agents to connect to external tools and data sources through a standardized interface. MCP provides a universal way for agents to interact with different systems without custom integrations for each.
Memory (Agent Memory)
The ability of an AI agent to retain and recall information across interactions. Short-term memory covers the current session; long-term memory persists across sessions using databases or vector stores.
Mixture of Experts (MoE)
A model architecture where different specialized sub-networks ("experts") handle different types of inputs. A gating network routes each input to the most relevant experts, enabling larger models with lower inference costs.
Multi-Agent System
An architecture where multiple specialized AI agents collaborate to accomplish complex tasks. A coordinator routes work to domain experts, each of which handles its specialty. Learn more →
Multi-Modal AI
AI that processes and generates multiple types of data — text, images, audio, video — within a single model. Multi-modal models can describe images, transcribe audio, and generate visual content.
N
Natural Language Processing (NLP)
The field of AI focused on enabling computers to understand, interpret, and generate human language. Encompasses tasks like translation, sentiment analysis, summarization, and question answering.
Neural Network
A computing system inspired by biological neural networks, consisting of interconnected nodes (neurons) organized in layers. Neural networks learn patterns from data by adjusting connection weights during training.
No-Code AI
Platforms that enable building AI applications — including agents, workflows, and automations — using visual interfaces without writing code. Compare no-code vs low-code AI →
O
Open Source AI
AI models and tools released under open licenses, allowing anyone to use, modify, and distribute them. Examples: Llama (Meta), Mistral, and Stable Diffusion. Contrast with proprietary models like GPT-5 and Claude.
Output Parsing
Extracting structured information from an LLM's natural language output. Output parsers convert free-text responses into data formats (JSON, lists, typed objects) that downstream systems can process.
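A common parsing pattern is pulling the first JSON object out of an otherwise conversational reply. A minimal sketch:

```python
import json
import re

def extract_json(text: str):
    """Pull the first JSON object out of a free-text model response."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

reply = 'Sure! Here is the record: {"city": "Oslo", "temp_c": 7} Hope that helps.'
print(extract_json(reply))
```

The greedy regex assumes a single JSON object per reply; nested or multiple objects need a proper parser.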
P
Parameter
A value the model learns during training. Modern LLMs have billions to trillions of parameters. More parameters generally enable more complex reasoning but increase inference costs.
Prompt Engineering
The practice of designing effective inputs (prompts) to get optimal outputs from AI models. Techniques include few-shot examples, chain-of-thought reasoning, system messages, and structured instructions.
Prompt Injection
A security vulnerability where malicious input manipulates an AI system into ignoring its instructions or performing unintended actions. Mitigated through input validation, output filtering, and architectural safeguards.
R
RAG (Retrieval-Augmented Generation)
A technique that improves LLM accuracy by retrieving relevant information from external sources before generating a response. The model's response is grounded in retrieved facts rather than relying solely on training data.
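The retrieve-then-generate flow can be sketched end to end. Here the retriever is a toy word-overlap ranker and `generate` is a stand-in for the LLM call; a real system would use embeddings and an actual model:

```python
DOCS = [
    "The Eiffel Tower is 330 metres tall.",
    "Python was created by Guido van Rossum.",
    "The Transformer architecture was introduced in 2017.",
]

def retrieve(query: str, k: int = 1):
    """Rank documents by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    scored = sorted(DOCS, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def generate(query: str, context: list) -> str:
    """Stand-in for an LLM call: the answer is grounded in retrieved context."""
    return f"Based on: {context[0]}"

context = retrieve("Who created Python?")
print(generate("Who created Python?", context))
```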
Reasoning (AI Reasoning)
The ability of an AI model to think through problems logically, draw inferences, and arrive at conclusions. Advanced reasoning models use techniques like chain-of-thought and tree-of-thought to solve complex problems.
Reinforcement Learning from Human Feedback (RLHF)
A training technique where human evaluators rate model outputs, and this feedback trains a reward model that guides further model optimization. RLHF is key to making models helpful, harmless, and honest.
Retrieval
The process of finding relevant documents or data from a large collection. In RAG systems, retrieval uses vector similarity search or keyword matching to find information the LLM can use.
S
Semantic Search
Search that understands meaning rather than just matching keywords. Uses embeddings to find documents that are conceptually similar to the query, even if they use different words.
Structured Output
LLM responses in a predefined format (JSON, XML, typed objects) rather than free-form text. Critical for AI agents that need machine-readable outputs to interact with external systems.
Supervised Learning
A training approach where the model learns from labeled examples — input-output pairs where the correct answer is provided. The model learns to predict the output for new, unseen inputs.
Synthetic Data
Artificially generated data used for training AI models when real data is insufficient, expensive, or privacy-sensitive. Modern AI models can generate training data for other models.
T
Temperature
A parameter that controls randomness in LLM outputs. Temperature 0 produces the most deterministic (consistent) outputs; higher temperatures increase creativity and variability. Typical range: 0-2.
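Under the hood, temperature divides the model's logits before the softmax: lower values sharpen the distribution toward the top token. A sketch (in practice, temperature 0 is implemented as a plain argmax to avoid dividing by zero):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the output."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 1.0))  # moderate spread
print(softmax_with_temperature(logits, 0.1))  # nearly all mass on the top token
```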
Token
The basic unit of text that LLMs process. Roughly 1 token ≈ 4 characters or 0.75 words in English. Models have context window limits measured in tokens (e.g., 128K tokens). Pricing is typically per million tokens.
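The ~4 characters per token rule of thumb gives a quick back-of-envelope estimate for context fit and cost. A sketch (real tokenizers vary by model; the price is a hypothetical per-million-token rate):

```python
def estimate_tokens(text: str) -> int:
    """Rough English token estimate using the ~4 characters per token heuristic."""
    return max(1, round(len(text) / 4))

def estimate_cost_usd(text: str, price_per_million: float) -> float:
    """Approximate input cost at a given per-million-token price."""
    return estimate_tokens(text) / 1_000_000 * price_per_million

sample = "Tokens are the basic unit of text that LLMs process."
print(estimate_tokens(sample))
```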
Tool Use
The ability of an AI agent to invoke external tools — APIs, databases, code interpreters, web browsers — to accomplish tasks. Tool use is what transforms an LLM from a text generator into an agent that can act in the real world. See also: Function Calling.
Transformer
The neural network architecture underlying all modern LLMs, introduced in the 2017 paper "Attention Is All You Need." Transformers process input in parallel using self-attention mechanisms, enabling efficient training on massive datasets.
Transfer Learning
Applying knowledge learned from one task to a different but related task. Foundation models embody transfer learning at scale — trained on general data, then adapted to specific tasks through fine-tuning or prompting.
U
Unsupervised Learning
A training approach where the model finds patterns in data without labeled examples. LLM pre-training is largely unsupervised — the model learns language patterns by predicting the next token in vast text datasets.
V
Vector Database
A database optimized for storing and searching high-dimensional vectors (embeddings). Used for semantic search, recommendation systems, and RAG implementations. Examples: Pinecone, Weaviate, Chroma, Qdrant.
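What a vector database does can be sketched as an in-memory store that ranks items by cosine similarity to a query vector. Real systems add approximate-nearest-neighbor indexes, filtering, and persistence:

```python
import math

class TinyVectorStore:
    """In-memory sketch: store (id, vector) pairs, return nearest by cosine."""

    def __init__(self):
        self.items = []  # list of (id, vector) pairs

    def add(self, item_id, vector):
        self.items.append((item_id, vector))

    def search(self, query, k=1):
        def sim(v):
            dot = sum(a * b for a, b in zip(query, v))
            return dot / (math.sqrt(sum(a * a for a in query))
                          * math.sqrt(sum(b * b for b in v)))
        ranked = sorted(self.items, key=lambda it: sim(it[1]), reverse=True)
        return [item_id for item_id, _ in ranked[:k]]

store = TinyVectorStore()
store.add("doc-cats", [0.9, 0.1])
store.add("doc-stocks", [0.1, 0.9])
print(store.search([0.85, 0.2]))  # ['doc-cats']
```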
Vision-Language Model (VLM)
A multi-modal AI model that processes both images and text. VLMs can describe images and answer questions about visual content; some can also generate images from text descriptions.
W
Weights
The numerical parameters in a neural network that determine how input data is transformed into output. Training adjusts weights to minimize prediction errors. Model "weights" and "parameters" are often used interchangeably.
Workflow Automation
Using AI agents to automate multi-step business processes that span multiple systems. Unlike traditional automation (rigid, rule-based), AI workflow automation adapts to edge cases and handles unstructured data.
Z
Zero-Shot Learning
The ability of an AI model to perform a task it was not explicitly trained for, using only its general knowledge and the task description. Example: asking an LLM to classify sentiment without providing examples. Contrast with: Few-Shot Learning.
Missing a term? This glossary is continuously updated. If you need a definition that is not here, contact us and we will add it.