Glossary

A reference of AI and machine learning terminology used throughout this documentation.

A

Agent : An AI system that can take actions, use tools, and work toward goals autonomously. Unlike simple chatbots, agents can execute multi-step tasks and interact with external systems.

Alignment : The practice of ensuring AI systems behave according to human intentions and values. An aligned AI does what you want, not just what you literally asked.

API (Application Programming Interface) : A way for programs to communicate with AI services. You send requests (prompts) and receive responses (completions).

B

Batch Processing : Processing multiple requests together rather than one at a time. Often more cost-effective for non-time-sensitive workloads.

C

Chain-of-Thought (CoT) : A prompting technique where the AI is asked to show its reasoning step-by-step, often improving accuracy on complex tasks.
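A minimal sketch of the technique: wrapping a task in an instruction that asks for step-by-step reasoning. The exact wording is illustrative; no particular phrase is required.

```python
def make_cot_prompt(task: str) -> str:
    """Append a chain-of-thought instruction to a task prompt."""
    return (
        f"{task}\n\n"
        "Think through this step by step, showing your reasoning "
        "before giving the final answer."
    )

prompt = make_cot_prompt(
    "A train travels 120 km in 1.5 hours. What is its average speed?"
)
```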

Chunking : Splitting documents into smaller pieces for processing or retrieval. Important in RAG systems where context windows are limited.
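A simple character-based chunker with overlap, sketched below; overlap ensures a sentence cut at one chunk boundary still appears whole in a neighboring chunk. Production systems often chunk by tokens or sentences instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap`."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```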

Completion : The text generated by an AI model in response to a prompt. Also called a "response" or "output."

Context Window : The maximum amount of text (measured in tokens) that a model can process at once. Includes both input and output.

Conversation History : The record of previous messages in a chat interaction. Models use this to maintain coherent multi-turn conversations.
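A widely used convention represents history as a list of role/content messages; the full list is resent with each request so the model can maintain context. The role names below follow a common chat format, not a requirement of any specific API.

```python
# Each turn is recorded with a role ("system", "user", or "assistant") and content.
history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."},
]

def add_turn(history: list[dict], role: str, content: str) -> None:
    """Append a new message to the conversation history."""
    history.append({"role": role, "content": content})
```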

E

Embedding : A numerical representation of text (or other data) as a vector. Embeddings capture semantic meaning, allowing similarity comparisons.
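Similarity between embeddings is commonly measured with cosine similarity. The toy three-dimensional vectors below stand in for real embedding model outputs, which typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embeddings of "cat", "dog", and "car".
cat = [0.9, 0.1, 0.0]
dog = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]
```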

Evaluation (Evals) : The process of measuring AI system quality. Includes metrics, test sets, and benchmarks.

F

Few-Shot Learning : Providing examples in the prompt to guide the model's output format or behavior. "Few-shot" = a few examples; "zero-shot" = no examples.
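The pattern can be sketched as assembling labelled input/output pairs ahead of the real query; the model infers the format from the examples. The "Input:"/"Output:" labels are one common convention among many.

```python
def few_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Build a few-shot prompt: example pairs first, then the open query."""
    lines = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

prompt = few_shot_prompt(
    [("great movie!", "positive"), ("waste of time", "negative")],
    "loved every minute",
)
```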

Fine-Tuning : Training a pre-trained model on additional data to specialize it for specific tasks or domains.

Function Calling : A capability where the model can request to call predefined functions, enabling it to take actions or retrieve information.
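On the application side, function calling amounts to dispatching the model's structured request to a registered function. The JSON shape below (`name` plus `arguments`) is illustrative; each API defines its own schema, and `get_weather` is a stub, not a real service.

```python
import json

# Hypothetical registry of functions the model is allowed to call.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",  # stub, not a real API
}

def handle_tool_call(model_request: str) -> str:
    """Parse a model's JSON tool-call request and invoke the matching function."""
    call = json.loads(model_request)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])
```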

G

Grounding : Connecting AI outputs to factual information or source documents. RAG is a grounding technique.

Guardrails : Constraints and safety measures that limit AI behavior. Includes input filters, output validation, and action restrictions.

H

Hallucination : When an AI generates plausible-sounding but factually incorrect information. A significant challenge in AI reliability.

Human-in-the-Loop (HITL) : System design where humans review, approve, or correct AI outputs before they take effect.

I

Inference : The process of generating outputs from a trained model. When you call an AI API, you're running inference.

Instruction Following : A model's ability to follow directions given in the prompt. Modern models are specifically trained for this capability.

J

Jailbreaking : Attempts to bypass an AI model's safety guidelines or restrictions, often through creative prompting.

L

Latency : The time between sending a request and receiving a response. Includes network time and model processing time.

LLM (Large Language Model) : An AI model trained on large amounts of text that can understand and generate human language.

M

MCP (Model Context Protocol) : An open protocol for connecting AI systems to external data sources and tools in a standardized way.

Model : The trained AI system that processes inputs and generates outputs. Different models have different capabilities and characteristics.

Multi-Modal : Models that can process multiple types of input (text, images, audio) rather than just text.

P

Parameter : In AI models, parameters are the learned values that determine model behavior. More parameters generally mean more capability (and cost).

Prompt : The input text you send to an AI model. Includes instructions, context, and any examples.

Prompt Engineering : The practice of crafting effective prompts to get desired outputs from AI models.

Prompt Injection : A security vulnerability where malicious inputs manipulate AI behavior by overriding original instructions.

R

RAG (Retrieval-Augmented Generation) : A technique that enhances AI responses by first retrieving relevant information and including it in the prompt.
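The retrieve-then-prompt flow can be sketched as below. Word overlap is used here as a stand-in for embedding similarity, purely to keep the example self-contained; real RAG systems rank with embeddings and a vector database.

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (a stand-in for
    embedding similarity) and return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_rag_prompt(query: str, docs: list[str]) -> str:
    """Insert the retrieved context into the prompt ahead of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Use only the context below to answer.\n\nContext:\n{context}\n\nQuestion: {query}"
```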

Rate Limit : Restrictions on how many requests you can make to an API in a given time period.

ReAct (Reasoning + Acting) : An agent architecture pattern where the AI alternates between reasoning about what to do and taking actions.

Retrieval : Finding relevant information from a dataset or knowledge base, typically using semantic search.

S

Semantic Search : Search based on meaning rather than exact keyword matching. Uses embeddings to find conceptually similar content.

Streaming : Receiving AI responses token-by-token as they're generated, rather than waiting for the complete response.
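From the client's perspective, a streaming response is just an iterator of chunks. The generator below fakes the server side so the consumption pattern is self-contained.

```python
from typing import Iterator

def fake_stream(text: str) -> Iterator[str]:
    """Stand-in for a streaming API: yields the response a few
    characters at a time instead of all at once."""
    for i in range(0, len(text), 4):
        yield text[i:i + 4]

received = []
for chunk in fake_stream("Streaming keeps UIs responsive."):
    received.append(chunk)  # a real client would render each chunk immediately
full = "".join(received)
```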

Structured Output : Constraining AI outputs to specific formats (like JSON) for reliable parsing.
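Even with format constraints, callers should validate before trusting the output. A sketch, assuming we asked the model for a JSON object with a `sentiment` field (the field name is illustrative):

```python
import json

def parse_structured(raw: str) -> dict:
    """Validate that a model response is the JSON object we asked for.
    Raises ValueError on malformed output so the caller can retry."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"model did not return valid JSON: {e}") from e
    if not isinstance(data, dict) or "sentiment" not in data:
        raise ValueError("missing required 'sentiment' field")
    return data
```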

System Prompt : Instructions that set the AI's behavior for an entire conversation. Defines role, constraints, and style.

T

Temperature : A parameter controlling randomness in AI outputs. Lower temperature = more deterministic; higher temperature = more varied/creative.
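The effect can be shown with a temperature-scaled softmax over raw scores: dividing by a low temperature sharpens the distribution toward the top score, while a high temperature flattens it.

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores to sampling probabilities at a given temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```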

Token : The basic unit of text processing for language models. Roughly corresponds to parts of words (typically 3-4 characters in English).
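That character ratio gives a quick budgeting heuristic. The sketch below is only a ballpark estimate for English text; real tokenizers (BPE and similar) give exact counts.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token in English."""
    return max(1, len(text) // 4)
```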

Tool Use : The capability for AI to invoke external functions or APIs to accomplish tasks beyond text generation.

Training : The process of creating an AI model by exposing it to large amounts of data.

V

Vector : A list of numbers representing text or other data. In AI, vectors (embeddings) capture semantic meaning.

Vector Database : A database optimized for storing and searching embeddings (vectors). Essential for RAG implementations.

Z

Zero-Shot : Prompting a model to perform a task without providing examples. The model relies on its training and instructions only.


Common Abbreviations

| Abbreviation | Meaning |
|--------------|---------|
| AI | Artificial Intelligence |
| API | Application Programming Interface |
| CoT | Chain-of-Thought |
| HITL | Human-in-the-Loop |
| LLM | Large Language Model |
| MCP | Model Context Protocol |
| ML | Machine Learning |
| NLP | Natural Language Processing |
| RAG | Retrieval-Augmented Generation |
| SSE | Server-Sent Events |