# Glossary
A reference of AI and machine learning terminology used throughout this documentation.
## A
Agent : An AI system that can take actions, use tools, and work toward goals autonomously. Unlike simple chatbots, agents can execute multi-step tasks and interact with external systems.
Alignment : The practice of ensuring AI systems behave according to human intentions and values. An aligned AI does what you want, not just what you literally asked.
API (Application Programming Interface) : A way for programs to communicate with AI services. You send requests (prompts) and receive responses (completions).
## B
Batch Processing : Processing multiple requests together rather than one at a time. Often more cost-effective for non-time-sensitive workloads.
## C
Chain-of-Thought (CoT) : A prompting technique where the AI is asked to show its reasoning step-by-step, often improving accuracy on complex tasks.
Chunking : Splitting documents into smaller pieces for processing or retrieval. Important in RAG systems where context windows are limited.
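Fixed-size chunking with overlap can be sketched in a few lines. The helper below (`chunk_text` is a hypothetical name, and the sizes are illustrative, not recommendations) splits text into overlapping character windows so content at chunk boundaries is not lost:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Step forward by less than a full chunk so adjacent chunks share text.
        start += chunk_size - overlap
    return chunks
```

Production systems often chunk by tokens, sentences, or document structure instead of raw characters, but the overlap idea is the same.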
Completion : The text generated by an AI model in response to a prompt. Also called a "response" or "output."
Context Window : The maximum amount of text (measured in tokens) that a model can process at once. Includes both input and output.
Conversation History : The record of previous messages in a chat interaction. Models use this to maintain coherent multi-turn conversations.
## E
Embedding : A numerical representation of text (or other data) as a vector. Embeddings capture semantic meaning, allowing similarity comparisons.
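Similarity between embeddings is commonly measured with cosine similarity. A minimal self-contained sketch using toy vectors (no embedding model involved):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Real embeddings have hundreds or thousands of dimensions, but the comparison works the same way.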
Evaluation (Evals) : The process of measuring AI system quality. Includes metrics, test sets, and benchmarks.
## F
Few-Shot Learning : Providing examples in the prompt to guide the model's output format or behavior. "Few-shot" = a few examples; "zero-shot" = no examples.
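A few-shot prompt is simply examples concatenated before the real query. A hypothetical helper sketching the pattern:

```python
def build_few_shot_prompt(
    instruction: str, examples: list[tuple[str, str]], query: str
) -> str:
    """Assemble a prompt that shows input/output examples before the real query."""
    lines = [instruction, ""]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")
    # The trailing "Output:" invites the model to continue the pattern.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)
```

With zero examples in the list, the same helper produces a zero-shot prompt.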
Fine-Tuning : Training a pre-trained model on additional data to specialize it for specific tasks or domains.
Function Calling : A capability where the model can request to call predefined functions, enabling it to take actions or retrieve information.
## G
Grounding : Connecting AI outputs to factual information or source documents. RAG is a grounding technique.
Guardrails : Constraints and safety measures that limit AI behavior. Includes input filters, output validation, and action restrictions.
## H
Hallucination : When an AI generates plausible-sounding but factually incorrect information. A significant challenge in AI reliability.
Human-in-the-Loop (HITL) : System design where humans review, approve, or correct AI outputs before they take effect.
## I
Inference : The process of generating outputs from a trained model. When you call an AI API, you're running inference.
Instruction Following : A model's ability to follow directions given in the prompt. Modern models are specifically trained for this capability.
## J
Jailbreaking : Attempts to bypass an AI model's safety guidelines or restrictions, often through creative prompting.
## L
Latency : The time between sending a request and receiving a response. Includes network time and model processing time.
LLM (Large Language Model) : An AI model trained on vast amounts of text to understand and generate human language.
## M
MCP (Model Context Protocol) : An open protocol for connecting AI systems to external data sources and tools in a standardized way.
Model : The trained AI system that processes inputs and generates outputs. Different models have different capabilities and characteristics.
Multi-Modal : A model that can process multiple types of input (text, images, audio) rather than text alone.
## P
Parameter : In AI models, parameters are the learned values that determine model behavior. More parameters generally mean more capability (and cost).
Prompt : The input text you send to an AI model. Includes instructions, context, and any examples.
Prompt Engineering : The practice of crafting effective prompts to get desired outputs from AI models.
Prompt Injection : A security vulnerability where malicious inputs manipulate AI behavior by overriding original instructions.
## R
RAG (Retrieval-Augmented Generation) : A technique that enhances AI responses by first retrieving relevant information and including it in the prompt.
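A toy end-to-end sketch of the retrieve-then-prompt shape. Real systems use semantic search over a vector database, but word-overlap retrieval is enough to show the idea (`retrieve` and `build_rag_prompt` are illustrative names):

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Retrieve relevant context and prepend it to the user's question."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The assembled prompt is then sent to the model like any other; the retrieval step is what grounds the answer.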
Rate Limit : Restrictions on how many requests you can make to an API in a given time period.
ReAct (Reasoning + Acting) : An agent architecture pattern where the AI alternates between reasoning about what to do and taking actions.
Retrieval : Finding relevant information from a dataset or knowledge base, typically using semantic search.
## S
Semantic Search : Search based on meaning rather than exact keyword matching. Uses embeddings to find conceptually similar content.
Streaming : Receiving AI responses token-by-token as they're generated, rather than waiting for the complete response.
Structured Output : Constraining AI outputs to specific formats (like JSON) for reliable parsing.
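A sketch of the consuming side, assuming the model was asked to reply with a JSON object containing known fields (the helper name and fields are hypothetical):

```python
import json

def parse_structured_output(raw: str, required_keys: set[str]) -> dict:
    """Parse a model's JSON reply and verify the expected fields are present."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    missing = required_keys - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data
```

Many APIs can enforce a JSON schema on the model side too; validating again in your own code guards against the cases they miss.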
System Prompt : Instructions that set the AI's behavior for an entire conversation. Defines role, constraints, and style.
## T
Temperature : A parameter controlling randomness in AI outputs. Lower temperature = more deterministic; higher temperature = more varied/creative.
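Temperature works by scaling the model's logits before they become a probability distribution. This standalone sketch shows the effect with a temperature-scaled softmax on toy logits:

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Lower temperature sharpens the distribution; higher temperature flattens it."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

At low temperature the top logit takes nearly all the probability mass (near-deterministic sampling); at high temperature the distribution approaches uniform.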
Token : The basic unit of text processing for language models. Roughly corresponds to parts of words (typically 3-4 characters in English).
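The characters-per-token figure gives a quick back-of-the-envelope estimate. This is a heuristic only; actual token counts vary by model and tokenizer, and the 4.0 default below is an assumption for English text:

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count; real tokenizers vary by model."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))
```

For billing or context-window budgeting, use the model's own tokenizer rather than an estimate.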
Tool Use : The capability for AI to invoke external functions or APIs to accomplish tasks beyond text generation.
Training : The process of creating an AI model by exposing it to large amounts of data.
## V
Vector : A list of numbers representing text or other data. In AI, vectors (embeddings) capture semantic meaning.
Vector Database : A database optimized for storing and searching embeddings (vectors). Essential for RAG implementations.
## Z
Zero-Shot : Prompting a model to perform a task without providing examples. The model relies on its training and instructions only.
## Common Abbreviations
| Abbreviation | Meaning |
|---|---|
| AI | Artificial Intelligence |
| API | Application Programming Interface |
| CoT | Chain-of-Thought |
| HITL | Human-in-the-Loop |
| LLM | Large Language Model |
| MCP | Model Context Protocol |
| ML | Machine Learning |
| NLP | Natural Language Processing |
| RAG | Retrieval-Augmented Generation |
| SSE | Server-Sent Events |