# Glossary
A reference of AI and machine learning terminology used throughout this documentation.
## A
Agent : An AI system that can take actions, use tools, and work toward goals autonomously. Unlike simple chatbots, agents can execute multi-step tasks and interact with external systems.
Alignment : The practice of ensuring AI systems behave according to human intentions and values. An aligned AI does what you want, not just what you literally asked.
API (Application Programming Interface) : A way for programs to communicate with AI services. You send requests (prompts) and receive responses (completions).
## B
Batch Processing : Processing multiple requests together rather than one at a time. Often more cost-effective for non-time-sensitive workloads.
## C
Chain-of-Thought (CoT) : A prompting technique where the AI is asked to show its reasoning step-by-step, often improving accuracy on complex tasks.
Chunking : Splitting documents into smaller pieces for processing or retrieval. Important in RAG systems where context windows are limited.
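Fixed-size chunking with overlap can be sketched in a few lines. The helper below (`chunk_text` is a hypothetical name, and the sizes are illustrative, not recommendations) splits text into overlapping character windows so content at chunk boundaries is not lost:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Step forward by less than a full chunk so adjacent chunks share text.
        start += chunk_size - overlap
    return chunks
```

Production systems often chunk by tokens, sentences, or document structure instead of raw characters, but the overlap idea is the same.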
Completion : The text generated by an AI model in response to a prompt. Also called a "response" or "output."
Context Window : The maximum amount of text (measured in tokens) that a model can process at once. Includes both input and output.
Conversation History : The record of previous messages in a chat interaction. Models use this to maintain coherent multi-turn conversations.
## E
Embedding : A numerical representation of text (or other data) as a vector. Embeddings capture semantic meaning, allowing similarity comparisons.
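Similarity between embeddings is commonly measured with cosine similarity. A minimal self-contained sketch using toy vectors (no embedding model involved):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Real embeddings have hundreds or thousands of dimensions, but the comparison works the same way.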
Evaluation (Evals) : The process of measuring AI system quality. Includes metrics, test sets, and benchmarks.
## F
Few-Shot Learning : Providing examples in the prompt to guide the model's output format or behavior. "Few-shot" = a few examples; "zero-shot" = no examples.
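A few-shot prompt is simply examples concatenated before the real query. A hypothetical helper sketching the pattern:

```python
def build_few_shot_prompt(
    instruction: str, examples: list[tuple[str, str]], query: str
) -> str:
    """Assemble a prompt that shows input/output examples before the real query."""
    lines = [instruction, ""]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")
    # The trailing "Output:" invites the model to continue the pattern.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)
```

With zero examples in the list, the same helper produces a zero-shot prompt.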
Fine-Tuning : Training a pre-trained model on additional data to specialize it for specific tasks or domains.
Function Calling : A capability where the model can request to call predefined functions, enabling it to take actions or retrieve information.
## G
Grounding : Connecting AI outputs to factual information or source documents. RAG is a grounding technique.
Guardrails : Constraints and safety measures that limit AI behavior. Includes input filters, output validation, and action restrictions.
## H
Hallucination : When an AI generates plausible-sounding but factually incorrect information. A significant challenge in AI reliability.
Human-in-the-Loop (HITL) : System design where humans review, approve, or correct AI outputs before they take effect.
## I
Inference : The process of generating outputs from a trained model. When you call an AI API, you're running inference.
Instruction Following : A model's ability to follow directions given in the prompt. Modern models are specifically trained for this capability.
## J
Jailbreaking : Attempts to bypass an AI model's safety guidelines or restrictions, often through creative prompting.
## L
Latency : The time between sending a request and receiving a response. Includes network time and model processing time.
LLM (Large Language Model) : An AI model trained on vast amounts of text to understand and generate human language.
## M
MCP (Model Context Protocol) : An open protocol for connecting AI systems to external data sources and tools in a standardized way.
Model : The trained AI system that processes inputs and generates outputs. Different models have different capabilities and characteristics.
Multi-Modal : A model that can process multiple types of input (text, images, audio) rather than text alone.
## P
Parameter : In AI models, parameters are the learned values that determine model behavior. More parameters generally mean more capability (and cost).
Prompt : The input text you send to an AI model. Includes instructions, context, and any examples.
Prompt Engineering : The practice of crafting effective prompts to get desired outputs from AI models.
Prompt Injection : A security vulnerability where malicious inputs manipulate AI behavior by overriding original instructions.
## R
RAG (Retrieval-Augmented Generation) : A technique that enhances AI responses by first retrieving relevant information and including it in the prompt.
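A toy end-to-end sketch of the retrieve-then-prompt shape. Real systems use semantic search over a vector database, but word-overlap retrieval is enough to show the idea (`retrieve` and `build_rag_prompt` are illustrative names):

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Retrieve relevant context and prepend it to the user's question."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The assembled prompt is then sent to the model like any other; the retrieval step is what grounds the answer.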
Rate Limit : Restrictions on how many requests you can make to an API in a given time period.
ReAct (Reasoning + Acting) : An agent architecture pattern where the AI alternates between reasoning about what to do and taking actions.
Retrieval : Finding relevant information from a dataset or knowledge base, typically using semantic search.
## S
Semantic Search : Search based on meaning rather than exact keyword matching. Uses embeddings to find conceptually similar content.
Streaming : Receiving AI responses token-by-token as they're generated, rather than waiting for the complete response.
Structured Output : Constraining AI outputs to specific formats (like JSON) for reliable parsing.
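A sketch of the consuming side, assuming the model was asked to reply with a JSON object containing known fields (the helper name and fields are hypothetical):

```python
import json

def parse_structured_output(raw: str, required_keys: set[str]) -> dict:
    """Parse a model's JSON reply and verify the expected fields are present."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    missing = required_keys - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data
```

Many APIs can enforce a JSON schema on the model side too; validating again in your own code guards against the cases they miss.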
System Prompt : Instructions that set the AI's behavior for an entire conversation. Defines role, constraints, and style.
## T
Temperature : A parameter controlling randomness in AI outputs. Lower temperature = more deterministic; higher temperature = more varied/creative.
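Temperature works by scaling the model's logits before they become a probability distribution. This standalone sketch shows the effect with a temperature-scaled softmax on toy logits:

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Lower temperature sharpens the distribution; higher temperature flattens it."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

At low temperature the top logit takes nearly all the probability mass (near-deterministic sampling); at high temperature the distribution approaches uniform.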
Token : The basic unit of text processing for language models. Roughly corresponds to parts of words (typically 3-4 characters in English).
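The characters-per-token figure gives a quick back-of-the-envelope estimate. This is a heuristic only; actual token counts vary by model and tokenizer, and the 4.0 default below is an assumption for English text:

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count; real tokenizers vary by model."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))
```

For billing or context-window budgeting, use the model's own tokenizer rather than an estimate.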
Tool Use : The capability for AI to invoke external functions or APIs to accomplish tasks beyond text generation.
Training : The process of creating an AI model by exposing it to large amounts of data.
## V
Vector : A list of numbers representing text or other data. In AI, vectors (embeddings) capture semantic meaning.
Vector Database : A database optimized for storing and searching embeddings (vectors). Essential for RAG implementations.
## Z
Zero-Shot : Prompting a model to perform a task without providing examples. The model relies on its training and instructions only.
## Common Abbreviations
| Abbreviation | Meaning |
|---|---|
| AI | Artificial Intelligence |
| API | Application Programming Interface |
| CoT | Chain-of-Thought |
| HITL | Human-in-the-Loop |
| LLM | Large Language Model |
| MCP | Model Context Protocol |
| ML | Machine Learning |
| NLP | Natural Language Processing |
| RAG | Retrieval-Augmented Generation |
| SSE | Server-Sent Events |