Shared Vocabulary

AI and Cloud Glossary

Use this glossary to align teams on technical language before architecture reviews, vendor evaluations, and implementation planning.

Key Takeaways

  • Use shared definitions early to prevent scope drift in AI and cloud projects.
  • Prioritize retrieval quality and evaluation before “more prompting”.
  • Standardize vocabulary across product, engineering, and security teams.

Core Terms

Defined Terms

Practical definitions focused on real project execution and production operations.

Agentic AI

AI systems that can plan, decide, and execute multi-step tasks using tools and memory instead of producing one-step text output.

Context Window

The maximum number of tokens an LLM can process in a single request, typically covering both the prompt and the generated output. Larger windows support longer documents and richer prompts.
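
As a rough illustration, a request can be checked against a window before it is sent. This is a minimal sketch: `estimate_tokens` and `fits_context` are hypothetical helpers, the whitespace word count is only a stand-in for a real tokenizer, and it assumes the prompt and the reserved output share one window.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: one token per whitespace-separated word.

    Assumption for illustration only; real tokenizers split text differently.
    """
    return len(text.split())


def fits_context(prompt: str, max_output_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the reserved output budget fits the window."""
    return estimate_tokens(prompt) + max_output_tokens <= context_window


if __name__ == "__main__":
    prompt = "Summarize the attached architecture review notes"
    print(fits_context(prompt, max_output_tokens=500, context_window=8000))
```

In practice the token count should come from the tokenizer that matches the model in use, since word counts can undershoot noticeably.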

Embedding

A numerical vector representation of text used for semantic search, retrieval, clustering, and similarity matching.
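
Similarity matching between embeddings is commonly done with cosine similarity. The sketch below uses tiny hand-written vectors as stand-ins; real embeddings come from a model and usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors, invented for illustration.
    invoice = [0.9, 0.1, 0.0]
    billing = [0.8, 0.2, 0.1]
    weather = [0.0, 0.1, 0.9]
    print(cosine_similarity(invoice, billing) > cosine_similarity(invoice, weather))
```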

Fine-Tuning

Additional model training on task or domain data to improve style consistency, behavior, and task accuracy.

Grounding

Constraining model responses to verified source context so answers are factual and traceable to known documents.

Inference

The process of running a trained model to generate predictions, completions, classifications, or decisions.

Kubernetes

An orchestration platform for deploying and scaling containerized applications, including AI APIs and model-serving stacks.

LangChain

A framework for building LLM applications with prompt chains, tool calling, retrieval integration, and agent workflows.

LLM

Large Language Model. A neural model trained on broad text corpora that can understand and generate natural language.

LoRA

Low-Rank Adaptation. A parameter-efficient fine-tuning technique that trains small low-rank update matrices while the base model weights stay frozen, adapting large models at lower compute and memory cost.
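
To see where the savings come from, compare trainable parameter counts for a single weight matrix. The sketch assumes the usual LoRA formulation, where a frozen d × k weight receives an update formed from a d × r matrix and an r × k matrix with a small rank r. The function name and the example sizes are illustrative.

```python
def lora_param_counts(d: int, k: int, r: int) -> tuple[int, int]:
    """Return (full fine-tuning params, LoRA params) for one d x k weight matrix."""
    full = d * k        # every entry of the weight matrix is trainable
    lora = r * (d + k)  # entries of the d x r and r x k update matrices
    return full, lora


if __name__ == "__main__":
    full, lora = lora_param_counts(d=4096, k=4096, r=8)
    print(full, lora)  # 16777216 65536
```

With these example sizes the low-rank update trains 256 times fewer parameters for that one matrix.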

MLOps

Practices that connect model development and operations: training pipelines, deployment, monitoring, rollback, and governance.

Observability

Telemetry and diagnostics for distributed systems, including logs, traces, and metrics used to debug model or service behavior.

Prompt Engineering

Systematic design of prompts, instructions, and constraints to improve model reliability and output quality.

RAG

Retrieval-Augmented Generation. A pattern where relevant documents are retrieved and injected into prompts before answer generation.
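
The retrieve-then-inject flow can be sketched in a few lines. Here keyword overlap stands in for real embedding-based retrieval, and `retrieve` and `build_prompt` are hypothetical helpers, not part of any framework.

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by shared words with the query (stand-in for vector search)."""
    query_terms = set(query.lower().split())
    scored = []
    for doc in documents:
        overlap = len(query_terms & set(doc.lower().split()))
        scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only the top_k documents that share at least one word with the query.
    return [doc for overlap, doc in scored[:top_k] if overlap > 0]


def build_prompt(query: str, documents: list[str], top_k: int = 2) -> str:
    """Inject the retrieved documents into the prompt ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    docs = [
        "Terraform provisions cloud resources",
        "Kubernetes schedules containers",
    ]
    print(build_prompt("what provisions cloud resources", docs, top_k=1))
```

The resulting prompt would then be sent to the model. Retrieval quality determines what the model sees, which is why it deserves evaluation of its own.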

Terraform

Infrastructure as Code tooling used to define, provision, and version cloud resources across providers.

Tool Calling

Capability allowing LLM-driven systems to invoke APIs, databases, and external services as part of response generation.
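
On the application side, tool calling often reduces to parsing a structured request from the model and dispatching it to a registered function. The sketch below assumes the model emits JSON with `name` and `arguments` fields. That shape, the `TOOLS` registry, and `get_order_status` are all illustrative, not a specific vendor's API.

```python
import json


def get_order_status(order_id: str) -> str:
    """Hypothetical tool; a real one would query an order system."""
    return f"Order {order_id} is shipped"


# Registry mapping tool names the model may request to Python callables.
TOOLS = {"get_order_status": get_order_status}


def dispatch_tool_call(raw_call: str) -> str:
    """Parse a JSON tool request and run the matching registered function."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


if __name__ == "__main__":
    request = '{"name": "get_order_status", "arguments": {"order_id": "A1"}}'
    print(dispatch_tool_call(request))  # Order A1 is shipped
```

Rejecting unknown tool names keeps the model limited to an explicit allowlist, which matters for security review.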

Vector Database

A storage engine optimized for nearest-neighbor search on embeddings, often used in RAG and recommendation systems.
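
The core operation can be illustrated with an exact, brute-force search over an in-memory list. `TinyVectorStore` is a hypothetical teaching class; production vector databases typically rely on approximate indexes to avoid scanning every vector.

```python
import math


class TinyVectorStore:
    """Keeps (id, vector) pairs and returns the ids closest to a query vector."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, item_id: str, vector: list[float]) -> None:
        self._items.append((item_id, list(vector)))

    def search(self, query: list[float], k: int = 1) -> list[str]:
        # Exact search: sort every stored vector by Euclidean distance to the query.
        ranked = sorted(self._items, key=lambda item: math.dist(query, item[1]))
        return [item_id for item_id, _ in ranked[:k]]


if __name__ == "__main__":
    store = TinyVectorStore()
    store.add("a", [0.0, 0.0])
    store.add("b", [1.0, 1.0])
    store.add("c", [5.0, 5.0])
    print(store.search([0.9, 0.9], k=2))  # ['b', 'a']
```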