Defined term

RAG (Retrieval-Augmented Generation)

Generation grounded in retrieved source documents rather than the model's parametric memory alone.

Retrieval-Augmented Generation is the pattern where a query is first used to retrieve relevant passages from a curated source (vector store, search index, database), and those passages are passed to the model as context for the answer. RAG reduces hallucination on factual queries, allows answers to cite sources, and lets the system stay current without retraining. Production RAG requires source curation, chunking strategy, embeddings, retrieval evaluation, and answer evaluation.
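As a rough illustration of the retrieve-then-generate flow described above, here is a minimal sketch. It assumes a toy in-memory corpus and a bag-of-words stand-in for a real embedding model; the names `PASSAGES`, `embed`, `retrieve`, and `build_prompt` are hypothetical, and the final call to a language model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; a real system would index curated, chunked documents.
PASSAGES = {
    "policy-12": "Refunds are available within 30 days of purchase with a receipt.",
    "policy-07": "Shipping is free on orders over 50 dollars.",
    "faq-03": "Password resets are sent to the account email address.",
}

def embed(text):
    """Toy stand-in for an embedding model: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, k=2):
    """Return the k passages most similar to the query as (id, text) pairs."""
    q = embed(query)
    ranked = sorted(
        PASSAGES.items(),
        key=lambda item: cosine(q, embed(item[1])),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, hits):
    """Pass the retrieved passages to the model as labeled context."""
    sources = "\n".join(f"[{doc_id}] {text}" for doc_id, text in hits)
    return (
        "Answer using only the sources below and cite their ids.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )

query = "Are refunds available after purchase?"
hits = retrieve(query, k=2)
prompt = build_prompt(query, hits)
print(prompt)
```

In a production setup the `embed` function would typically be replaced by a trained embedding model and `PASSAGES` by a vector store or search index; the prompt-assembly step stays roughly the same.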

When it matters

Use RAG when factual accuracy requires citing specific source material (policy, contracts, customer history). Skip RAG when the model's parametric knowledge is sufficient or when latency is critical, since the retrieval step adds a lookup before generation can start.

Real example

A support agent that retrieves the 5 most relevant past tickets, the customer's product config, and the relevant policy passages, then generates a grounded answer with inline citations the agent can verify in under 10 seconds.
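One way the source assembly in this example might look is sketched below. It assumes the tickets, product config, and policy passages have already been retrieved as plain strings; the id scheme (`T1`, `C1`, `P1`) and the function names `label_sources` and `unknown_citations` are hypothetical choices for illustration.

```python
import re

def label_sources(tickets, product_config, policy_passages, max_tickets=5):
    """Give every retrieved item a short id that the answer can cite inline."""
    sources = {}
    for i, text in enumerate(tickets[:max_tickets], start=1):
        sources[f"T{i}"] = text          # past tickets
    sources["C1"] = product_config       # customer's product config
    for i, text in enumerate(policy_passages, start=1):
        sources[f"P{i}"] = text          # policy passages
    return sources

def unknown_citations(answer, sources):
    """Return cited ids, such as P9 in "[P9]", that match no supplied source."""
    cited = set(re.findall(r"\[([A-Z]\d+)\]", answer))
    return sorted(cited - set(sources))

sources = label_sources(
    tickets=["Customer reported sync failure after update."],
    product_config="Plan: Team, region: EU, SSO enabled.",
    policy_passages=["Sync issues are escalated after 24 hours."],
)
answer = "This matches an earlier sync failure [T1] and should be escalated [P1]."
print(unknown_citations(answer, sources))
```

Checking citations against the supplied ids is a cheap guard: it catches citations to sources that were never retrieved, though it does not confirm that a cited passage actually supports the claim.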

KPIs to watch

Retrieval precision@5 (>0.75 target), answer groundedness rate (>90%), source citation completeness (100% on factual claims).
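These KPIs could be computed along the following lines. This is a sketch that assumes labeled evaluation data already exists (which retrieved items are relevant, which answers a reviewer judged grounded, which claims carry citations); the function names and data shapes are hypothetical, and the thresholds are the targets stated above.

```python
def precision_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved items that are labeled relevant.

    Divides by k, so returning fewer than k items lowers the score.
    """
    top_k = retrieved_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k

def groundedness_rate(judgments):
    """Share of answers a reviewer (or automated judge) marked as grounded."""
    return sum(judgments) / len(judgments) if judgments else 0.0

def citation_completeness(claims):
    """Share of factual claims that carry at least one citation."""
    if not claims:
        return 1.0
    return sum(1 for claim in claims if claim["citations"]) / len(claims)

def meets_targets(p_at_5, grounded, cited):
    """Compare against the targets: >0.75, >90%, and 100%."""
    return p_at_5 > 0.75 and grounded > 0.90 and cited == 1.0

# Hypothetical labels for a single evaluated query.
p5 = precision_at_k(["t1", "t2", "t3", "t4", "t5"], {"t1", "t2", "t4", "t5"})
grounded = groundedness_rate([True, True, True, False])
cited = citation_completeness([
    {"text": "Refund window is 30 days.", "citations": ["P1"]},
    {"text": "Customer is on the Team plan.", "citations": ["C1"]},
])
print(p5, grounded, cited, meets_targets(p5, grounded, cited))
```

In practice these would be averaged over a full evaluation set rather than a single query.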

Related terms

Embeddings

Numerical vectors that represent the meaning of a text, image, or other piece of content.

Vector store

A database optimized for similarity search over embeddings.

Grounding

Anchoring model output to verifiable source material to reduce hallucination.

AI-native PR stack

The instrumented tooling an AI-native PR team runs for media research, pitch drafting, monitoring, and measurement — with humans owning relationships and final messaging.
