← Glossary/Evaluation & quality

Defined term

Prompt versioning

Treating prompts as code: stored, diffed, reviewed, and rolled back like any production artifact.

Prompt versioning means every prompt that goes to production has a version identifier, a commit history, and a deployment record. When output quality changes, you can attribute it to a specific prompt change. Production teams pair this with evaluation harnesses that score new prompt versions against labelled test sets before promotion.

When it matters

When more than one person edits prompts, or when prompt changes correlate with production incidents. Without versioning, you cannot attribute quality regressions to a specific change.

Real example

A revenue-ops agent on v2.3.1 of the qualification prompt; reply rate drops 18% in week 12; team rolls back to v2.2.7 in under 1 hour, isolates the change, and reships v2.3.2 with the regression fixed.

KPIs to watch

Mean time to rollback (<30 min target), eval test pass rate per version (>95% before promotion), regression incidents per quarter (<1).

Related terms

See it in action

We use this every week

Book a 30-min call and we'll walk you through how Prompt versioning shows up in a real engagement we're running.

Book a 30-min call