← Glossary/Models & foundations

Defined term

Extended thinking

A model mode that performs longer internal reasoning before producing the answer.

Extended thinking lets the model allocate more inference compute to reasoning before responding. Most useful for hard problems where accuracy matters more than latency. Architecture decision: route only the hardest cases to extended thinking, keep routine traffic on fast paths.

When it matters

When the task requires deep reasoning where the model benefits from allocating more compute before responding. Use on the 5-10% of cases where quality matters more than latency.

Real example

A high-stakes underwriting decision routed through Claude with extended-thinking enabled. The model spends 30s reasoning internally (vs 3s standard), reviewing each policy clause, before recommending approve/decline. Accuracy: +18% on adversarial cases.

KPIs to watch

Quality lift on hard subset (target: >10pp), latency P95 on extended-thinking path (typically 20-60s), cost overhead (2-5× standard inference).

Related terms

See it in action

We use this every week

Book a 30-min call and we'll walk you through how Extended thinking shows up in a real engagement we're running.

Book a 30-min call