What is Query Rewriting?

Query Rewriting is the process of transforming a user's original question into one or more clearer, expanded, or retrieval-friendly queries before search.

How It Works

Query rewriting addresses a common RAG problem: users ask underspecified, conversational, or context-dependent questions, while retrieval systems need search-friendly terms. A rewrite step may expand acronyms, add missing entities from conversation history, split a complex request into subqueries, translate the query into the language of the indexed corpus, or normalize product names. The risk is semantic drift: an overactive rewrite can change the user's intent and retrieve confident but irrelevant evidence. Good systems log both the original and rewritten queries and evaluate whether rewrites improve grounded answers.
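
A rule-based rewrite step with logging could look roughly like the sketch below. It is a minimal illustration, not a reference implementation: the ACRONYMS dictionary, the RewriteResult type, and the function names are assumptions made for this example, and a real system would load its own domain dictionary.

```python
import logging
import re
from dataclasses import dataclass

logger = logging.getLogger("query_rewrite")

# Hypothetical domain dictionary, for illustration only.
ACRONYMS = {
    "RAG": "retrieval-augmented generation",
    "SSO": "single sign-on",
}


@dataclass(frozen=True)
class RewriteResult:
    original: str
    rewritten: str


def expand_acronyms(query: str, acronyms: dict[str, str] = ACRONYMS) -> str:
    """Append the long form after each known acronym, keeping the acronym itself."""

    def _expand(match: re.Match) -> str:
        token = match.group(0)
        long_form = acronyms.get(token)
        return f"{token} ({long_form})" if long_form else token

    # Only whole all-caps tokens of two or more letters are considered.
    return re.sub(r"\b[A-Z]{2,}\b", _expand, query)


def rewrite(query: str) -> RewriteResult:
    """Rewrite the query and log both versions so rewrites can be audited later."""
    result = RewriteResult(original=query, rewritten=expand_acronyms(query))
    logger.info("original=%r rewritten=%r", result.original, result.rewritten)
    return result
```

Keeping the acronym next to its expansion is a deliberate choice in this sketch: lexical retrievers can still match the short form, while the long form gives embedding-based retrievers more to work with.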

Key Characteristics

Improves retrieval by making implicit or ambiguous user intent more searchable
Can produce a single rewritten query or multiple focused subqueries
May use rules, an LLM, conversation history, metadata, or domain dictionaries
Carries semantic-drift risk if the rewrite changes the user's intent
Should be evaluated against retrieval quality and final answer grounding

Common Use Cases

Expanding acronyms before searching technical documentation
Resolving pronouns or follow-up questions using chat history
Splitting a multi-part user request into separate retrieval queries
Translating a query to match the language of the indexed corpus
Normalizing product names, API versions, or internal terminology
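
Two of the use cases above, splitting a multi-part request and normalizing terminology, can be sketched with plain rules. The ALIASES table and both function names are hypothetical, and splitting on question marks and semicolons is a deliberately naive assumption; an LLM-based splitter would handle requests joined by "and" more reliably.

```python
import re

# Hypothetical alias table: informal name (lowercase) -> canonical name used in the index.
ALIASES = {
    "k8s": "Kubernetes",
    "postgres": "PostgreSQL",
}


def normalize_terms(query: str, aliases: dict[str, str] = ALIASES) -> str:
    """Replace informal names with their canonical form, matching whole words only."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(alias) for alias in aliases) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: aliases[m.group(0).lower()], query)


def to_subqueries(query: str) -> list[str]:
    """Split on '?' and ';', drop empty parts, and normalize each remaining part."""
    parts = [part.strip() for part in re.split(r"[?;]+", query)]
    return [normalize_terms(part) for part in parts if part]
```

Each returned subquery would then be sent to the retriever separately, with the results merged before generation.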

Example
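
The sketch below shows one way to resolve a follow-up question using chat history. It assumes an LLM is available behind a plain callable named complete that takes a prompt string and returns text; that callable, the prompt wording, and the product name "Acme Gateway" are all hypothetical and stand in for whatever model client and data a real system uses.

```python
from typing import Callable, Sequence

# Hypothetical prompt wording; tune it for the model actually in use.
PROMPT_TEMPLATE = (
    "Rewrite the final user question as a standalone search query.\n"
    "Keep the user's intent and do not add facts that are not in the conversation.\n\n"
    "Conversation:\n{history}\n\n"
    "Final question: {question}\n"
    "Standalone query:"
)


def build_rewrite_prompt(history: Sequence[tuple[str, str]], question: str) -> str:
    """Render (role, text) turns and the follow-up question into one prompt."""
    lines = "\n".join(f"{role}: {text}" for role, text in history)
    return PROMPT_TEMPLATE.format(history=lines, question=question)


def rewrite_with_history(
    history: Sequence[tuple[str, str]],
    question: str,
    complete: Callable[[str], str],
) -> str:
    """Ask the model for a standalone query; fall back to the original if it returns nothing."""
    rewritten = complete(build_rewrite_prompt(history, question)).strip()
    return rewritten or question


if __name__ == "__main__":
    history = [
        ("user", "How do I rotate API keys in Acme Gateway?"),
        ("assistant", "Open the key settings page and choose Rotate."),
    ]
    # Stand-in for a real model call, so the example runs offline.
    fake_complete = lambda prompt: "How often should API keys be rotated in Acme Gateway?"
    print(rewrite_with_history(history, "How often should I do it?", fake_complete))
```

Falling back to the original question when the model returns an empty string keeps the retriever from receiving an empty query.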


Frequently Asked Questions

Is query rewriting the same as prompt rewriting?

No. Query rewriting targets retrieval quality, while prompt rewriting usually targets model instruction clarity or output behavior.

When should a RAG system rewrite queries?

Rewrite when the query is a follow-up question, contains acronyms or vague wording, targets a corpus in another language, or is complex enough to need multiple searches.

What is semantic drift in query rewriting?

Semantic drift happens when the rewritten query changes the user's intent, causing the retriever to fetch plausible but wrong evidence.
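
One crude guard against drift is to check how many of the original query's content terms survive in the rewrite and to fall back to the original when too few do. The sketch below is a lexical heuristic only; the stopword list, the 0.5 threshold, and the function names are assumptions for illustration, and comparing embeddings of the two queries is a common alternative.

```python
import re

# Small illustrative stopword list; not exhaustive.
STOPWORDS = {
    "a", "an", "the", "is", "are", "do", "does", "how", "what",
    "i", "my", "to", "of", "in", "for", "it",
}


def content_terms(text: str) -> set[str]:
    """Lowercase alphanumeric tokens with stopwords removed."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def retention(original: str, rewritten: str) -> float:
    """Fraction of the original's content terms that also appear in the rewrite."""
    original_terms = content_terms(original)
    if not original_terms:
        return 1.0  # nothing to lose, so nothing counts as drift
    return len(original_terms & content_terms(rewritten)) / len(original_terms)


def choose_query(original: str, rewritten: str, min_retention: float = 0.5) -> str:
    """Use the rewrite only if it keeps enough of the original's content terms."""
    return rewritten if retention(original, rewritten) >= min_retention else original
```

This check cannot catch a rewrite that keeps every original term but adds a misleading one, so it complements logging and offline evaluation rather than replacing them.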

How can query rewriting be evaluated?

Compare retrieval recall, context precision, and answer grounding with and without rewriting on a representative query set.
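
The retrieval-recall part of that comparison can be sketched as follows. The sketch assumes a labeled query set of (query, relevant document IDs) pairs and a retrieve callable that returns ranked document IDs; both are hypothetical stand-ins, and context precision and answer grounding would need their own measurements.

```python
from typing import Callable, Optional, Sequence


def recall_at_k(retrieved: Sequence[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant document IDs found in the top-k retrieved IDs."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall(
    labeled_queries: Sequence[tuple[str, set[str]]],
    retrieve: Callable[[str], Sequence[str]],
    k: int = 5,
    rewrite: Optional[Callable[[str], str]] = None,
) -> float:
    """Average recall@k over the query set, optionally rewriting each query first."""
    scores = []
    for query, relevant in labeled_queries:
        search_query = rewrite(query) if rewrite else query
        scores.append(recall_at_k(retrieve(search_query), relevant, k))
    return sum(scores) / len(scores) if scores else 0.0
```

Calling mean_recall once without rewrite and once with it, on the same query set and retriever, gives a before-and-after comparison for the rewrite step alone.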

Related Terms

RAG

RAG (Retrieval-Augmented Generation) is an AI architecture that enhances large language model outputs by retrieving relevant information from external knowledge bases before generating responses, combining the strengths of information retrieval systems with generative AI to produce more accurate, up-to-date, and verifiable answers.

Retriever

Retriever is a query-to-context component that receives a user or agent query and returns relevant documents, chunks, records, passages, or tool-readable context for downstream reasoning and generation.

Dense Retrieval

Dense Retrieval is a semantic search method that represents queries and documents as dense embedding vectors and retrieves results by vector similarity.

Sparse Retrieval

Sparse Retrieval is a lexical search method that represents queries and documents with sparse term-weight vectors and retrieves results by matching explicit terms.
