What is a Cross-Encoder?
A cross-encoder is a ranking model architecture that jointly encodes a query and a candidate document or passage to produce a relevance score.
How It Works
A cross-encoder feeds the query and passage through the model together, so attention can operate across both texts. This usually yields better relevance judgments than comparing independently computed embeddings, especially for nuanced questions, negation, constraints, and short passages. The cost is latency: because the score depends on both texts, nothing can be precomputed per document, and every query-passage pair must be scored at request time. For that reason, cross-encoders are commonly used as rerankers after a faster retriever such as BM25, dense retrieval, or hybrid search has produced a manageable candidate set.
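The retrieve-then-rerank shape described above can be sketched in a few lines. This is a minimal illustration, not a real model: `first_stage` is a simple term-overlap retriever standing in for BM25 or dense retrieval, and `toy_pair_scorer` is a hypothetical hand-written rule standing in for a trained cross-encoder. All function names and the sample corpus are assumptions made for the example.

```python
import re
from typing import Callable, List, Tuple

# Any function that looks at a query and a passage together and returns a score.
PairScorer = Callable[[str, str], float]


def tokens(text: str) -> List[str]:
    """Lowercase alphanumeric tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def first_stage(query: str, corpus: List[str], k: int) -> List[str]:
    """Cheap recall step: keep the k passages sharing the most terms with the query."""
    q_terms = set(tokens(query))
    ranked = sorted(
        corpus,
        key=lambda doc: len(q_terms & set(tokens(doc))),
        reverse=True,
    )
    return ranked[:k]


def toy_pair_scorer(query: str, passage: str) -> float:
    """Hypothetical stand-in for a trained cross-encoder (NOT a real model).

    It reads both texts together, so it can treat the word after "without"
    as an exclusion and penalize passages that mention it.
    """
    q = tokens(query)
    excluded = {q[i + 1] for i, term in enumerate(q[:-1]) if term == "without"}
    wanted = set(q) - excluded - {"without"}
    p_terms = set(tokens(passage))
    return float(len(wanted & p_terms) - 2 * len(excluded & p_terms))


def rerank(
    query: str, candidates: List[str], score_pair: PairScorer
) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair, then sort by descending score."""
    scored = [(doc, score_pair(query, doc)) for doc in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


corpus = [
    "Pasta recipes with cheese and tomato sauce.",
    "Pasta recipes with olive oil and garlic.",
    "Bread baking tips for beginners.",
    "Cheese tasting guide.",
]
query = "pasta recipes without cheese"

candidates = first_stage(query, corpus, k=3)           # fast, run over the whole corpus
reranked = rerank(query, candidates, toy_pair_scorer)  # slow, run over k pairs only

for passage, score in reranked:
    print(f"{score:+.1f}  {passage}")
```

In this toy setup the first stage ranks the cheese recipe highest because it shares the most terms with the query, while the pairwise scorer moves the olive oil recipe to the top because it honors the "without cheese" constraint. A trained cross-encoder learns such interactions from data instead of relying on a hand-written rule.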
Key Characteristics
- Jointly reads the query and candidate passage before scoring relevance
- More precise than bi-encoders for many reranking tasks
- Computationally expensive because each query-document pair is evaluated separately
- Useful for handling negation, constraints, and subtle wording differences
- Typically applied to top-k candidates rather than the entire corpus
Common Use Cases
- Reranking the top 50 dense-retrieval results before sending context to an LLM
- Improving RAG answers where first-stage retrieval returns noisy candidates
- Ranking policy passages against detailed compliance questions
- Scoring query-document pairs for retrieval evaluation
- Combining BM25 and vector candidates into a final ordered list
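The last use case, combining BM25 and vector candidates, can be sketched as follows. Raw BM25 scores and vector similarities are on different scales, so one option is to union the two candidate lists, drop duplicates, and let a single pairwise scorer produce the final order. The helper names, document IDs, and the `overlap_scorer` placeholder are hypothetical; in practice the placeholder would be replaced by a trained cross-encoder.

```python
from typing import Callable, Dict, List, Tuple

Candidate = Tuple[str, str]  # (doc_id, text)


def merge_candidates(*result_lists: List[Candidate]) -> List[Candidate]:
    """Union several candidate lists, keeping the first occurrence of each doc_id."""
    seen: Dict[str, str] = {}
    for results in result_lists:
        for doc_id, text in results:
            seen.setdefault(doc_id, text)
    return list(seen.items())


def final_order(
    query: str,
    candidates: List[Candidate],
    score_pair: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Rescore every candidate with the same pairwise scorer and sort by score."""
    scored = [(doc_id, score_pair(query, text)) for doc_id, text in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def overlap_scorer(query: str, passage: str) -> float:
    """Placeholder scorer (shared-term count); an assumption, not a real cross-encoder."""
    return float(len(set(query.lower().split()) & set(passage.lower().split())))


# Hypothetical outputs of two first-stage retrievers for the same query.
bm25_hits = [
    ("d1", "How to reset your account password"),
    ("d2", "Password rules for new accounts"),
]
vector_hits = [
    ("d2", "Password rules for new accounts"),
    ("d3", "Recover a locked account"),
]

merged = merge_candidates(bm25_hits, vector_hits)
ranking = final_order("reset account password", merged, overlap_scorer)
print(ranking)
```

Because every candidate is rescored by the same function, the final list does not depend on how the two retrievers scaled their own scores.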
Example
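The sketch below shows one way to rerank passages with a cross-encoder in Python. It assumes the `sentence-transformers` package is installed and that the public `cross-encoder/ms-marco-MiniLM-L-6-v2` checkpoint can be downloaded; both are choices made for this illustration, and any model exposing a batch `predict` over (query, passage) pairs would fit. The `load_predictor` and `rerank` helpers are hypothetical names, and the library import is kept inside `load_predictor` so the reranking logic can be exercised with any predictor.

```python
from typing import Callable, List, Sequence, Tuple

# A predictor maps a batch of (query, passage) pairs to one score per pair.
PairPredictor = Callable[[List[Tuple[str, str]]], Sequence[float]]


def load_predictor(
    model_name: str = "cross-encoder/ms-marco-MiniLM-L-6-v2",
) -> PairPredictor:
    """Return the predict method of a sentence-transformers CrossEncoder.

    Assumption: sentence-transformers is installed and the checkpoint is
    reachable. The import is lazy so the rest of this file runs without it.
    """
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name)
    return model.predict


def rerank(
    query: str,
    passages: List[str],
    predict: PairPredictor,
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score each (query, passage) pair jointly and keep the top_n passages."""
    pairs = [(query, passage) for passage in passages]
    scores = predict(pairs)  # one relevance score per pair
    ranked = sorted(
        zip(passages, scores),
        key=lambda item: float(item[1]),
        reverse=True,
    )
    return [(passage, float(score)) for passage, score in ranked[:top_n]]
```

With the library available, a call such as `rerank(query, candidates, load_predictor())` returns the highest-scoring passages first. The scores are only meaningful for ordering candidates of the same query; their scale depends on the chosen checkpoint.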
Frequently Asked Questions
Why are cross-encoders more accurate than bi-encoders?
They jointly process the query and passage, so the model can directly compare constraints, entities, and wording before producing a relevance score.
Why not use a cross-encoder for all retrieval?
It is too expensive for large corpora because every query would need to be paired with every document or chunk at request time.
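A rough back-of-the-envelope calculation makes the gap concrete. The corpus size, per-pair latency, and candidate count below are illustrative assumptions, not measurements, and the arithmetic ignores batching and parallel hardware.

```python
def passes_per_query(corpus_size: int, top_k: int = 0) -> int:
    """Cross-encoder forward passes needed for one query.

    top_k == 0 means "score the whole corpus"; otherwise only the
    first-stage candidates are scored.
    """
    return min(top_k, corpus_size) if top_k > 0 else corpus_size


CORPUS_SIZE = 1_000_000  # assumed number of chunks in the index
MS_PER_PAIR = 5          # assumed latency per scored pair, in milliseconds
TOP_K = 50               # assumed size of the first-stage candidate set

full_ms = passes_per_query(CORPUS_SIZE) * MS_PER_PAIR
rerank_ms = passes_per_query(CORPUS_SIZE, TOP_K) * MS_PER_PAIR

print(f"whole corpus: {full_ms / 1000:.0f} s per query")
print(f"top-{TOP_K} rerank: {rerank_ms} ms per query")
```

Under these assumptions, scoring the whole corpus takes about 5,000 seconds per query, while reranking 50 candidates takes 250 milliseconds. Batching on faster hardware lowers both figures, but the full-corpus cost still grows linearly with corpus size while the reranking cost stays tied to k.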
Where does a cross-encoder fit in RAG?
It usually reranks a candidate set produced by BM25, dense retrieval, or hybrid retrieval before final context is assembled.
Can an LLM act as a cross-encoder reranker?
An LLM can score or compare passages, but dedicated cross-encoder rerankers are often cheaper and more consistent for high-volume retrieval.