Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Patrick Lewis, Ethan Perez, Aleksandra Piktus et al.
2020
Large pre-trained language models have been shown to store factual knowledge implicitly in their parameters. However, this knowledge is static, opaque, and can be factually incorrect. We propose RAG — Retrieval-Augmented Generation — which combines parametric and non-parametric memory for language generation tasks.
RAG: Retrieval-Augmented Generation
RAG combines a parametric memory (a pre-trained seq2seq model) with a non-parametric memory (a dense vector index of Wikipedia passages accessed by a neural retriever) through a differentiable retrieval mechanism.
Architecture
RAG-Sequence Model: $$p_{\text{RAG-Seq}}(y|x) \approx \sum_{z \in \text{top-}k} p_\eta(z|x) \prod_i^N p_\theta(y_i | x, z, y_{1:i-1})$$
RAG-Token Model: $$p_{\text{RAG-Token}}(y|x) \approx \prod_i^N \sum_{z \in \text{top-}k} p_\eta(z|x) \, p_\theta(y_i|x,z,y_{1:i-1})$$
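The difference between the two models is where the marginalization over retrieved documents happens: RAG-Sequence sums over documents once per sequence, while RAG-Token sums at every token position. A toy numpy sketch of both marginalizations, with all model outputs replaced by random placeholder distributions:

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, V = 3, 4, 10          # top-k documents, output length, vocab size (toy)

# Hypothetical model outputs (random placeholders, each row sums to 1):
p_doc = rng.dirichlet(np.ones(K))            # p_eta(z|x) over the top-k docs
p_tok = rng.dirichlet(np.ones(V), (K, N))    # p_theta(y_i|x,z,y_{1:i-1}) per doc/step
y = rng.integers(0, V, N)                    # a candidate output sequence

# Per-document probability of the chosen token at each step: shape (K, N)
tok_probs = p_tok[:, np.arange(N), y]

# RAG-Sequence: marginalize over documents at the sequence level.
p_rag_seq = (p_doc * tok_probs.prod(axis=1)).sum()

# RAG-Token: marginalize over documents at every token position.
p_rag_token = (p_doc[:, None] * tok_probs).sum(axis=0).prod()

print(p_rag_seq, p_rag_token)
```

Note that for a fixed target sequence the two quantities generally differ; they coincide only when a single document is retrieved (k = 1).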
Retriever: Dense Passage Retrieval (DPR)
Uses bi-encoder architecture:
- Query encoder: $E_Q(x)$ (BERT-base)
- Document encoder: $E_D(z)$ (BERT-base, separate weights)
- Similarity: $p_\eta(z|x) \propto \exp(E_Q(x)^T E_D(z))$
- FAISS index for approximate nearest-neighbor search over 21M Wikipedia passages
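The retrieval step reduces to maximum inner product search over precomputed passage embeddings. A minimal numpy sketch of that scoring (FAISS performs the search approximately over 21M passages; here it is done exactly, by brute force, on a random toy corpus):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_docs, k = 8, 100, 5     # embedding dim, corpus size, top-k (toy scale)

doc_embs = rng.standard_normal((n_docs, d))   # E_D(z): precomputed, frozen index
query_emb = rng.standard_normal(d)            # E_Q(x): output of the query encoder

# Maximum inner product search: score every passage against the query.
scores = doc_embs @ query_emb
top_k = np.argsort(scores)[::-1][:k]          # indices of the k best passages

# p_eta(z|x) ∝ exp(E_Q(x)^T E_D(z)), renormalized over the retrieved top-k.
logits = scores[top_k]
p_eta = np.exp(logits - logits.max())
p_eta /= p_eta.sum()
print(top_k, p_eta)
```

Subtracting `logits.max()` before exponentiating is the standard numerically stable softmax; it does not change the resulting distribution.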
Generator: BART
A seq2seq generator (BART-large) conditioned on the input concatenated with a retrieved passage: $$p_\theta(y_i | x, z, y_{1:i-1}) = \text{BART}([x; z], y_{1:i-1}; \theta)$$
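The paper simply concatenates the retrieved passage with the query to form the generator input. A sketch of that formatting step; the `format_generator_input` helper and the separator string are illustrative assumptions, not the paper's exact template:

```python
# Hypothetical input formatting for the BART generator: passage title and text
# are prepended to the query (the separator " // " is an assumption).
def format_generator_input(query: str, passage_title: str, passage_text: str) -> str:
    return f"{passage_title} // {passage_text} // {query}"

x = "who wrote the play Hamlet?"
z_title = "Hamlet"
z_text = "The Tragedy of Hamlet, Prince of Denmark, is a tragedy by William Shakespeare."
print(format_generator_input(x, z_title, z_text))
```

Because the generator sees the passage as plain text, any seq2seq model can serve as the parametric memory without architectural changes.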
Key Results
- Natural Questions: 44.5 EM (state of the art)
- TriviaQA: 56.8 / 68.0 EM (state of the art)
- MS-MARCO (abstractive QA): outperforms a BART baseline (evaluated with Bleu/Rouge rather than EM)
- End-to-end trainable: the retriever's query encoder and the generator are fine-tuned jointly
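Joint training works because $p_\eta(z|x)$ is a softmax over inner products between a trainable query embedding and a fixed document index, so minimizing the negative log marginal likelihood sends gradients into the query encoder while the document encoder and FAISS index stay frozen. A toy numpy sketch of the RAG-Sequence objective (all model outputs are random placeholders; the gradient flow itself is only described in comments):

```python
import numpy as np

rng = np.random.default_rng(2)
K, N, V, d = 3, 4, 10, 8     # top-k docs, output length, vocab, embedding dim

q = rng.standard_normal(d)            # E_Q(x): trainable during fine-tuning
docs = rng.standard_normal((K, d))    # E_D(z): kept fixed (index is not rebuilt)

# Retrieval distribution: softmax over inner products. Gradients of the loss
# reach the query encoder through these logits.
logits = docs @ q
p_doc = np.exp(logits - logits.max())
p_doc /= p_doc.sum()

p_tok = rng.dirichlet(np.ones(V), (K, N))   # generator outputs (placeholder)
y = rng.integers(0, V, N)                   # target sequence

# RAG-Sequence NLL: -log sum_z p_eta(z|x) prod_i p_theta(y_i|x,z,y_{1:i-1})
seq_lik = p_tok[:, np.arange(N), y].prod(axis=1)
nll = -np.log((p_doc * seq_lik).sum())
print(nll)
```

In the paper this avoids the expensive periodic index refresh used by REALM: only the query side of the retriever moves during fine-tuning.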
References (3)
- Dense Passage Retrieval for Open-Domain Question Answering. Karpukhin et al. · 2020
- BART: Denoising Sequence-to-Sequence Pre-training. Lewis et al. · 2020
- FAISS: A library for efficient similarity search. Johnson et al. · 2019