Tags: llm · open-source · rlhf · llama · fine-tuning
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron, Louis Martin, Kevin Stone et al.
2023
We develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested.
Llama 2 Architecture and Training
Llama 2 builds on the original Llama architecture with key improvements:
Architecture Changes
- Grouped-Query Attention (GQA): the 70B model uses GQA with 8 key-value heads shared across its 64 query heads, reducing KV cache memory by 8×
- SwiGLU Activation: the FFN uses SwiGLU (carried over from Llama 1) in place of the original Transformer's ReLU: $\text{SwiGLU}(x, W, V, b, c) = \text{Swish}(xW + b) \odot (xV + c)$
- RoPE Positional Embeddings: Rotary Position Embeddings for better length generalization
- Context Length: Extended from 2048 to 4096 tokens
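The SwiGLU feed-forward activation above can be sketched directly from the formula; this is a minimal numpy illustration (toy shapes, random weights — not the paper's actual dimensions):

```python
import numpy as np

def swish(x):
    # Swish / SiLU: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def swiglu(x, W, V, b, c):
    # SwiGLU(x, W, V, b, c) = Swish(xW + b) ⊙ (xV + c)
    # The Swish branch acts as an elementwise gate on the linear branch.
    return swish(x @ W + b) * (x @ V + c)

# Toy example: batch of 2, d_model=4, d_ff=6
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4))
W, V = rng.standard_normal((4, 6)), rng.standard_normal((4, 6))
b, c = np.zeros(6), np.zeros(6)
out = swiglu(x, W, V, b, c)
print(out.shape)  # (2, 6)
```

Note that Llama's actual FFN omits the bias terms $b$ and $c$; they are kept here to match the formula as written.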
RLHF Training Pipeline
- Supervised Fine-Tuning (SFT): 27,540 high-quality instruction samples
- Reward Model Training: Two reward models (helpfulness + safety) trained on 1.4M human preference annotations
- Rejection Sampling + PPO: iterative refinement alternating rejection-sampling fine-tuning with Proximal Policy Optimization (PPO)
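The reward models in step 2 are trained with a binary ranking loss over preference pairs; the Llama 2 paper adds a margin term that scales with how strongly annotators preferred one response. A minimal sketch of that loss (scalar toy scores, not a real reward model):

```python
import numpy as np

def preference_loss(r_chosen, r_rejected, margin=0.0):
    # Binary ranking loss on reward-model scores for one preference pair:
    #   L = -log(sigmoid(r_chosen - r_rejected - margin))
    # The margin (as in the Llama 2 paper) demands a larger score gap
    # for pairs the annotators rated as clearly different.
    z = r_chosen - r_rejected - margin
    return -np.log(1.0 / (1.0 + np.exp(-z)))

# Toy scores from a hypothetical reward model
low = preference_loss(1.2, -0.3)               # chosen scored higher: small loss
high = preference_loss(-0.5, 0.8, margin=1.0)  # wrong ordering: large loss
print(low < high)  # True
```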
Safety Innovations
- Ghost Attention (GAtt): Condition generation on system prompt throughout conversation
- Safety-helpfulness balance: red-teaming by a team of over 350 people probing the models with adversarial prompts across risk categories
Benchmark Results
| Model | MMLU (5-shot) | HumanEval (pass@1) | GSM8K (8-shot) | TruthfulQA |
|---|---|---|---|---|
| Llama-2-7B | 45.3 | 12.8 | 14.6 | 33.3 |
| Llama-2-13B | 54.8 | 18.3 | 28.7 | 41.9 |
| Llama-2-70B | 68.9 | 29.9 | 56.8 | 44.9 |
| GPT-3.5 | 70.0 | 48.1 | 57.1 | 47.0 |
Key Equations
Rotary Position Embedding (RoPE) — position-aware query/key encoding
$f_{q,k}(x_m, m) = (W_{q,k} x_m) e^{im\theta}$
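The complex-exponential form above rotates each pair of query/key dimensions by a position-dependent angle, so attention scores depend only on relative positions. A real-valued numpy sketch, using the half-split pairing convention (some implementations interleave dimensions instead):

```python
import numpy as np

def rope(x, base=10000.0):
    # Rotary position embedding for x of shape (seq_len, d).
    # Pairs dim i with dim i + d/2 and rotates the pair at position m
    # by angle m * theta_i, where theta_i = base^(-2i/d) — the
    # real-valued equivalent of multiplying by e^{i m theta}.
    seq_len, d = x.shape
    half = d // 2
    inv_freq = base ** (-np.arange(half) * 2.0 / d)    # (d/2,)
    angles = np.outer(np.arange(seq_len), inv_freq)    # (seq_len, d/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

x = np.ones((8, 4))
out = rope(x)
print(out.shape)  # (8, 4)
```

Position 0 is left unrotated (all angles are zero), which is a quick sanity check on any RoPE implementation.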
Grouped Query Attention — reduces KV cache memory footprint
$\text{GQA}(Q, K, V) = \text{Concat}_{g=1}^{G}\,\text{Attention}(Q_g, K_g, V_g)$
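In GQA, each group of query heads attends against a single shared key/value head, which is what shrinks the KV cache. A minimal numpy sketch with toy shapes (a per-head loop for clarity; real implementations batch this):

```python
import numpy as np

def gqa(Q, K, V, n_groups):
    # Q: (n_heads, seq, d); K, V: (n_groups, seq, d).
    # Each block of n_heads // n_groups query heads shares one KV head,
    # shrinking the KV cache by a factor of n_heads / n_groups.
    n_heads, seq, d = Q.shape
    per_group = n_heads // n_groups
    out = np.empty_like(Q)
    for h in range(n_heads):
        g = h // per_group                      # KV group for this head
        scores = Q[h] @ K[g].T / np.sqrt(d)     # scaled dot-product
        scores -= scores.max(axis=-1, keepdims=True)  # stable softmax
        w = np.exp(scores)
        w /= w.sum(axis=-1, keepdims=True)
        out[h] = w @ V[g]
    return out

rng = np.random.default_rng(0)
Q = rng.standard_normal((8, 5, 16))  # 8 query heads
K = rng.standard_normal((2, 5, 16))  # 2 shared KV groups
V = rng.standard_normal((2, 5, 16))
print(gqa(Q, K, V, n_groups=2).shape)  # (8, 5, 16)
```

With 8 query heads and 2 KV groups this toy setup caches 4× fewer K/V tensors; Llama 2 70B's 64 query heads over 8 KV heads give the 8× reduction.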
References (4)
- Ouyang et al., 2022. Training language models to follow instructions with human feedback.
- Bai et al., 2022. Constitutional AI: Harmlessness from AI Feedback.
- Schulman et al., 2017. Proximal Policy Optimization Algorithms.
- Su et al., 2021. RoFormer: Enhanced Transformer with Rotary Position Embedding.