Chain-of-Retrieval Augmented Generation (CoRAG)
Development: Introduced by Wang et al. (2025), CoRAG enables models to retrieve and reason over relevant information step by step before generating the final answer.
Problem: Conventional RAG methods typically perform a single retrieval step before generation, which limits their effectiveness for complex queries due to imperfect retrieval results.
Solution: CoRAG allows the model to dynamically reformulate queries based on the evolving state of information gathering. The approach:
- Uses rejection sampling to automatically generate intermediate retrieval chains
- Augments existing RAG datasets, which provide only the correct final answer, with these sampled chains
- Employs various decoding strategies at test time, scaling test-time compute by controlling the length and number of sampled retrieval chains
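The retrieve-reformulate loop and the rejection-sampling filter above can be sketched as follows. This is a toy illustration only: the keyword-overlap retriever, the naive query reformulation, and the deterministic accept/reject loop are all stand-ins for the paper's LLM-based components and dense retriever.

```python
def retrieve(query, corpus, seen=()):
    """Toy retriever: return the unseen passage sharing the most words with the query."""
    candidates = [p for p in corpus if p not in seen]
    q_words = set(query.lower().split())
    return max(candidates, key=lambda p: len(q_words & set(p.lower().split())))

def generate_chain(question, corpus, max_steps=3):
    """Build a retrieval chain, reformulating the query at each step by
    appending what has been retrieved so far (a stand-in for an LLM
    producing intermediate sub-queries)."""
    chain, query, seen = [], question, set()
    for _ in range(max_steps):
        passage = retrieve(query, corpus, seen)
        seen.add(passage)
        chain.append((query, passage))
        query = question + " " + passage  # naive query reformulation
    return chain

def rejection_sample_chain(question, answer, corpus, max_steps=3):
    """Deterministic stand-in for rejection sampling: propose chains of
    increasing length and keep the first whose final retrieved passage
    contains the known final answer; all other candidates are rejected."""
    for steps in range(1, max_steps + 1):
        chain = generate_chain(question, corpus, max_steps=steps)
        if answer.lower() in chain[-1][1].lower():
            return chain
    return None  # every candidate chain was rejected

# Hypothetical two-hop example: the answer requires chaining two passages.
corpus = [
    "Alice was born in Paris",
    "Paris is the capital of France",
    "Bob likes cheese",
]
chain = rejection_sample_chain("What country was Alice born in?", "France", corpus)
for _, passage in chain:
    print(passage)
```

Here a one-step chain is rejected (its final passage never mentions the answer), while the two-step chain that hops from the birthplace passage to the capital passage is accepted; this accepted chain is the kind of intermediate supervision used to augment a dataset that originally contained only the question-answer pair.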
Results: Experiments show significant improvements, particularly on multi-hop question answering tasks, with gains of more than 10 points in Exact Match (EM) score over strong baselines. CoRAG also established new state-of-the-art performance across diverse knowledge-intensive tasks on the KILT benchmark.