This final post of the Adaptive RAG series explores methods that treat adaptive retrieval as a learned skill and explicitly teach models when to retrieve. We examine three paradigms in increasing order of sophistication.
This post introduces techniques that probe the LLM’s internal confidence and knowledge boundaries. We explore prompt-based confidence detection, consistency-based uncertainty estimation, and internal state analysis approaches to determine when retrieval is truly necessary.
Building on Part 1’s exploration of naive RAG’s limitations, this post introduces adaptive retrieval frameworks and pre-generation methods that decide whether retrieval is truly necessary.
Retrieval-Augmented Generation (RAG) isn’t a silver bullet. This post highlights the hidden costs of RAG and makes the case for a smarter, adaptive approach.
Learned embeddings often suffer from ‘embedding collapse’, where they occupy only a small subspace of the available dimensions. This article explores the causes of embedding collapse across architectures, from two-tower models to GNN-based systems, and its impact on model scalability and recommendation quality. We discuss methods to detect collapse and examine recent solutions proposed by research teams at Visa, Facebook AI, and Tencent Ads to address this challenge.
This article provides an introduction to online advertising systems and explores research on incorporating ads into LLM responses to user queries of a commercial nature.