Tag: rag
All the articles with the tag "rag".
Retrieval Augmented Generation or Long-Context LLMs
Published:
This document summarizes the findings of a comprehensive study comparing Retrieval Augmented Generation (RAG) and Long-Context (LC) Large Language Models (LLMs) for processing lengthy contexts. The study benchmarks both approaches across several public datasets using recent LLMs (Gemini-1.5-Pro, GPT-4o, and GPT-3.5-Turbo). The key finding is that LC models, when given sufficient resources, generally outperform RAG in average performance. However, RAG retains a significant cost advantage because it feeds the LLM a much shorter input. Based on these observations, the study introduces SELF-ROUTE, a method that routes each query to either RAG or LC based on the model's self-reflection, significantly reducing computational cost while maintaining performance comparable to LC. The findings offer guidance for building long-context applications that combine RAG and LC.
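The routing idea can be sketched in a few lines: first attempt a cheap RAG answer, and only fall back to the expensive long-context call when the model itself judges the retrieved chunks insufficient. This is a minimal illustration, not the paper's implementation; `llm` and `retriever` are hypothetical callables standing in for a chat model and a top-k retriever.

```python
def self_route(query, documents, llm, retriever, k=5):
    """Route a query to RAG or long-context based on model self-reflection.

    llm: callable(prompt) -> str (hypothetical chat-model wrapper)
    retriever: callable(query, documents, k) -> list[str] (hypothetical top-k retriever)
    """
    # Step 1: cheap RAG attempt on the top-k retrieved chunks.
    chunks = retriever(query, documents, k)
    context = "\n".join(chunks)
    probe = (
        "Answer the question using only the provided context. "
        "If it cannot be answered from the context, reply 'unanswerable'.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    rag_answer = llm(probe)

    # Step 2: self-reflection — if the model declares the chunks
    # insufficient, fall back to the full long-context input.
    if "unanswerable" in rag_answer.lower():
        full_context = "\n".join(documents)
        return llm(f"Context:\n{full_context}\n\nQuestion: {query}")
    return rag_answer
```

The cost saving comes from step 1: most queries are answered from the short retrieved context, so the long-context call is only paid for the minority the model flags as unanswerable.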
IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems
Published:
IntellAgent, an open-source multi-agent framework, is presented as a novel solution for comprehensively evaluating conversational AI systems. It addresses the limitations of existing methods by automating the generation of diverse, realistic, policy-driven scenarios using a graph-based policy model. The framework simulates interactions between user and chatbot agents, providing fine-grained performance diagnostics and actionable insights for optimization. IntellAgent's modular design promotes reproducibility and collaboration, bridging the gap between research and deployment. Its effectiveness is demonstrated through experiments comparing its results to established benchmarks such as τ-bench.
StructRAG: Retrieval-Augmented Generation via Hybrid Information Structurization
Published:
This comprehensive study explores the burgeoning field of prompt engineering, encompassing a wide array of techniques used to elicit desired outputs from Generative AI (GenAI) models, with a particular focus on large language models (LLMs).