This study explores how the RAG process can be decomposed, adding elements like multi-document retrieval, memory and personal information.
Intelligent and accurate knowledge source selection, together with synthesising multiple information sources into one coherent and succinct answer will become crucial.
One of the allures of RAG use to be the simplicity of implementation. However, considerable work is being done in terms of Agentic RAG, multi-document search and adding elements like conversational history and more.
Agentic RAG is where a hierarchy of agents are combined with a RAG implementation. The introduction of complexity and enhanced intelligence are inevitable.
Personalisation and maintaining context via conversational history are both important elements for stellar UX. UniMS-RAG prioritises these elements with their proposed RAG structure.
The study includes an algorithm for Inference with Self-refinement, coupled with the fact that RAG in general lends a great degree of inspectability and observability.
UniMS-RAG unifies the training process for planning, retrieving, and readingtasks, integrating them into a comprehensive framework.
Leveraging the power of large language models (LLMs) to harness external knowledge sources, UniMS-RAG enhances LLMs’ ability to seamlessly connect diverse sources in personalised knowledge-grounded dialogues.
This integration simplifies the traditionally separate retriever and reader training tasks, allowing adaptive evidence retrieval and relevance score evaluation in a unified manner.
The image below is an illustration of the proposed method called UniMS-RAG. Three optimisation tasks are carefully designed:
This is the process of creating a series of decisions on which specific knowledge source should be used, given the relationship between different sources.
Retrieve the top-n results from external databases according to the decisions.
Incorporate all retrieved knowledge into the final response generation.
This approach seeks to address personalised knowledge-grounded dialogue tasks in a multi-source setting, breaking the problem into three sub-tasks: knowledge source selection, knowledge retrieval, and response generation.
The proposed Unified Multi-Source Retrieval-Augmented Dialogue System (UniMS-RAG) uses Large Language Models (LLMs) as planners, retrievers, and readers simultaneously.
The framework introduces self-refinement during inference, refining responses using consistency and similarity scores.
Experimental results on two datasets show UniMS-RAG generates more personalised and factual responses, outperforming baseline models.