Domain Adaptation of LLMs Using RAFT, the Hybrid of RAG and SFT
We are witnessing the arrival of new and improved Large Language Models (LLMs) on a frequent basis, and with every release they improve on previous benchmarks. Their impressive advancements in general-purpose tasks are being harnessed for practical applications across diverse industries. However, their knowledge is restricted to the corpus on which they were trained. When asked questions that have no grounding in that training data, they can hallucinate.
For domain-specific adaptation, practitioners either use methods such as in-context learning through Retrieval-Augmented Generation (RAG) or Supervised Fine-Tuning (SFT). While these methods are promising, they come with their own pros and cons.
Retrieval-Augmented Generation (RAG)
| Pros | Cons |
|---|---|
| Improved Accuracy with Relevant Information | Reliance on Retrieval Quality |
| Reduced Hallucination | Information Loss |
| Reduced Forgetting | Increased Complexity |
Supervised Fine-Tuning (SFT)
| Pros | Cons |
|---|---|
| Adaptability | Catastrophic Forgetting |
| Improved Accuracy on Specific Tasks | Overfitting |
| Reduced Data Requirements and Faster Training | Limited Interpretability |
While both RAG and SFT are promising, neither alone delivers the best results in enterprise settings. RAFT, which stands for Retrieval-Augmented Fine-Tuning, addresses the limitations of both approaches by combining them strategically.
How RAFT Works
- Fine-Tuning: RAFT leverages Supervised Fine-Tuning to train the LLM on a dataset specific to the target domain. This helps the model acquire domain-specific knowledge and improve its understanding of the domain’s language and concepts.
- Retrieval-Augmented Generation: RAFT incorporates RAG by allowing the LLM to access and reference relevant documents retrieved from the domain-specific data during task completion (like question answering). This ensures the response is grounded in factual information.
- Focus on Robustness: RAFT goes beyond simply incorporating retrieved documents. It trains the LLM to be robust against retrieval errors. This means the model can manage situations where some retrieved documents might be irrelevant or misleading. It learns to discern reliable information and utilize it effectively to generate accurate answers.
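The training recipe above hinges on how each example is constructed: the context mixes the "oracle" document (the one that actually answers the question) with distractors, and a fraction of examples omit the oracle entirely so the model learns to fall back on its fine-tuned domain knowledge. The sketch below illustrates one way such a record could be assembled; the function name, field names, and the 80% oracle fraction are illustrative assumptions, not a prescribed format.

```python
import json
import random

def build_raft_example(question, oracle_doc, distractor_docs,
                       answer, p_oracle=0.8, num_distractors=3):
    """Assemble one RAFT-style training record.

    With probability p_oracle the context contains the oracle
    (golden) document plus distractors; otherwise it contains
    only distractors, which trains the model to be robust when
    retrieval fails to surface the relevant document.
    """
    context = random.sample(distractor_docs, num_distractors)
    if random.random() < p_oracle:
        context.append(oracle_doc)
    random.shuffle(context)

    prompt = "\n\n".join(
        f"Document {i + 1}: {doc}" for i, doc in enumerate(context)
    ) + f"\n\nQuestion: {question}"
    return {"prompt": prompt, "completion": answer}

example = build_raft_example(
    question="What port does the service listen on?",
    oracle_doc="The service listens on port 8443 by default.",
    distractor_docs=[
        "Logs are rotated daily at midnight.",
        "The cache TTL defaults to 300 seconds.",
        "Backups are stored in the archive bucket.",
        "TLS certificates renew automatically.",
    ],
    answer="The service listens on port 8443 by default.",
)
print(json.dumps(example, indent=2))
```

A full pipeline would generate many such records across the domain corpus and feed them to a standard supervised fine-tuning loop.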
What Are the Benefits of RAFT Compared with RAG and Supervised Fine-Tuning?
- Overcomes Limitations of RAG: RAG relies heavily on accurate document retrieval. RAFT addresses this by training the LLM to be less susceptible to retrieval errors, leading to more reliable responses even with imperfect retrieval.
- Improves Upon Supervised Fine-Tuning: While fine-tuning improves domain knowledge, it might neglect the open-ended nature of real-world scenarios where additional information might be available. RAFT allows the LLM to leverage retrieved documents during task completion, mimicking a real-world situation where one might need to consult additional resources.
- Enhanced Accuracy: By combining domain-specific knowledge with access to relevant information, RAFT has the potential to generate more accurate and informative responses compared with either RAG or Supervised Fine-Tuning alone.
- Reduced Forgetting: Unlike fine-tuning, RAFT avoids the issue of catastrophic forgetting, where the LLM loses general knowledge while specializing in a new domain.
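At inference time, a RAFT-trained model is used exactly like a RAG system: retrieve candidate documents, place them in the prompt, and let the model answer. The sketch below uses a toy keyword-overlap retriever standing in for a real vector store; the function names and prompt layout are assumptions for illustration.

```python
def retrieve(query, corpus, k=2):
    """Toy retriever: rank documents by the number of words
    they share with the query. A production system would use
    embedding similarity against a vector store instead."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def raft_prompt(query, corpus, k=2):
    """Format retrieved documents and the question into a prompt
    for the fine-tuned model."""
    docs = retrieve(query, corpus, k)
    context = "\n\n".join(
        f"Document {i + 1}: {doc}" for i, doc in enumerate(docs)
    )
    return f"{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "The billing API requires an OAuth2 bearer token.",
    "Weekly maintenance occurs on Sundays.",
    "Invoices are generated on the first of each month.",
]
print(raft_prompt("How do I authenticate to the billing API?", corpus))
```

Because training already exposed the model to irrelevant distractors, imperfect retrieval at this stage degrades answers less than it would for a vanilla RAG setup.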
Conclusion
Overall, RAFT offers a promising approach for adapting LLMs to specialized domains. It combines the benefits of both Supervised Fine-Tuning and RAG, leading to potentially more accurate, robust, and informative responses.
Jayachandran Ramachandran
Jayachandran has over 25 years of industry experience and is an AI thought leader, consultant, design thinker, inventor, and speaker at industry forums with extensive...