Self-Reflective Retrieval-Augmented Generation (SELF-RAG)

Published Date: October 4, 2024
Last Updated On: November 20, 2025

The SELF-RAG framework trains a single arbitrary language model to adaptively retrieve passages on demand, and to generate and reflect on both the retrieved passages and its own generations using special tokens called reflection tokens.

  1. RAG is following much the same trajectory as prompt engineering. It started as a simple yet effective concept: injecting contextual reference data into the prompt.
  2. The primary objective of RAG is to leverage the in-context learning (ICL) capabilities of LLMs.
  3. Complexity and efficiency are being introduced to RAG. Retrieval no longer takes place by default; a triage step first determines whether the LLM can fulfil the user request on its own.
  4. Efficiency and accuracy trade-off. There is always a balance to be found between efficiency and accuracy. Accuracy at the cost of efficiency negatively impacts user experience and practical use-cases. Efficiency at the cost of accuracy leads to a misleading and inaccurate solution.
  5. Triaging user input between direct LLM inference and prompt injection via RAG requires a reference to decide against. In SELF-RAG, that reference is a fine-tuned LLM that makes use of self-reflection.
  6. The principle of RAG triage can be applied in various forms. The most important aspect is the reference against which the decision is made to answer the question directly from an LLM or to make use of RAG and, where RAG is used, the ability to assess the quality and correctness of the response.
  7. Generative AI based applications can also take a wider view of triage, where options other than direct inference or RAG are available: for instance, human-in-the-loop, web search, or multi-LLM orchestration.
RAG triage diagram comparing efficiency versus accuracy and showing how user queries pass through a fine-tuned LLM to determine whether to retrieve documents or escalate to a larger model.
SELF-RAG framework for adaptive retrieval and reflection
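
The triage idea in the list above can be sketched in a few lines. This is a minimal illustration, not the SELF-RAG implementation: `triage`, `TriageDecision`, the `threshold` parameter, and the keyword-based `toy_retrieval_need` scorer are all hypothetical names introduced here. In SELF-RAG the reference would be a fine-tuned LLM rather than a keyword heuristic.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TriageDecision:
    route: str    # "direct" (plain LLM inference) or "rag" (retrieve first)
    score: float  # estimated need for retrieval, between 0.0 and 1.0


def triage(
    query: str,
    retrieval_need: Callable[[str], float],
    threshold: float = 0.5,
) -> TriageDecision:
    """Route a query to direct LLM inference or to RAG.

    `retrieval_need` stands in for the reference the decision is made
    against; here it is any callable returning a score in [0, 1].
    """
    score = retrieval_need(query)
    route = "rag" if score >= threshold else "direct"
    return TriageDecision(route=route, score=score)


def toy_retrieval_need(query: str) -> float:
    """Hypothetical stand-in scorer: counts cues that hint at factual lookups."""
    factual_cues = ("who", "when", "where", "how many", "latest")
    hits = sum(cue in query.lower() for cue in factual_cues)
    return min(1.0, hits / 2)


print(triage("Write a haiku about autumn", toy_retrieval_need))
print(triage("Who won and when was the final played?", toy_retrieval_need))
```

Raising `threshold` sends more traffic to direct inference (favouring efficiency), while lowering it sends more traffic through retrieval (favouring accuracy).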

Reflection tokens

Reflection tokens are categorised into retrieval tokens and critique tokens, which indicate the need for retrieval and the quality of the generation, respectively.
SELF-RAG uses reflection tokens to decide whether retrieval is needed and to self-evaluate generation quality.
Generating reflection tokens makes the LM controllable during the inference phase, enabling it to tailor its behaviour to diverse task requirements.
The study reports that SELF-RAG significantly outperforms both plain LLMs and standard RAG approaches.
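
To make the two token categories concrete, here is a small parsing sketch. The token spellings (`[Retrieve]`, `[Relevant]`, `[Supported]`, `[Utility:5]` and so on), the `Reflection` class, and `parse_reflection` are illustrative assumptions, not the vocabulary or code from the study.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

TOKEN_RE = re.compile(r"\[([^\[\]]+)\]")

# Assumed token names, grouped into the two categories described above.
RETRIEVAL_TOKENS = {"Retrieve", "No Retrieve"}
CRITIQUE_TOKENS = {"Relevant", "Irrelevant", "Supported", "Not Supported"}


@dataclass
class Reflection:
    retrieve: Optional[bool] = None  # retrieval token, if one was emitted
    critiques: List[str] = field(default_factory=list)
    utility: Optional[int] = None    # assumed "[Utility:N]" critique token
    text: str = ""                   # generation with reflection tokens removed


def parse_reflection(generation: str) -> Reflection:
    """Separate reflection tokens from the surrounding generated text."""
    result = Reflection()

    def consume(match: re.Match) -> str:
        name = match.group(1)
        if name in RETRIEVAL_TOKENS:
            result.retrieve = name == "Retrieve"
        elif name in CRITIQUE_TOKENS:
            result.critiques.append(name)
        elif name.startswith("Utility:") and name[8:].isdigit():
            result.utility = int(name[8:])
        else:
            return match.group(0)  # not a reflection token, keep it in the text
        return ""

    stripped = TOKEN_RE.sub(consume, generation)
    result.text = " ".join(stripped.split())
    return result


sample = "[Retrieve] Paris is the capital of France [1]. [Relevant][Supported][Utility:5]"
parsed = parse_reflection(sample)
print(parsed)
```

Bracketed text that is not a recognised token, such as the citation marker `[1]`, is left in place.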

Steps in SELF-RAG

  1. The LLM generates text informed by retrieved passages.
  2. It critiques its own output by learning to generate special tokens.
  3. These reflection tokens signal the need for retrieval or confirm the output’s relevance, support, or completeness.
  4. In contrast, common RAG approaches retrieve passages indiscriminately, without ensuring complete support from cited sources.
Diagram showing critic LLM workflows, including retrieval steps, augmented outputs, relevance scoring, and utility labels for generated text.
SELF-RAG retrieves, critiques, and generates text passages
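
The steps above can be sketched as a simple control flow. This is a rough illustration under stated assumptions: in SELF-RAG a single fine-tuned model produces the text and the reflection tokens, whereas here each stage is a separate hypothetical callable (`needs_retrieval`, `retrieve`, `generate`, `critique`) so the flow is easy to follow.

```python
from typing import Callable, List, NamedTuple


class Critique(NamedTuple):
    relevant: bool   # is the passage relevant to the query?
    supported: bool  # is the answer supported by the passage?


def self_rag_answer(
    query: str,
    needs_retrieval: Callable[[str], bool],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, str], str],
    critique: Callable[[str, str, str], Critique],
) -> str:
    # Retrieval decision: skip retrieval entirely when it is not needed.
    if not needs_retrieval(query):
        return generate(query, "")
    # Generate one answer per passage and keep the first one that the
    # critique marks as both relevant and supported.
    for passage in retrieve(query):
        answer = generate(query, passage)
        verdict = critique(query, passage, answer)
        if verdict.relevant and verdict.supported:
            return answer
    # No passage passed the critique: fall back to direct generation.
    return generate(query, "")


# Toy stand-ins for the model and the retriever (all hypothetical).
docs = ["Bananas are yellow.", "The Eiffel Tower is in Paris."]
answer = self_rag_answer(
    "Where is the Eiffel Tower?",
    needs_retrieval=lambda q: "where" in q.lower(),
    retrieve=lambda q: docs,
    generate=lambda q, p: f"Based on: {p}" if p else "I am not sure.",
    critique=lambda q, p, a: Critique(relevant="Eiffel" in p, supported=p in a),
)
print(answer)
```

The irrelevant passage is generated against but rejected by the critique, which is the difference from retrieving and using passages indiscriminately.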

As the image below shows, SELF-RAG learns to retrieve, critique, and generate text passages to enhance overall generation quality, factuality, and verifiability.

Side-by-side comparison of Retrieval-Augmented Generation and Self-RAG, illustrating retrieval steps, parallel segment generation, critique scoring, and selection of the best answer.
Parallel inference steps in self-reflective RAG architecture

Some considerations

Additional inference and cost

SELF-RAG introduces additional inference overhead. As the image above shows, the self-reflective approach to RAG adds more points of inference.

A first inference step is performed, followed by three inference steps run in parallel. The three results are then compared and a winner is selected for RAG inference.
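
The parallel step and the winner selection can be sketched as follows. This is an assumption-laden illustration: `best_candidate`, `score`, the `WEIGHTS` values, and the canned critique scores are hypothetical, and a thread pool stands in for whatever batching a real deployment would use.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Candidate = Tuple[str, Dict[str, float]]  # (generated text, critique scores)

# Hypothetical weights for the critique signals; adjusting them changes
# which candidate wins without retraining anything.
WEIGHTS: Dict[str, float] = {"relevance": 1.0, "support": 1.0, "utility": 0.5}


def score(critique: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of critique scores; missing signals count as 0."""
    return sum(w * critique.get(name, 0.0) for name, w in weights.items())


def best_candidate(
    query: str,
    passages: List[str],
    generate_and_critique: Callable[[str, str], Candidate],
    max_workers: int = 3,
) -> Candidate:
    """Run one inference per passage in parallel and return the top scorer.

    `passages` must be non-empty.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        candidates = list(
            pool.map(lambda p: generate_and_critique(query, p), passages)
        )
    return max(candidates, key=lambda c: score(c[1]))


# Canned critique scores standing in for three model calls (made-up values).
CANNED = {
    "passage A": {"relevance": 1.0, "support": 0.5, "utility": 0.8},
    "passage B": {"relevance": 1.0, "support": 1.0, "utility": 0.6},
    "passage C": {"relevance": 0.0, "support": 0.0, "utility": 0.2},
}


def fake_generate_and_critique(query: str, passage: str) -> Candidate:
    return f"answer from {passage}", CANNED[passage]


winner = best_candidate("example query", list(CANNED), fake_generate_and_critique)
print(winner[0])
```

Running the candidates in parallel keeps latency close to that of a single call, but the cost in tokens still scales with the number of candidates.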

Out-of-domain

As the image above also shows, out-of-domain queries are recognised as such: the request is not serviced via retrieval but sent directly to LLM inference.

Agentic RAG

Considering the image below, the question needs to be asked…

With the complexity being introduced to the RAG process, are we not reaching a point where an agent-based RAG approach will work best? An approach LlamaIndex refers to as Agentic RAG.

Intents

There have been studies in which intent-based routing is used to triage user input for the correct treatment within a generative AI framework. Intents are merely pre-defined use-case classes.
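
As a rough sketch of intent-based routing, the example below maps a query to a pre-defined intent and then to a treatment. The intent names, keywords, and routes are hypothetical, and the keyword overlap is a placeholder for the trained intent classifier a real system would likely use.

```python
import re
from typing import Dict, List

# Hypothetical intents (pre-defined use-case classes) with example keywords.
INTENT_KEYWORDS: Dict[str, List[str]] = {
    "policy_lookup": ["policy", "refund", "warranty"],
    "small_talk": ["hello", "thanks", "joke"],
    "escalate": ["complaint", "agent", "human"],
}

# Hypothetical mapping from intent to treatment.
INTENT_ROUTES: Dict[str, str] = {
    "policy_lookup": "rag",
    "small_talk": "direct_llm",
    "escalate": "human_in_the_loop",
}


def classify_intent(query: str, default: str = "unknown") -> str:
    """Pick the intent whose keywords overlap most with the query words."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    scores = {
        intent: len(words & set(keywords))
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


def route(query: str) -> str:
    """Map a query to a treatment; unrecognised intents go to web search."""
    return INTENT_ROUTES.get(classify_intent(query), "web_search")


for q in [
    "What is your refund policy?",
    "Hello, tell me a joke",
    "I want to speak to a human",
    "Weather in Oslo tomorrow",
]:
    print(q, "->", route(q))
```

Because the classes are fixed in advance, this approach is predictable and easy to audit, but queries outside the defined intents need an explicit fallback.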

Diagram showing a user question routed to a top-level agent, followed by a Cohere reranker and multiple agents performing summarization, embeddings, and document processing.
Evolution from static prompts to Agentic RAG systems
Find the original study here
authors
Cobus Greyling
Chief Evangelist