AI Solutions
AI Solutions
AI for Work

Search across silos. Automate workflows. Orchestrate AI agents. Govern with confidence.

learn more
features
Enterprise SearchIntelligent OrchestratorPre-Built AI AgentsAdmin ControlsAI Agent Builder
Departments
SalesMarketingEngineeringLegalFinance
PRE-BUILT accelerators
HRITRecruiting
AI for Service

Leverage Agentic capabilities to empower customers and create personalized experiences.

learn more
features
AI agentsAgent AI AssistanceAgentic Contact CenterQuality AssuranceProactive Outreach
PRE-BUILT accelerators
RetailBankingHealthcare
AI for Process

Streamline knowledge-intensive business processes with autonomous AI agents.

learn more
features
Process AutomationAI Analytics + MonitoringPre-built Process Templates
Use Cases
Zero-Touch IT Operations Management
Top Resources
Scaling AI: practical insights
from AI leaders
AI use cases: insights from AI's leading decision makers
Beyond AI islands: how to fully build an enterwise-wide AI workforce
QUICK LINKS
About Kore.aiCustomer StoriesPartnersResourcesBlogWhitepapersDocumentationAnalyst RecognitionGet supportCommunityAcademyCareersContact Us
Agent Platform
Agent Platform
Agent Platform

Your strategic enabler for enterprise AI transformation.

learn more
FEATURES
Multi-Agent Orchestration
AI Engineering Tools
Search + Data AI
AI Security + Governance
No-Code + Pro-Code Tools
Integrations
GET STARTED
AI for WorkAI for ServiceAI for ProcessAgent Marketplace
LEARN + DISCOVER
About Kore.aiCustomer StoriesPartnersResource HubBlogWhitepapersAI Research ReportsNewsroomAnalyst RecognitionDocumentationGet supportAcademy
GET INVOLVED
AI PulseEventsCommunityCareersContact Us
upcoming event

CCW Berlin brings together international experts, visionary speakers, and leading companies to explore the future of customer experience, AI, and digital transformation in a dynamic blend of congress and exhibition

Berlin
4 Feb
register
Recent AI Insights
The AI productivity paradox: why employees are moving faster than enterprises
The AI productivity paradox: why employees are moving faster than enterprises
AI INSIGHT
12 Jan 2026
The Decline of AI Agents and Rise of Agentic Workflows
The Decline of AI Agents and Rise of Agentic Workflows
AI INSIGHT
01 Dec 2025
AI agents and tools: Empowering intelligent systems for real world impact
AI agents and tools: Empowering intelligent systems for real world impact
AI INSIGHT
12 Nov 2025
Agent Marketplace
More
More
Resources
Resource Hub
Blog
Whitepapers
Webinars
AI Research Reports
AI Glossary
Videos
AI Pulse
Generative AI 101
Responsive AI Framework
CXO Toolkit
support
Documentation
Get support
Submit RFP
Academy
Community
COMPANY
About us
Leadership
Customer Stories
Partners
Analyst Recognition
Newsroom
Events
Careers
Contact us
Agentic AI Guides
forrester cx wave 2024 Kore at top
Kore.ai named a leader in The Forrester Wave™: Conversational AI for Customer Service, Q2 2024
Generative AI 101
CXO AI toolkit for enterprise AI success
upcoming event

CCW Berlin brings together international experts, visionary speakers, and leading companies to explore the future of customer experience, AI, and digital transformation in a dynamic blend of congress and exhibition

Berlin
4 Feb
register
Talk to an expert
Not sure which product is right for you or have questions? Schedule a call with our experts.
Request a Demo
Double click on what's possible with Kore.ai
Sign in
Get in touch
Background Image 1
Blog
Conversational AI
Visualise & Discover RAG Data

Visualise & Discover RAG Data

Published Date:
July 10, 2024
Last Updated ON:
November 24, 2025

RAGxplorer is a web-based GUI for importing, chunking, mapping & exploring semantic similarity.

Introduction

RAGxplorer is a work in progress, and a number of improvements came to mind. But it is a very timely development, considering the necessary focus that data centric AI should enjoy.

In the past, I have referred to the four dimensions of data;

  1. Data Discovery
  2. Data Design
  3. Data Development
  4. Data Delivery

Data Delivery has received much attention; data delivery can be described as the methods and means by which propriety data is introduced to the LLM for inference. This can be done via RAG, fine-tuning, prompt-tuning etc.

RAG

In essence Retrieval-Augmented Generation is the process of leveraging the in-context learning abilities of LLMs by injecting prompts with a contextual reference at inference.

Creating a RAG pipeline consists of four basic steps;

  1. Chunking
  2. Creating Embeddings (embeddings is a representation of the semantic relation between phrases.)
  3. Extracting Chunks which is semantically related to a query.
  4. Injecting a prompt with this contextual reference (chunk).

The idea of visual inspection also helps to see which chunks are nearby and were not retrieved; with the tooltip provides the actual text.

— Gabriel Chua

RAGxplorer

RAGxplorer is an interactive tool for visualising document chunks in the embedding space. Designed to explore the semantic relation between chunks, and mapping which chunks are linked to a specific user input.

The image shows the landing page of RAGxplorer, with the file upload section. Under configuration the chuck size can be set, with the chunk overlap size. The embedding models available are all-MiniLM-L6-v2 and text-embedding-ada-002.

RAGxplorer empowers users to upload documents, transforming them into segmented formats (chunks) compatible with RAG applications.




This screen recording show the chunks distributed and mapped according to semantic similarity. When the cursor hovers over a node, the chucks text is displayed.

RAGxplorer facilitates the visualisation of these segments in an embedding space. This visual representation enhances comprehension of the interrelationships among different segments and their connections to specific queries, offering valuable insights into the functionality of RAG-based systems.

While visually inspecting chunks, if it seems that the chunks are odd or not similar (and vice-versa, that the related ones are not retrieved), it may mean that users should try a different chunking strategy or embedding model.

— Gabriel Chua

In the RAG framework, documents are segmented into parts, referred to as chunks, for efficient searching.

Relevant chunks are then provided to the Language Model (LLM) as supplementary context.

The “Chunk Size” corresponds to the number of tokens within each chunk, while “Chunk Overlap” signifies the number of tokens shared between consecutive chunks to preserve contextual information. It’s worth noting that a single word typically consists of 3 to 4 tokens.

Features

  • Document Upload: Users can upload PDF documents that they wish to analyse.
  • Chunk Configuration: Options to configure the chunk size and overlap, offering flexibility in how the document is processed.
  • Vector Database Creation: Builds a vector database from the uploaded document for efficient retrieval and visualisation.
  • Query Expansion: Generates sub-questions and hypothetical answers to enhance the retrieval process.
  • Interactive Visualisation: Utilises Plotly for dynamic and informative visual representations of the data.

Prompt Injection

The image below from the OpenAI playground shows how the context from the chunk can be injected into a prompt. So the instruction is defined, with the question and the context which serves as reference.

The generated answer is accurate and succinct due to available context to perform in-context learning.

And the response from the OpenAI model:

Yes, poison gas was used during World War I on an unprecedented scale, along with tanks, aircraft, and other weapons. However, the entry of fresh American troops and resources ultimately helped to tip the balance in favor of the Allies and contributed to the eventual crumbling of the Central Powers.

Suggested Read: AI Agents and RAG = Agentic retrieval

Conclusion

Visualising data in this fashion will not replace detailed inspection of training data. Rather data visualisation should be seen as a higher order of data discovery, and an avenue to perform spot-checks.

Data visualisation is useful to find outliers, and areas which are data-sparce or under represented. These areas might be user intents which needs to be augmented, or it might be considered out of domain.

The generation of sub-questions is an ideal approach to refine responses to user questions. And can also serve as a form of disambiguation, hence allowing the user to select a question when prompted with a disambiguation menu.

Data exploration will become increasingly important, together with making use of a data productivity suite which act as a latent space. One way of describing a latent space, is where data is compressed in order for patterns and trends to emerge.

This is certainly the case with RAGxplorer.

Explore Kore.ai Search and Data AI
Contact us
Share
Link copied
authors
Cobus Greyling
Cobus Greyling
Chief Evangelist
Forrester logo at display.
Kore.ai named a leader in the Forrester Wave™ Cognitive Search Platforms, Q4 2025
Access Report
Gartner logo in display.
Kore.ai named a leader in the Gartner® Magic Quadrant™ for Conversational AI Platforms, 2025
Access Report
Stay in touch with the pace of the AI industry with the latest resources from Kore.ai

Get updates when new insights, blogs, and other resources are published, directly in your inbox.

Subscribe
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Recent Blogs

View all
Agentic AI in Retail: Transforming Customer Experience & Operations 
January 23, 2026
Agentic AI in Retail: Transforming Customer Experience & Operations 
Top Glean Alternatives (2026 Guide)
January 23, 2026
Top Glean Alternatives (2026 Guide)
AI Agents in 2026: From Hype to Enterprise Reality
January 16, 2026
AI Agents in 2026: From Hype to Enterprise Reality
Start using an AI agent today

Browse and deploy our pre-built templates

Marketplace
Reimagine your business

Find out how Kore.ai can help you today.

Talk to an expert
Background Image 4
Background Image 9
You are now leaving Kore.ai’s website.

‍

Kore.ai does not endorse, has not verified, and is not responsible for, any content, views, products, services, or policies of any third-party websites, or for any verification or updates of such websites. Third-party websites may also include "forward-looking statements" which are inherently subject to risks and uncertainties, some of which cannot be predicted or quantified. Actual results could differ materially from those indicated in such forward-looking statements.



Click ‘Continue’ to acknowledge the above and leave Kore.ai’s website. If you don’t want to leave Kore.ai’s website, simply click ‘Back’.

CONTINUEGO BACK
Reimagine your enterprise with Kore.ai
English
Spanish
Spanish
Spanish
Spanish
Get Started
AI for WorkAI for ServiceAI for ProcessAgent Marketplace
Kore.ai agent platform
Platform OverviewMulti-Agent OrchestrationAI Engineering ToolsSearch and Data AIAI Security and GovernanceNo-Code and Pro-Code ToolsIntegrations
ACCELERATORS
BankingHealthcareRetailRecruitingHRIT
company
About Kore.aiLeadershipCustomer StoriesPartnersAnalyst RecognitionNewsroom
resources
DocumentationBlogWhitepapersWebinarsAI Research ReportsAI GlossaryVideosGenerative AI 101Responsive AI frameworkCXO Toolkit
GET INVOLVED
EventsSupportAcademyCommunityCareers

Let’s work together

Get answers and a customized quote for your projects

Submit RFP
Follow us on
© 2026 Kore.ai Inc. All trademarks are property of their respective owners.
Privacy PolicyTerms of ServiceAcceptable Use PolicyCookie PolicyIntellectual Property Rights
|
×