Tutorial: Analyzing Chat Data with Kura¶
Learn how to analyze RAG system chat data through a three-part tutorial series. Work with 560 real user queries to discover patterns and build production-ready classifiers.
Prerequisites¶
- Install dependencies from pyproject.toml
- Set OPENAI_API_KEY to use OpenAI's GPT-4o-mini model
- Download the tutorial dataset
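Before opening the notebooks, it can help to confirm that the API key is actually visible to Python. The check below is a minimal sketch, not part of Kura; the helper name `missing_env_vars` is hypothetical and uses only the standard library.

```python
import os


def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to the real process environment; passing a plain dict
    makes the helper easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # OPENAI_API_KEY is the only variable the prerequisites mention.
    missing = missing_env_vars(["OPENAI_API_KEY"])
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Environment looks ready.")
```

Running this once in the same shell or kernel you will use for the notebooks catches a missing key before the first API call does.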
Tutorial Series¶
Step 1. Cluster Conversations¶
Discover user query patterns through topic modeling and clustering. See how three major topics account for 67% of queries, with artifact management appearing in 61% of conversations.
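A figure such as "three topics account for 67% of queries" is a cumulative share over cluster assignments. The sketch below shows that arithmetic on made-up labels; it assumes you already have one cluster label per query, and the function name `top_k_share` and the toy data are hypothetical, not taken from Kura or the tutorial dataset.

```python
from collections import Counter


def top_k_share(labels, k=3):
    """Fraction of items that fall into the k most frequent labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(k))
    return top / total


if __name__ == "__main__":
    # Toy labels for illustration only; real labels come from the clustering step.
    toy_labels = ["a"] * 4 + ["b"] * 3 + ["c"] * 2 + ["d"] * 1
    print(f"Top-3 share: {top_k_share(toy_labels, k=3):.0%}")
```

The same function works on any iterable of hashable labels, so it can be pointed at the cluster names produced in the notebook.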
Step 2. Better Summaries¶
Transform generic summaries into domain-specific insights. Build custom summarization models that turn seven vague clusters into three actionable categories: Access Controls, Deployment, and Experiment Management.
Step 3. Building Classifiers¶
Convert clustering insights into production classifiers. Build real-time systems that automatically categorize new queries and scale your insights.
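To make the idea of routing new queries into fixed categories concrete, here is a deliberately simple keyword baseline. It is a stand-in, not the classifier the notebook builds: the category names come from Step 2, but the keyword lists, the `classify` function, and the "Other" fallback are all assumptions chosen for illustration.

```python
# Hypothetical keyword lists; a real system would derive these from the
# cluster summaries or replace the whole lookup with a model call.
CATEGORY_KEYWORDS = {
    "Access Controls": ("permission", "role", "access", "invite"),
    "Deployment": ("deploy", "docker", "kubernetes", "server"),
    "Experiment Management": ("experiment", "run", "sweep", "metric"),
}


def classify(query, fallback="Other"):
    """Return the category whose keywords occur most often in the query.

    Matching is case-insensitive substring search. Ties go to the category
    listed first in CATEGORY_KEYWORDS; no match at all returns `fallback`.
    """
    text = query.lower()
    scores = {
        category: sum(keyword in text for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else fallback


if __name__ == "__main__":
    for q in ("How do I deploy with Docker?", "Who has permission to view this project?"):
        print(q, "->", classify(q))
```

A baseline like this is mainly useful as a sanity check: if a more capable classifier disagrees with it on obvious queries, that is worth inspecting before trusting either one.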
What You'll Learn¶
- Systematically analyze large volumes of user queries
- Build custom models for your specific domain
- Create production systems for automatic query classification
- Make data-driven decisions about system improvements
Ready to start? Begin with the first notebook below.