Tutorial: Analyzing Chat Data with Kura

Learn how to analyze RAG system chat data through a three-part tutorial series. Work with 560 real user queries to discover patterns and build production-ready classifiers.

Prerequisites

  • Install dependencies from pyproject.toml
  • Set the OPENAI_API_KEY environment variable (the tutorial uses OpenAI's GPT-4o-mini model)
  • Download the tutorial dataset
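Before opening the notebooks, it can help to confirm that the API key is visible to Python. The following is a minimal sketch using only the standard library; the helper name `missing_env_vars` is hypothetical and not part of Kura.

```python
import os


def missing_env_vars(required=("OPENAI_API_KEY",), env=None):
    """Return the names in `required` that are unset or empty.

    `env` defaults to the real process environment; passing a dict
    makes the check easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Environment looks ready.")
```

If the check reports a missing variable, export it in the shell that launches the notebook server, then restart the kernel.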

Tutorial Series

Step 1. Cluster Conversations

Discover user query patterns through topic modeling and clustering. You will find that three major topics account for 67% of queries, and that artifact management appears in 61% of conversations.
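Figures such as "three topics account for 67% of queries" come from counting how many queries land in each cluster. The sketch below shows that arithmetic on a made-up list of cluster labels. The label names and counts are illustrative assumptions, not the tutorial's data, and the functions are hypothetical helpers rather than Kura APIs.

```python
from collections import Counter


def topic_shares(labels):
    """Map each topic to its fraction of all labels, largest first."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {topic: count / total for topic, count in counts.most_common()}


def top_k_coverage(labels, k=3):
    """Fraction of labels covered by the k most frequent topics."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    covered = sum(count for _, count in counts.most_common(k))
    return covered / len(labels)


# Toy cluster assignments, one label per query (illustrative only).
toy_labels = ["artifacts"] * 4 + ["deployment"] * 3 + ["access"] * 2 + ["other"]

shares = topic_shares(toy_labels)
coverage = top_k_coverage(toy_labels, k=3)
print(shares)
print(f"Top-3 coverage: {coverage:.0%}")
```

This counts each query once. A statistic like "appears in 61% of conversations" can overlap with other topics when a conversation touches more than one, so it would need a per-conversation set of topics rather than a single label.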

Step 2. Better Summaries

Transform generic summaries into domain-specific insights. Build custom summarization models that turn seven vague clusters into three actionable categories: Access Controls, Deployment, and Experiment Management.

Step 3. Building Classifiers

Convert clustering insights into production classifiers. Build real-time systems that automatically categorize new queries and scale your insights.
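To make the idea of a query classifier concrete, here is a deliberately simple keyword baseline for the three categories named in Step 2. It is a stand-in for illustration only: the keyword lists are assumptions, and the tutorial notebook builds its own classifier, which this sketch does not reproduce.

```python
import re

# Hypothetical keyword lists; a real system would derive these from the
# cluster summaries or replace them with a model-based classifier.
CATEGORY_KEYWORDS = {
    "Access Controls": {"permission", "permissions", "role", "roles", "access", "invite"},
    "Deployment": {"deploy", "deployment", "docker", "production", "serve"},
    "Experiment Management": {"experiment", "experiments", "run", "runs", "sweep", "metrics"},
}


def classify(query, default="Other"):
    """Return the category whose keywords overlap most with the query.

    Falls back to `default` when no keyword matches. Ties go to the
    category listed first in CATEGORY_KEYWORDS.
    """
    tokens = set(re.findall(r"[a-z]+", query.lower()))
    scores = {cat: len(tokens & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


print(classify("How do I change team permissions for a project?"))
print(classify("Deploy my model to production with docker"))
print(classify("hello there"))
```

A baseline like this is useful mainly as a reference point: a learned or LLM-based classifier should beat it on a labeled sample before it is trusted in production.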

What You'll Learn

  • Systematically analyze large volumes of user queries
  • Build custom models for your specific domain
  • Create production systems for automatic query classification
  • Make data-driven decisions about system improvements

Ready to start? Begin with the first notebook below.