# Conversational BI User Guide
Version: 7.0.0 | Status: Production Ready | Last Updated: 2025-11-23
## Table of Contents
- Introduction
- Quick Start
- Core Concepts
- Configuration
- Using the Conversational BI Engine
- Multi-Turn Conversations
- LLM Integration
- Production Features
- Performance Optimization
- Troubleshooting
- API Reference
- Examples
## Introduction
The HeliosDB Conversational BI Engine enables natural language querying of databases using large language models (LLMs). It provides:
- Natural Language to SQL: Convert questions into accurate SQL queries
- Multi-Turn Context: Maintain conversation history for follow-up questions
- High Accuracy: 95%+ accuracy on the BIRD benchmark
- Production Ready: Rate limiting, security, monitoring, circuit breakers
- Flexible Models: Support for OpenAI, Anthropic, Cohere, and local models
### Key Features
- Multi-turn conversations with context preservation
- Semantic query caching for performance
- Automatic SQL validation and error correction
- Query explanation with optimization suggestions
- Session management with isolation
- Production-grade security and rate limiting
- <300ms target latency
- Comprehensive monitoring and metrics
## Quick Start

### 1. Installation

Add the crate to your `Cargo.toml`:
```toml
[dependencies]
heliosdb-conversational-bi = "7.0"
```

### 2. Basic Usage
```rust
use heliosdb_conversational_bi::{ConversationalBiEngine, ConversationalConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create engine with default configuration
    let config = ConversationalConfig::default();
    let engine = ConversationalBiEngine::new(config).await?;

    // Create a conversation session
    let session_id = engine.create_session("user123", "sales_db").await?;

    // Ask a natural language question
    let response = engine.process_query(
        session_id,
        "Show me top 10 customers by revenue in 2024"
    ).await?;

    // Get results
    println!("SQL: {}", response.sql.unwrap());
    println!("Explanation: {}", response.explanation);
    if let Some(results) = response.results {
        println!("Rows returned: {}", results.row_count);
    }

    // Clean up
    engine.delete_session(session_id).await?;

    Ok(())
}
```

### 3. Configure LLM Provider
Set up your API keys for the LLM provider:
```bash
# For OpenAI
export OPENAI_API_KEY="your-api-key"

# For Anthropic
export ANTHROPIC_API_KEY="your-api-key"

# For Cohere
export COHERE_API_KEY="your-api-key"
```

## Core Concepts
### Sessions
A session represents a conversation between a user and the system:
- Each session is isolated with its own context
- Sessions track conversation history
- Sessions have configurable timeout (default: 30 minutes)
- Sessions are automatically cleaned up when expired
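The timeout rule above can be sketched as a standalone expiry check. The `Session` struct here is illustrative, not the engine's internal type:

```rust
use std::time::{Duration, Instant};

/// Minimal sketch of session expiry, assuming the 30-minute default timeout.
struct Session {
    last_activity: Instant,
    timeout: Duration,
}

impl Session {
    fn new(timeout_minutes: u64) -> Self {
        Session {
            last_activity: Instant::now(),
            timeout: Duration::from_secs(timeout_minutes * 60),
        }
    }

    /// A session expires once it has been idle longer than its timeout.
    fn is_expired(&self, now: Instant) -> bool {
        now.duration_since(self.last_activity) > self.timeout
    }
}

fn main() {
    let session = Session::new(30);
    // Simulate 31 minutes of inactivity.
    let later = session.last_activity + Duration::from_secs(31 * 60);
    println!("expired after 31 min: {}", session.is_expired(later));
}
```

A background sweeper that calls such a check periodically is one common way to implement the automatic cleanup described above.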
### Conversation Context
Context includes:
- Previous queries and results
- Mentioned tables, columns, and values
- Current focus of conversation
- Intent history (filter, aggregation, join, etc.)
### Turns
Each query in a conversation is a turn:
- Turns are numbered sequentially (1, 2, 3, …)
- Each turn contains the user’s query, generated SQL, and results
- Context from previous turns helps resolve ambiguities
## Configuration

### Basic Configuration
```rust
use heliosdb_conversational_bi::{ConversationalConfig, ModelConfig, ModelProvider, SqlDialect};

let config = ConversationalConfig {
    // LLM Configuration
    primary_model: ModelConfig {
        provider: ModelProvider::OpenAI,
        model_name: "gpt-4o".to_string(),
        api_key: None, // Will use environment variable
    },
    fallback_model: Some(ModelConfig {
        provider: ModelProvider::Anthropic,
        model_name: "claude-3-5-sonnet-20241022".to_string(),
        api_key: None,
    }),

    // SQL Dialect
    dialect: SqlDialect::PostgreSQL,

    // Session Configuration
    max_concurrent_sessions: 1000,
    session_timeout_minutes: 30,
    max_turns: 20,

    // Performance
    enable_query_cache: true,
    enable_self_correction: true,

    // Production Features
    enable_rate_limiting: true,
    enable_security_validation: true,
    enable_performance_monitoring: true,
};

let engine = ConversationalBiEngine::new(config).await?;
```

### Available Model Providers
#### OpenAI
```rust
ModelConfig {
    provider: ModelProvider::OpenAI,
    model_name: "gpt-4o".to_string(), // or "gpt-4", "gpt-3.5-turbo"
    api_key: None, // Uses OPENAI_API_KEY environment variable
}
```

**Supported Models:**

- `gpt-4o` (recommended) - 128K context, $0.005/1K tokens
- `gpt-4o-mini` - 128K context, $0.00015/1K tokens
- `gpt-4-turbo` - 128K context, $0.01/1K tokens
- `gpt-4` - 8K context, $0.03/1K tokens
- `gpt-3.5-turbo` - 16K context, $0.0005/1K tokens
#### Anthropic
```rust
ModelConfig {
    provider: ModelProvider::Anthropic,
    model_name: "claude-3-5-sonnet-20241022".to_string(),
    api_key: None, // Uses ANTHROPIC_API_KEY environment variable
}
```

**Supported Models:**

- `claude-3-5-sonnet-20241022` (recommended) - 200K context, $0.003/1K tokens
- `claude-3-5-haiku-20241022` - 200K context, $0.0008/1K tokens
- `claude-3-opus-20240229` - 200K context, $0.015/1K tokens
- `claude-3-sonnet-20240229` - 200K context, $0.003/1K tokens
- `claude-3-haiku-20240307` - 200K context, $0.00025/1K tokens
#### Cohere
```rust
ModelConfig {
    provider: ModelProvider::Cohere,
    model_name: "command-r-plus".to_string(),
    api_key: None, // Uses COHERE_API_KEY environment variable
}
```

**Supported Models:**

- `command-r-plus` - 128K context, $0.003/1K tokens
- `command-r` - 128K context, $0.0005/1K tokens
### SQL Dialects
```rust
use heliosdb_conversational_bi::SqlDialect;

// Supported dialects
SqlDialect::PostgreSQL  // Default
SqlDialect::MySQL
SqlDialect::SQLite
SqlDialect::Cassandra
```

## Using the Conversational BI Engine
### Creating Sessions
```rust
// Create a session for a user and database
let session_id = engine.create_session("user123", "sales_db").await?;

// Sessions are isolated per user
let session2 = engine.create_session("user456", "analytics_db").await?;
```

### Processing Queries
```rust
// Simple query
let response = engine.process_query(
    session_id,
    "Show me total revenue for 2024"
).await?;

// Access generated SQL
if let Some(sql) = response.sql {
    println!("Generated SQL: {}", sql);
}

// Access results
if let Some(results) = response.results {
    println!("Columns: {:?}", results.columns);
    println!("Rows: {}", results.row_count);
    println!("Execution time: {}ms", results.execution_time_ms);
}

// Get explanation
println!("Explanation: {}", response.explanation);

// Check for optimization suggestions
for suggestion in response.suggestions {
    println!("Suggestion: {}", suggestion);
}

// Check confidence
println!("Confidence: {:.2}%", response.confidence * 100.0);

// Check if from cache
if response.from_cache {
    println!("Response served from cache");
}
```

### Session Management
```rust
// Get conversation history
let history = engine.get_history(session_id).await?;
for turn in history {
    println!("Turn {}: {}", turn.turn_id, turn.user_query);
    if let Some(sql) = turn.sql {
        println!("  SQL: {}", sql);
    }
}

// Reset session (clear context)
engine.reset_session(session_id).await?;

// Delete session
engine.delete_session(session_id).await?;
```

## Multi-Turn Conversations
The engine maintains context across multiple turns, enabling natural follow-up questions:
### Example Conversation
```rust
let session_id = engine.create_session("user123", "sales_db").await?;

// Turn 1: Initial query
let response1 = engine.process_query(
    session_id,
    "Show me sales for 2024"
).await?;
// Generated: SELECT * FROM sales WHERE year = 2024

// Turn 2: Filter using context
let response2 = engine.process_query(
    session_id,
    "Only for Q4"
).await?;
// Generated: SELECT * FROM sales WHERE year = 2024 AND quarter = 4

// Turn 3: Add grouping
let response3 = engine.process_query(
    session_id,
    "Group by region"
).await?;
// Generated: SELECT region, SUM(amount) FROM sales
//            WHERE year = 2024 AND quarter = 4
//            GROUP BY region

// Turn 4: Sort results
let response4 = engine.process_query(
    session_id,
    "Sort by total descending"
).await?;
// Generated: SELECT region, SUM(amount) AS total FROM sales
//            WHERE year = 2024 AND quarter = 4
//            GROUP BY region
//            ORDER BY total DESC
```

### Context Resolution
The engine automatically:
- Resolves pronoun references (“it”, “that”, “those”)
- Maintains filter context (“only for Q4”, “in that region”)
- Tracks table and column mentions
- Understands aggregate intent
- Preserves sorting and grouping preferences
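To illustrate how layered filter context behaves across turns, here is a minimal sketch; `TurnContext` and its fields are hypothetical and not part of the engine's API:

```rust
/// Illustrative sketch of filter-context accumulation across turns.
#[derive(Default)]
struct TurnContext {
    filters: Vec<String>,
    group_by: Option<String>,
}

impl TurnContext {
    /// Each follow-up turn layers new clauses onto the carried context.
    fn add_filter(&mut self, clause: &str) {
        self.filters.push(clause.to_string());
    }

    fn to_where_clause(&self) -> String {
        self.filters.join(" AND ")
    }
}

fn main() {
    let mut ctx = TurnContext::default();
    ctx.add_filter("year = 2024");        // Turn 1: "Show me sales for 2024"
    ctx.add_filter("quarter = 4");        // Turn 2: "Only for Q4"
    ctx.group_by = Some("region".into()); // Turn 3: "Group by region"

    println!("WHERE {}", ctx.to_where_clause());
    if let Some(g) = &ctx.group_by {
        println!("GROUP BY {}", g);
    }
}
```

In the real engine this accumulation is mediated by the LLM rather than string concatenation, but the carried state (filters, grouping, ordering) behaves analogously.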
## LLM Integration

### Model Selection
The engine uses a primary model with an optional fallback model:
```rust
let config = ConversationalConfig {
    primary_model: ModelConfig {
        provider: ModelProvider::OpenAI,
        model_name: "gpt-4o".to_string(),
        api_key: None,
    },
    fallback_model: Some(ModelConfig {
        provider: ModelProvider::Anthropic,
        model_name: "claude-3-5-sonnet-20241022".to_string(),
        api_key: None,
    }),
    ..Default::default()
};
```

**Fallback Behavior:**
1. Try the primary model
2. If the primary fails, try the fallback model
3. Return an error if all models fail
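The fallback order can be sketched as a simple loop over candidate models; `call_with_fallback` is a hypothetical helper, not an engine API:

```rust
/// Sketch of primary-then-fallback model selection. The `call` closure
/// stands in for an actual LLM API request.
fn call_with_fallback(
    models: &[&str],
    call: impl Fn(&str) -> Result<String, String>,
) -> Result<String, String> {
    let mut last_err = String::from("no models configured");
    for &model in models {
        match call(model) {
            Ok(resp) => return Ok(resp), // first success wins
            Err(e) => last_err = e,      // remember the failure, try the next model
        }
    }
    Err(last_err) // all models failed
}

fn main() {
    // Simulate a primary that times out and a fallback that succeeds.
    let result = call_with_fallback(&["gpt-4o", "claude-3-5-sonnet-20241022"], |m| {
        if m == "gpt-4o" {
            Err("timeout".into())
        } else {
            Ok(format!("ok from {m}"))
        }
    });
    println!("{:?}", result);
}
```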
### Retry Logic
All LLM API calls include automatic retries with exponential backoff:
- Max retries: 3 attempts
- Initial delay: 500ms
- Backoff: 2x (500ms → 1s → 2s)
- Applies to: Network errors, rate limits, transient failures
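The documented schedule (500ms initial delay, 2x backoff) reduces to a one-line formula; this standalone sketch is illustrative only:

```rust
/// Delay before retry attempt `attempt` (0-based), following the
/// documented schedule: 500ms initial delay, doubling each retry
/// (500ms -> 1s -> 2s).
fn backoff_delay_ms(attempt: u32) -> u64 {
    500 * 2u64.pow(attempt)
}

fn main() {
    // The engine retries up to 3 times.
    for attempt in 0..3 {
        println!("retry {} after {} ms", attempt + 1, backoff_delay_ms(attempt));
    }
}
```

Production retry implementations often add random jitter to each delay to avoid synchronized retry storms; the schedule above is the documented baseline.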
### Error Handling
```rust
match engine.process_query(session_id, "query").await {
    Ok(response) => {
        // Success
    },
    Err(e) => {
        eprintln!("Error: {}", e);
        // Error types:
        // - ConversationalError::Model - LLM API error
        // - ConversationalError::RateLimitExceeded - Rate limit hit
        // - ConversationalError::Session - Session not found
        // - ConversationalError::InvalidInput - Invalid query
    }
}
```

## Production Features
### Rate Limiting
Protects against abuse with a token bucket algorithm:
```rust
let config = ConversationalConfig {
    enable_rate_limiting: true,
    // Default: 60 queries per minute per tenant
    ..Default::default()
};

// Check rate limit status
let tokens_remaining = engine.get_rate_limit_tokens("user123");
println!("Tokens remaining: {}", tokens_remaining);
```

Configuration (via `ProductionConfig`):

- `rate_limit_qpm`: Queries per minute (default: 60)
- `burst_allowance`: Burst capacity (default: 10)
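A minimal sketch of the token bucket idea: the bucket holds up to `burst_allowance` tokens and refills at the queries-per-minute rate. `TokenBucket` here is illustrative, not the engine's implementation, and the demo uses a burst of 2 for brevity:

```rust
/// Sketch of a token bucket rate limiter.
struct TokenBucket {
    tokens: f64,
    capacity: f64,
    refill_per_sec: f64,
}

impl TokenBucket {
    fn new(qpm: f64, burst: f64) -> Self {
        TokenBucket { tokens: burst, capacity: burst, refill_per_sec: qpm / 60.0 }
    }

    /// Refill for `elapsed_secs` since the last call, then try to spend one token.
    fn try_acquire(&mut self, elapsed_secs: f64) -> bool {
        self.tokens = (self.tokens + elapsed_secs * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    // 60 qpm with a burst of 2: two immediate requests pass, a third is limited,
    // and one second of refill (at 1 token/sec) admits another.
    let mut bucket = TokenBucket::new(60.0, 2.0);
    println!("{}", bucket.try_acquire(0.0)); // true
    println!("{}", bucket.try_acquire(0.0)); // true
    println!("{}", bucket.try_acquire(0.0)); // false
    println!("{}", bucket.try_acquire(1.0)); // true
}
```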
### Security Validation
Prevents malicious queries:
```rust
let config = ConversationalConfig {
    enable_security_validation: true,
    ..Default::default()
};
```

**Security Features:**
- SQL injection prevention
- Dangerous operation blocking (DROP, TRUNCATE)
- Query length limits (max 10KB)
- Context size limits (max 1MB)
- Input sanitization
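As a sketch of the keyword-blocking and length-limit layers, here is a naive word-level check; `validate_query` is a hypothetical helper, and the real validator is more thorough (parsing, sanitization, context limits):

```rust
/// Sketch of one security layer: blocking dangerous statements and
/// oversized input, per the documented limits.
fn validate_query(query: &str) -> Result<(), String> {
    const MAX_LEN: usize = 10 * 1024; // 10KB query length limit
    if query.len() > MAX_LEN {
        return Err("query exceeds 10KB limit".into());
    }
    let upper = query.to_uppercase();
    for op in ["DROP", "TRUNCATE"] {
        // Naive word check; a real validator inspects the parsed statement
        // rather than matching substrings.
        if upper.split(|c: char| !c.is_alphanumeric()).any(|w| w == op) {
            return Err(format!("dangerous operation blocked: {op}"));
        }
    }
    Ok(())
}

fn main() {
    println!("{:?}", validate_query("SELECT * FROM sales"));
    println!("{:?}", validate_query("DROP TABLE sales"));
}
```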
### Circuit Breaker
Prevents cascade failures when LLM API is down:
```rust
// The circuit breaker activates automatically on repeated failures:
// - Threshold: 5 failures
// - Reset timeout: 60 seconds
```

**States:**
- Closed: Normal operation
- Open: Failing fast after threshold exceeded
- Half-Open: Testing recovery after timeout
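The three states and the documented 5-failure threshold can be sketched as a small state machine; the types are illustrative and the 60-second timer is elided (a real breaker tracks elapsed time itself):

```rust
/// Sketch of the three circuit-breaker states.
#[derive(Debug, PartialEq, Clone, Copy)]
enum CircuitState { Closed, Open, HalfOpen }

struct CircuitBreaker {
    state: CircuitState,
    failures: u32,
    threshold: u32,
}

impl CircuitBreaker {
    fn new(threshold: u32) -> Self {
        CircuitBreaker { state: CircuitState::Closed, failures: 0, threshold }
    }

    fn record_failure(&mut self) {
        self.failures += 1;
        if self.failures >= self.threshold {
            self.state = CircuitState::Open; // fail fast from here on
        }
    }

    fn record_success(&mut self) {
        self.failures = 0;
        self.state = CircuitState::Closed;
    }

    /// Called when the reset timeout (60s in the engine) elapses while Open.
    fn on_reset_timeout(&mut self) {
        if self.state == CircuitState::Open {
            self.state = CircuitState::HalfOpen; // allow a probe request
        }
    }
}

fn main() {
    let mut cb = CircuitBreaker::new(5);
    for _ in 0..5 { cb.record_failure(); }
    println!("{:?}", cb.state); // Open
    cb.on_reset_timeout();
    println!("{:?}", cb.state); // HalfOpen
    cb.record_success();        // probe succeeded
    println!("{:?}", cb.state); // Closed
}
```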
### Performance Monitoring
Track latency and performance metrics:
```rust
let config = ConversationalConfig {
    enable_performance_monitoring: true,
    ..Default::default()
};

// Get metrics for a tenant
if let Some(metrics) = engine.get_performance_metrics("user123").await {
    println!("P50 latency: {}ms", metrics.p50);
    println!("P95 latency: {}ms", metrics.p95);
    println!("P99 latency: {}ms", metrics.p99);
}
```

## Performance Optimization
### Semantic Caching
Caches similar queries for faster responses:
```rust
let config = ConversationalConfig {
    enable_query_cache: true,
    ..Default::default()
};
```

**Cache Behavior:**
- Similarity threshold: 0.85 (85% similar)
- Max cache size: 10,000 entries
- LRU eviction
- Schema-aware (invalidates on schema changes)
**Performance Impact:**
- Cache hit: <100ms (vs. 300-500ms for LLM call)
- 60-80% cache hit rate for typical workloads
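The 0.85 similarity threshold can be illustrated with a cosine-similarity check over query embeddings. The toy vectors and helper functions below are assumptions for illustration; the real cache embeds queries with a model:

```rust
/// Cosine similarity between two embedding vectors.
fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f64]| v.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (norm(a) * norm(b))
}

/// Two queries hit the same cache entry when their embeddings are
/// at least 85% similar, per the documented threshold.
fn is_cache_hit(query_vec: &[f64], cached_vec: &[f64]) -> bool {
    cosine_similarity(query_vec, cached_vec) >= 0.85
}

fn main() {
    let cached = [0.9, 0.1, 0.3];
    let similar = [0.88, 0.15, 0.28]; // near-duplicate question
    let different = [0.1, 0.9, 0.2];  // unrelated question
    println!("similar hit: {}", is_cache_hit(&similar, &cached));
    println!("different hit: {}", is_cache_hit(&different, &cached));
}
```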
### Self-Correction
Automatically validates and corrects generated SQL:
```rust
let config = ConversationalConfig {
    enable_self_correction: true,
    ..Default::default()
};
```

**Validation Steps:**
- SQL syntax validation
- Schema compatibility check
- Semantic correctness
- Automatic correction on errors
### Latency Targets
| Operation | Target | Typical |
|---|---|---|
| Session creation | <10ms | ~5ms |
| Cache hit | <100ms | ~80ms |
| SQL generation (no cache) | <300ms | ~250ms |
| Full query execution | <500ms | ~350ms |
## Troubleshooting

### Common Issues

#### 1. “No suitable model” Error
Cause: LLM API key not configured
Solution:
```bash
export OPENAI_API_KEY="your-key"
# or
export ANTHROPIC_API_KEY="your-key"
```

#### 2. Rate Limit Exceeded
Cause: Too many requests from tenant
Solution:
```rust
// Check rate limit
let tokens = engine.get_rate_limit_tokens("user123");
if tokens < 1.0 {
    // Wait or show error to user
    println!("Rate limit exceeded. Please try again later.");
}
```

#### 3. Session Not Found
Cause: Session expired or deleted
Solution:
```rust
// Create new session
let session_id = engine.create_session("user123", "db").await?;
```

#### 4. Low Confidence Score
Cause: Ambiguous or complex query
Solution:
```rust
if response.confidence < 0.7 {
    // Ask for clarification
    if !response.clarifications.is_empty() {
        println!("Please clarify:");
        for clarification in response.clarifications {
            println!("- {}", clarification);
        }
    }
}
```

### Debug Mode
Enable detailed logging:
```bash
export RUST_LOG=heliosdb_conversational_bi=debug
```

## API Reference
### ConversationalBiEngine

Main engine interface.

#### Methods
```rust
// Create engine
pub async fn new(config: ConversationalConfig) -> Result<Self>

// Session management
pub async fn create_session(&self, user_id: &str, database: &str) -> Result<SessionId>
pub async fn delete_session(&self, session_id: SessionId) -> Result<()>
pub async fn reset_session(&self, session_id: SessionId) -> Result<()>
pub async fn get_history(&self, session_id: SessionId) -> Result<Vec<ConversationTurn>>

// Query processing
pub async fn process_query(
    &self,
    session_id: SessionId,
    query: &str
) -> Result<ConversationResponse>

// Metrics
pub async fn get_performance_metrics(&self, tenant_id: &str) -> Option<LatencyPercentiles>
pub fn get_rate_limit_tokens(&self, tenant_id: &str) -> f64
```

### ConversationResponse
Response from query processing.
```rust
pub struct ConversationResponse {
    pub turn_id: u32,                 // Turn number
    pub sql: Option<String>,          // Generated SQL
    pub explanation: String,          // Natural language explanation
    pub results: Option<QueryResult>, // Query results
    pub clarifications: Vec<String>,  // Clarifications needed
    pub suggestions: Vec<String>,     // Optimization suggestions
    pub confidence: f64,              // Confidence score (0.0-1.0)
    pub latency_ms: f64,              // Response latency
    pub from_cache: bool,             // Whether from cache
}
```

### QueryResult
SQL query execution results.
```rust
pub struct QueryResult {
    pub columns: Vec<String>,              // Column names
    pub rows: Vec<Vec<serde_json::Value>>, // Row data
    pub row_count: usize,                  // Number of rows
    pub execution_time_ms: u64,            // Execution time
    pub success: bool,                     // Success flag
    pub error: Option<String>,             // Error message
}
```

## Examples
### Example 1: Sales Analytics
```rust
let session = engine.create_session("analyst", "sales_db").await?;

// Q1: Top products
let r1 = engine.process_query(session,
    "What are the top 5 products by revenue in 2024?").await?;

// Q2: Regional breakdown
let r2 = engine.process_query(session,
    "Show me regional breakdown for the top product").await?;

// Q3: Trend analysis
let r3 = engine.process_query(session,
    "Compare this to last year").await?;
```

### Example 2: Customer Analysis
```rust
let session = engine.create_session("manager", "crm_db").await?;

// Find high-value customers
let r1 = engine.process_query(session,
    "Show me customers with lifetime value over $10,000").await?;

// Filter by activity
let r2 = engine.process_query(session,
    "Only those who purchased in the last 30 days").await?;

// Export for marketing
let r3 = engine.process_query(session,
    "Give me their email addresses and purchase history").await?;
```

### Example 3: Error Handling
```rust
let session = engine.create_session("user", "db").await?;

let result = engine.process_query(session, "complex query").await;

match result {
    Ok(response) => {
        if response.confidence < 0.7 {
            println!("Warning: Low confidence result");
            for clarification in response.clarifications {
                println!("Clarification needed: {}", clarification);
            }
        }

        if let Some(sql) = response.sql {
            println!("SQL: {}", sql);
        }
    },
    Err(e) => {
        eprintln!("Error: {}", e);
        // Handle specific errors
    }
}
```

## Best Practices
### 1. Session Management
- Create sessions per conversation: Don’t reuse sessions across unrelated queries
- Clean up sessions: Delete sessions when done to free resources
- Set appropriate timeouts: Adjust based on use case
### 2. Query Formulation
- Be specific: “Show me Q4 2024 sales in California” vs. “Show me sales”
- Use context: Follow-up questions work best when building on previous queries
- Avoid ambiguity: Specify table names if multiple tables have similar data
### 3. Performance
- Enable caching: Significantly reduces latency for repeated queries
- Use appropriate models: gpt-4o-mini or claude-haiku for simple queries
- Monitor metrics: Track P95/P99 latency to detect issues
### 4. Security
- Enable validation: Always use security validation in production
- Rate limiting: Protect against abuse
- Input sanitization: Never bypass security features
## Conclusion
The Conversational BI Engine provides a production-ready natural language interface to your databases. With high accuracy, multi-turn context, and comprehensive production features, it enables business users to query data without SQL knowledge.
For more information:
Version: 7.0.0 | License: MIT | Maintainer: HeliosDB Team