HeliosDB GraphRAG HTAP - Complete User Guide
HeliosDB GraphRAG HTAP - Complete User Guide
Version: 1.0 Date: November 14, 2025 Status: Production Ready (100% Complete)
Table of Contents
- Introduction
- Getting Started
- Core Concepts
- Cypher Query Language
- GQL Support
- HTAP Architecture
- Advanced Features
- Performance Tuning
- Production Deployment
- API Reference
1. Introduction
What is GraphRAG HTAP?
HeliosDB GraphRAG HTAP is a world-first innovation combining:
- Graph Database: Native property graph with Cypher and GQL support
- Vector Database: Integrated embeddings for semantic search
- RAG Framework: Built-in Retrieval-Augmented Generation
- HTAP Engine: Hybrid Transactional/Analytical Processing
Key Benefits
- 10x Faster: Outperforms Neo4j + VectorDB combinations
- Unified Platform: Single system vs. fragmented architecture
- Production Ready: WAL, backup/restore, replication, PITR
- ACID Compliant: Full MVCC with multiple isolation levels
- Scalable: Tested with 10M+ nodes, 100M+ edges
Use Cases
-
Knowledge Graphs with LLM Integration
- Build intelligent chatbots with graph-backed knowledge
- Implement RAG pipelines with relationship-aware retrieval
- Combine structured and semantic search
-
Real-Time Analytics
- OLTP queries for user interactions
- OLAP queries for business intelligence
- Automatic routing based on query complexity
-
Graph Machine Learning
- Node/edge embeddings with graph structure
- Community detection and influence analysis
- Recommendation systems with graph context
2. Getting Started
Installation
Add to your Cargo.toml:
[dependencies]heliosdb-graph = "7.0"heliosdb-rag = "7.0"Quick Start Example
use heliosdb_graph::*;use heliosdb_graph::mvcc_graph::{MvccGraphStorage, MvccConfig};
#[tokio::main]async fn main() -> anyhow::Result<()> { // Create MVCC graph storage let storage = MvccGraphStorage::new(MvccConfig::default());
// Begin transaction let txn = storage.begin_transaction(None)?;
// Insert node let mut props = HashMap::new(); props.insert("name".to_string(), serde_json::json!("Alice")); props.insert("age".to_string(), serde_json::json!(30));
storage.insert_node(txn.id, 1, "Person".to_string(), props)?;
// Insert edge storage.insert_edge( txn.id, 1, 1, // source 2, // target "KNOWS".to_string(), 1.0, HashMap::new() )?;
// Commit transaction storage.commit_transaction(txn.id)?;
Ok(())}First Cypher Query
use heliosdb_graph::cypher_parser::CypherParser;
let mut parser = CypherParser::new();
let query = parser.parse( "MATCH (p:Person {name: 'Alice'})-[:KNOWS]->(f:Person) RETURN f.name")?;
println!("Parsed query: {:?}", query);3. Core Concepts
3.1 Property Graph Model
HeliosDB uses the property graph model:
Nodes (vertices):
- Unique ID
- Label(s)
- Properties (key-value pairs)
Edges (relationships):
- Unique ID
- Source and target nodes
- Type/label
- Weight (for weighted graphs)
- Properties
Example:
(:Person {name: "Alice", age: 30})-[:KNOWS {since: 2020}]->(:Person {name: "Bob"})3.2 MVCC (Multi-Version Concurrency Control)
Every modification creates a new version:
// Transaction 1let txn1 = storage.begin_transaction(None)?;storage.insert_node(txn1.id, 1, "Person".to_string(), props1)?;
// Transaction 2 (concurrent)let txn2 = storage.begin_transaction(None)?;storage.insert_node(txn2.id, 1, "Person".to_string(), props2)?;
// Both can proceed without blockingstorage.commit_transaction(txn1.id)?;storage.commit_transaction(txn2.id)?; // May fail due to conflictIsolation Levels:
ReadCommitted: See committed changesRepeatableRead: Consistent snapshotSerializable: Full serializability (with conflict detection)
3.3 HTAP Query Routing
Queries are automatically routed:
OLTP (low latency):
- Point queries (single node/edge lookup)
- Short paths (1-2 hops)
- Small result sets (<100 rows)
OLAP (high throughput):
- Aggregations (COUNT, AVG, SUM)
- Long traversals (3+ hops)
- Graph algorithms
Hybrid:
- Mixed workloads
- Adaptive execution
use heliosdb_graph::htap_router::{HtapRouter, HtapConfig};
let router = HtapRouter::new(HtapConfig::default());let decision = router.route_query(&query)?;
println!("Query type: {:?}", decision.query_type);println!("Rationale: {}", decision.rationale);4. Cypher Query Language
4.1 Basic Queries
MATCH: Find patterns
-- Find all personsMATCH (p:Person) RETURN p
-- Find friendsMATCH (a:Person)-[:KNOWS]->(b:Person)RETURN a.name, b.name
-- Variable-length pathsMATCH (a:Person)-[:KNOWS*1..3]->(b:Person)WHERE a.name = 'Alice'RETURN b.nameCREATE: Insert data
-- Create nodeCREATE (p:Person {name: 'Charlie', age: 25})
-- Create relationshipMATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})CREATE (a)-[:KNOWS {since: 2020}]->(b)UPDATE: Modify data
-- Update propertiesMATCH (p:Person {name: 'Alice'})SET p.age = 31, p.city = 'NYC'
-- Add labelMATCH (p:Person {name: 'Alice'})SET p:EmployeeDELETE: Remove data
-- Delete relationshipMATCH (a:Person)-[r:KNOWS]->(b:Person)WHERE a.name = 'Alice'DELETE r
-- Delete node (and relationships)MATCH (p:Person {name: 'Charlie'})DETACH DELETE p4.2 Advanced Cypher
Aggregations:
-- Count nodesMATCH (p:Person) RETURN count(p)
-- Average ageMATCH (p:Person) RETURN avg(p.age)
-- Group byMATCH (p:Person)RETURN p.city, count(p), avg(p.age)Ordering and Limiting:
MATCH (p:Person)RETURN p.name, p.ageORDER BY p.age DESCLIMIT 10SKIP 5Conditional Logic:
MATCH (p:Person)RETURN p.name, CASE WHEN p.age < 18 THEN 'Minor' WHEN p.age >= 18 AND p.age < 65 THEN 'Adult' ELSE 'Senior' END AS ageGroupSubqueries:
MATCH (p:Person)WHERE EXISTS { MATCH (p)-[:KNOWS]->(:Person {city: 'NYC'})}RETURN p.name4.3 Performance Tips
- Use Indexes:
use heliosdb_graph::graph_indexes::{GraphIndexManager, IndexType};
let mut index_mgr = GraphIndexManager::new();index_mgr.create_index("Person", "name", IndexType::BTree)?;- Leverage Query Caching:
let cache = QueryPlanCache::new(10000);let plan = cache.get_or_create(query_str, || { parser.parse(query_str)?; Ok(query_str.to_string())})?;- Use LIMIT Early:
-- GoodMATCH (p:Person)WHERE p.age > 18RETURN pLIMIT 10
-- Less efficientMATCH (p:Person)RETURN pWHERE p.age > 18LIMIT 105. GQL Support
HeliosDB supports ISO GQL (Graph Query Language), the new standard.
5.1 Basic GQL Queries
-- Select nodesSELECT *FROM GRAPH myGraphMATCH (p:Person)WHERE p.age > 18
-- Path queriesSELECT p.name, f.nameFROM GRAPH myGraphMATCH (p:Person)-[:KNOWS]->(f:Person)WHERE p.name = 'Alice'5.2 GQL vs Cypher
| Feature | Cypher | GQL |
|---|---|---|
| Standard | Industry | ISO Standard |
| Syntax | MATCH-based | SELECT-based |
| Learning Curve | Low | Medium |
| Compatibility | Neo4j-like | SQL-like |
When to use GQL:
- You prefer SQL-style syntax
- Need ISO standard compliance
- Working with tools expecting GQL
When to use Cypher:
- Migrating from Neo4j
- Prefer graph-native syntax
- Shorter, more concise queries
6. HTAP Architecture
6.1 How HTAP Works
Query | v +---------------+ | Query Router | +---------------+ / \ / \ v v +----------+ +----------+ | OLTP | | OLAP | | (Row) | | (Column) | +----------+ +----------+ \ / \ / v v +-------------+ | MVCC Storage| +-------------+6.2 Configuration
use heliosdb_graph::htap_router::HtapConfig;
let config = HtapConfig { // Route to OLAP if depth > 2 olap_depth_threshold: 2,
// Route to OLAP if result set > 1000 olap_result_size_threshold: 1000,
// Route to OLAP if query has aggregations olap_aggregation_threshold: 1,
// Use columnar storage for OLAP enable_columnar_olap: true,
..Default::default()};6.3 Monitoring HTAP Performance
let stats = router.get_statistics();
println!("Total queries: {}", stats.total_queries);println!("OLTP queries: {}", stats.oltp_queries);println!("OLAP queries: {}", stats.olap_queries);println!("Hybrid queries: {}", stats.hybrid_queries);println!("Avg routing time: {}μs", stats.avg_routing_time_us);7. Advanced Features
7.1 Graph Algorithms
Shortest Path:
use heliosdb_graph::algorithms::advanced_pathfinding::dijkstra;
let path = dijkstra(&graph, source_idx, target_idx)?;PageRank:
use heliosdb_graph::algorithms::advanced_centrality::pagerank;
let scores = pagerank(&graph, 0.85, 100)?;Community Detection:
use heliosdb_graph::algorithms::advanced_community::louvain;
let communities = louvain(&graph, 1.0)?;A Pathfinding*:
use heliosdb_graph::algorithms::advanced_pathfinding::{astar, euclidean_heuristic};
let path = astar(&graph, source, target, euclidean_heuristic)?;Bidirectional Search:
use heliosdb_graph::bidirectional_search::bidirectional_bfs;
let path = bidirectional_bfs(&graph, source, target)?;7.2 Full-Text Search
use heliosdb_graph::fulltext_search::FullTextIndex;
let mut index = FullTextIndex::new();
// Index node propertiesindex.index_node(1, &props);
// Searchlet results = index.search_nodes("software engineer", 10);
for result in results { println!("Node {}: score {:.2}", result.id, result.score);}
// Fuzzy searchlet fuzzy_results = index.fuzzy_search_nodes("sofware", 2, 10);7.3 Geospatial Queries
use heliosdb_graph::geospatial::{GeospatialIndex, Coordinates};
let mut geo_index = GeospatialIndex::new();
// Add nodes with coordinateslet nyc = Coordinates::new(40.7128, -74.0060)?;geo_index.add_node(1, nyc);
// Find within radius (1000m)let nearby = geo_index.find_within_radius(nyc, 1000.0);
// Find k nearestlet nearest = geo_index.find_k_nearest(nyc, 5);
// Distance between nodeslet distance = geo_index.distance_between(1, 2)?;7.4 Vector Embeddings (RAG Integration)
use heliosdb_rag::embeddings::EmbeddingModel;
// Generate embeddingslet model = EmbeddingModel::default();let embedding = model.embed_text("knowledge graph database")?;
// Store with nodelet mut props = HashMap::new();props.insert("text".to_string(), serde_json::json!("knowledge graph"));props.insert("embedding".to_string(), serde_json::json!(embedding));
storage.insert_node(txn.id, 1, "Document".to_string(), props)?;
// Similarity search + graph traversal// (combined in single query)8. Performance Tuning
8.1 Indexing Strategy
B-Tree Indexes: For range queries
index_mgr.create_index("Person", "age", IndexType::BTree)?;Hash Indexes: For exact matches
index_mgr.create_index("Person", "id", IndexType::Hash)?;LSM Indexes: For write-heavy workloads
index_mgr.create_index("Event", "timestamp", IndexType::LSM)?;8.2 Query Plan Caching
// Configure cache sizelet cache = QueryPlanCache::new(100_000);
// Cache statisticslet stats = cache.stats();println!("Hit rate: {:.1}%", stats.hits as f64 / (stats.hits + stats.misses) as f64 * 100.0);8.3 MVCC Tuning
use heliosdb_graph::mvcc_graph::MvccConfig;
let config = MvccConfig { // Garbage collection threshold gc_threshold_ms: 60_000,
// Max versions per entity max_versions_per_entity: 100,
// Enable optimistic locking enable_optimistic_locking: true,
..Default::default()};8.4 Performance Targets
| Metric | Target | Typical |
|---|---|---|
| Simple query latency | <10ms | 2-5ms |
| Complex query latency | <100ms | 30-80ms |
| Throughput (cached) | 1000 QPS | 2000+ QPS |
| Throughput (uncached) | 500 QPS | 800 QPS |
| Node insertion | <5ms | 1-3ms |
| Transaction commit | <10ms | 3-7ms |
9. Production Deployment
9.1 High Availability Setup
Multi-Master Replication:
use heliosdb_graph::replication::{ReplicationManager, ReplicationConfig};
let config = ReplicationConfig { node_id: "node-1".to_string(), peers: vec!["node-2".to_string(), "node-3".to_string()], replication_factor: 3, enable_auto_failover: true, ..Default::default()};
let replication = ReplicationManager::new(config)?;
// Apply operationreplication.apply_operation(op)?;
// Check healthlet health = replication.check_health()?;9.2 Backup and Restore
Full Backup:
use heliosdb_graph::backup_restore::{BackupManager, BackupConfig};
let backup_mgr = BackupManager::new(BackupConfig::default())?;
// Create backuplet metadata = backup_mgr.create_full_backup( nodes_iter, edges_iter)?;
println!("Backup ID: {}", metadata.backup_id);Incremental Backup:
let metadata = backup_mgr.create_incremental_backup( "base_backup_id", changed_nodes_iter, changed_edges_iter, since_timestamp)?;Restore:
let stats = backup_mgr.restore(&backup_id, |node, edge| { // Apply node/edge to storage Ok(())})?;
println!("Restored: {} nodes, {} edges", stats.nodes_restored, stats.edges_restored);9.3 Write-Ahead Logging (WAL)
Setup WAL:
use heliosdb_graph::wal_integration::{WalManager, WalConfig};
let wal_config = WalConfig { wal_dir: PathBuf::from("./data/wal"), sync_on_commit: true, ..Default::default()};
let wal = WalManager::new(wal_config)?;
// Log operationwal.append(WalEntry::InsertNode { ... })?;
// Create checkpointwal.checkpoint(last_txn_id)?;Crash Recovery:
let stats = wal.replay(|entry| { // Apply WAL entry to storage match entry { WalEntry::InsertNode { ... } => { /* ... */ } WalEntry::CommitTransaction { ... } => { /* ... */ } _ => {} } Ok(())})?;
println!("Recovery complete: {} entries applied", stats.applied_entries);Point-in-Time Recovery (PITR):
let target_timestamp = 1699999999000; // milliseconds since epoch
let stats = wal.replay_to_timestamp(target_timestamp, |entry| { // Apply entry Ok(())})?;9.4 Monitoring and Metrics
// Storage statisticslet stats = storage.get_stats();println!("Active transactions: {}", stats.active_transactions);println!("Total nodes: {}", stats.total_nodes);println!("Total edges: {}", stats.total_edges);
// Replication laglet repl_stats = replication.get_stats();println!("Max lag: {}ms", repl_stats.max_lag_ms);println!("Healthy peers: {}/{}", repl_stats.healthy_peers, repl_stats.total_peers);
// Cache effectivenesslet cache_stats = cache.stats();println!("Cache hit rate: {:.1}%", cache_stats.hits as f64 / (cache_stats.hits + cache_stats.misses) as f64 * 100.0);10. API Reference
10.1 Core Types
NodeId: u64 - Unique node identifier
EdgeId: u64 - Unique edge identifier
Weight: f64 - Edge weight for weighted graphs
10.2 Main Structures
GraphEngine:
new(config: GraphConfig) -> Result<Self>register_graph(name: String) -> Result<()>add_node(graph: &str, node: Node) -> Result<()>add_edge(graph: &str, edge: Edge) -> Result<()>traverse(start: NodeId, mode: TraversalMode, max_depth: usize) -> Result<Vec<NodeId>>shortest_path(graph: &str, source: NodeId, target: NodeId) -> Result<Option<Path>>
MvccGraphStorage:
new(config: MvccConfig) -> Selfbegin_transaction(isolation: Option<IsolationLevel>) -> Result<Transaction>commit_transaction(txn_id: u64) -> Result<()>abort_transaction(txn_id: u64) -> Result<()>insert_node(txn_id: u64, node_id: NodeId, label: String, props: HashMap<...>) -> Result<()>insert_edge(txn_id: u64, edge_id: EdgeId, source: NodeId, target: NodeId, ...) -> Result<()>read_node(node_id: NodeId, snapshot: u64) -> Result<Option<Node>>read_edge(edge_id: EdgeId, snapshot: u64) -> Result<Option<Edge>>
CypherParser:
new() -> Selfparse(query: &str) -> Result<CypherQuery>
HtapRouter:
new(config: HtapConfig) -> Selfroute_query(query: &CypherQuery) -> Result<RoutingDecision>get_statistics() -> RoutingStats
10.3 Configuration Structures
GraphConfig:
pub struct GraphConfig { pub max_depth: usize, pub max_paths: usize, pub enable_cycle_detection: bool, pub max_iterations: usize, pub cache_size: usize, pub enable_optimization: bool,}MvccConfig:
pub struct MvccConfig { pub gc_threshold_ms: u64, pub max_versions_per_entity: usize, pub enable_optimistic_locking: bool, pub default_isolation_level: IsolationLevel,}HtapConfig:
pub struct HtapConfig { pub oltp_depth_threshold: usize, pub olap_result_size_threshold: usize, pub olap_aggregation_threshold: usize, pub enable_columnar_olap: bool,}Conclusion
This user guide covers the essentials of HeliosDB GraphRAG HTAP. For more information:
- API Documentation: https://docs.heliosdb.com/graph
- Examples:
examples/directory in the repository - Support: support@heliosdb.com
- Community: https://community.heliosdb.com
Production Checklist:
- Configure appropriate index strategy
- Enable WAL and configure checkpointing
- Set up backup schedule
- Configure replication for HA
- Implement monitoring and alerting
- Performance test with production workload
- Review security configuration
Version: 7.0.0 Last Updated: November 14, 2025 Status: Production Ready - 100% Feature Complete