GraphRAG Quick Start
Overview
GraphRAG combines graph databases with Retrieval-Augmented Generation (RAG) so that knowledge graphs can be queried using natural language and LLM reasoning.
Key Concepts
What is GraphRAG?
- Graph reasoning: Leverage relationships and connections in data
- LLM integration: Use language models for semantic understanding
- Hybrid querying: Combine structured graph queries with semantic search
- Knowledge graphs: Build and query interconnected information
When to Use GraphRAG
- Complex relationship discovery
- Knowledge base systems
- Semantic search with relationships
- Entity resolution and linking
- Graph-based recommendation systems
Quick Start
1. Enable GraphRAG
-- Create graph database
CREATE DATABASE knowledge_graph;

-- Enable GraphRAG features
SET graphrag_enabled = true;
SET graphrag_embedding_model = 'openai';

2. Create Graph Structures
-- Create nodes (entities)
CREATE TABLE entities (
  id SERIAL PRIMARY KEY,
  name VARCHAR(256),
  type VARCHAR(100),        -- Person, Place, Organization, etc.
  description TEXT,
  embedding vector(1536)    -- Vector embedding for semantic search
);

-- Create relationships (edges)
CREATE TABLE relationships (
  id SERIAL PRIMARY KEY,
  source_id INT REFERENCES entities(id),
  target_id INT REFERENCES entities(id),
  relationship_type VARCHAR(100),
  properties JSONB
);

3. Build the Knowledge Graph
-- Insert entities
INSERT INTO entities (name, type, description)
VALUES
  ('Alice', 'Person', 'Software Engineer'),
  ('Bob', 'Person', 'Product Manager'),
  ('Company X', 'Organization', 'Tech startup');

-- Insert relationships
INSERT INTO relationships (source_id, target_id, relationship_type, properties)
VALUES
  (1, 3, 'works_at', '{"since": "2022-01-01"}'),
  (2, 3, 'manages', '{"teams": ["Engineering"]}'),
  (1, 2, 'reports_to', '{"direct_report": true}');

4. Query with Natural Language
-- Cypher-like graph queries
MATCH (p:Person)-[r:works_at]->(o:Organization)
RETURN p.name, r.relationship_type, o.name;
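The pattern above can be read as a two-way join over the tables from step 2: one relationships row of type works_at connects a Person row to an Organization row. The sketch below shows that reading in plain SQL, run through Python's built-in sqlite3 module purely as a stand-in engine. This is an assumption for illustration; how GraphRAG actually plans MATCH queries is not described here. Column types are simplified for SQLite, and the embedding and properties columns are omitted.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in engine for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (
    id INTEGER PRIMARY KEY,
    name TEXT,
    type TEXT,
    description TEXT
);
CREATE TABLE relationships (
    id INTEGER PRIMARY KEY,
    source_id INTEGER REFERENCES entities(id),
    target_id INTEGER REFERENCES entities(id),
    relationship_type TEXT
);
INSERT INTO entities (name, type, description) VALUES
    ('Alice', 'Person', 'Software Engineer'),
    ('Bob', 'Person', 'Product Manager'),
    ('Company X', 'Organization', 'Tech startup');
INSERT INTO relationships (source_id, target_id, relationship_type) VALUES
    (1, 3, 'works_at'),
    (2, 3, 'manages'),
    (1, 2, 'reports_to');
""")

# (p:Person)-[r:works_at]->(o:Organization) expressed as two joins.
rows = conn.execute("""
    SELECT p.name, r.relationship_type, o.name
    FROM relationships r
    JOIN entities p ON p.id = r.source_id AND p.type = 'Person'
    JOIN entities o ON o.id = r.target_id AND o.type = 'Organization'
    WHERE r.relationship_type = 'works_at'
""").fetchall()

for person, rel, org in rows:
    print(person, rel, org)
```

With the sample data, only Alice has a works_at edge to an organization, so the join returns a single row.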

-- With semantic enhancement
MATCH (p:Person {type: 'Person'})-[]->(org:Organization)
WHERE p.name LIKE '%Alice%'
RETURN p, org;

5. Combine with Vector Search
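A hybrid query of this kind does two things: it filters rows on a structured column, then ranks the remaining rows by vector distance to a query embedding. The sketch below imitates that in plain Python so the ordering logic is easy to follow. It assumes that the <-> operator follows the pgvector convention of Euclidean (L2) distance. The three-dimensional vectors and the hybrid_search helper are made up for illustration; real embeddings would come from the configured embedding model and have the dimension declared on the embedding column.

```python
import math

# Toy entities with hypothetical 3-dimensional embeddings.
ENTITIES = [
    {"name": "Alice", "type": "Person", "embedding": [0.9, 0.1, 0.0]},
    {"name": "Bob", "type": "Person", "embedding": [0.2, 0.8, 0.1]},
    {"name": "Company X", "type": "Organization", "embedding": [0.8, 0.2, 0.1]},
]


def hybrid_search(entities, entity_type, query_embedding, limit=5):
    """Filter by type, then order by Euclidean (L2) distance to the query."""
    candidates = [e for e in entities if e["type"] == entity_type]
    candidates.sort(key=lambda e: math.dist(e["embedding"], query_embedding))
    return [e["name"] for e in candidates[:limit]]


# Hypothetical embedding of the phrase "software engineer".
query = [1.0, 0.0, 0.0]
print(hybrid_search(ENTITIES, "Person", query))
```

Filtering before ranking keeps Company X out of the result even though its toy embedding is close to the query.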
-- Hybrid query: structure + semantics
SELECT e.name, e.type, e.description
FROM entities e
WHERE e.type = 'Person'
ORDER BY e.embedding <-> to_vector('embedding of "software engineer"')
LIMIT 5;

Common Use Cases
1. Knowledge Discovery
-- Find all relationships for an entity
MATCH (e:Entity {name: 'Company X'})-[r]->(related)
RETURN e, r, related;

2. Semantic Search with Relationships
-- Find people and their roles
MATCH (p:Person)-[r:works_at]->(o:Organization)
WHERE o.name = 'Company X'
RETURN p.name, r.properties;

3. Recommendation System
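The idea behind this use case is to count how many relationship targets two entities have in common. A minimal sketch of that counting in Python follows, using the sample edges from step 3 as (source, target) name pairs. The shared_target_counts helper is hypothetical and only approximates the logic of the SQL for this use case; it also leaves the starting entity out of its own recommendations.

```python
from collections import Counter

# Sample edges from step 3 as (source, target) pairs.
EDGES = [
    ("Alice", "Company X"),  # works_at
    ("Bob", "Company X"),    # manages
    ("Alice", "Bob"),        # reports_to
]


def shared_target_counts(edges, name):
    """Count, per other entity, the edges that point at one of `name`'s targets."""
    targets = {dst for src, dst in edges if src == name}
    counts = Counter(
        src for src, dst in edges if dst in targets and src != name
    )
    # most_common() sorts by descending count, like ORDER BY ... DESC.
    return counts.most_common()


print(shared_target_counts(EDGES, "Alice"))
```

In the sample graph, Bob is the only other entity that shares a target (Company X) with Alice.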
-- Find entities that share relationship targets with Alice (Alice herself excluded)
SELECT p2.name, COUNT(*) AS common_relationships
FROM entities p1
JOIN relationships r1 ON p1.id = r1.source_id
JOIN relationships r2 ON r1.target_id = r2.target_id
JOIN entities p2 ON r2.source_id = p2.id
WHERE p1.name = 'Alice'
  AND p2.id <> p1.id
GROUP BY p2.id, p2.name
ORDER BY common_relationships DESC;

4. Entity Resolution
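Entity resolution here comes down to comparing every pair of same-type entities and keeping the pairs whose embeddings are closer than a threshold. The sketch below shows that pairwise comparison in Python with made-up two-dimensional embeddings. The find_near_duplicates helper and the sample names are illustrative only, and the 0.1 threshold mirrors the SQL for this use case.

```python
import math
from itertools import combinations

# Hypothetical entities: (id, name, type, embedding).
ENTITIES = [
    (1, "Company X", "Organization", [0.50, 0.50]),
    (2, "Company X Inc.", "Organization", [0.52, 0.49]),
    (3, "Alice", "Person", [0.51, 0.50]),
    (4, "Acme", "Organization", [0.10, 0.90]),
]


def find_near_duplicates(entities, threshold=0.1):
    """Return name pairs of same-type entities whose L2 distance is below threshold."""
    pairs = []
    # combinations() yields each unordered pair once, like e1.id < e2.id.
    for (_, n1, t1, v1), (_, n2, t2, v2) in combinations(entities, 2):
        if t1 == t2 and math.dist(v1, v2) < threshold:
            pairs.append((n1, n2))
    return pairs


print(find_near_duplicates(ENTITIES))
```

Comparing all pairs grows quadratically with the number of entities, so restricting the comparison to entities of the same type keeps the candidate set smaller.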
-- Find duplicate or similar entities
SELECT e1.name, e2.name
FROM entities e1
JOIN entities e2 ON e1.id < e2.id
WHERE e1.type = e2.type
  AND e1.embedding <-> e2.embedding < 0.1; -- Small distance = high similarity

Performance Tips
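- Index Creation: Speed up graph traversals
CREATE INDEX idx_entities_type ON entities(type);
CREATE INDEX idx_relationships_type ON relationships(relationship_type);
-- Traversals join on the edge endpoints, so index those columns as well
CREATE INDEX idx_relationships_source ON relationships(source_id);
CREATE INDEX idx_relationships_target ON relationships(target_id);
-- In pgvector, the <-> operator used in this guide is L2 distance and is served
-- by vector_l2_ops; use vector_cosine_ops only if queries use <=> (cosine distance)
CREATE INDEX idx_embeddings ON entities USING ivfflat (embedding vector_l2_ops);

- Optimize Vector Search
-- Use appropriate vector indexes for similarity
SET search_path TO public;
CREATE EXTENSION IF NOT EXISTS vector;

- Batch Operations
-- Insert relationships in batches
INSERT INTO relationships (...) VALUES (...), (...), (...);

- Graph Statistics
SELECT
  COUNT(*) AS total_entities,
  COUNT(DISTINCT type) AS entity_types
FROM entities;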
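From application code, the batch-insert tip usually means sending many rows in one statement or one transaction instead of committing row by row. A minimal sketch follows, using Python's built-in sqlite3 as a stand-in engine. This is an assumption for illustration: a real deployment would use the driver for the GraphRAG database, and properties is stored as JSON text here because SQLite has no JSONB type.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        id INTEGER PRIMARY KEY,
        source_id INTEGER,
        target_id INTEGER,
        relationship_type TEXT,
        properties TEXT
    )
""")

batch = [
    (1, 3, "works_at", json.dumps({"since": "2022-01-01"})),
    (2, 3, "manages", json.dumps({"teams": ["Engineering"]})),
    (1, 2, "reports_to", json.dumps({"direct_report": True})),
]

# "with conn" wraps the whole batch in one transaction:
# it commits once on success and rolls back if any row fails.
with conn:
    conn.executemany(
        "INSERT INTO relationships "
        "(source_id, target_id, relationship_type, properties) "
        "VALUES (?, ?, ?, ?)",
        batch,
    )

inserted = conn.execute("SELECT COUNT(*) FROM relationships").fetchone()[0]
print(inserted)
```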
Troubleshooting
Q: Queries returning too many results?
A: Add WHERE clauses to filter relationships and entity types.
Q: Vector search is slow?
A: Create appropriate vector indexes and check embedding dimensions.
Q: Memory usage high for large graphs?
A: Use pagination and limit result sets.
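One way to apply the pagination advice is keyset pagination: each page asks for rows with an id greater than the last one seen, so the client never holds the full result set. The sketch below uses Python's built-in sqlite3 as a stand-in engine; the fetch_page helper, the sample rows, and the page size are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT, type TEXT)")
with conn:
    conn.executemany(
        "INSERT INTO entities (name, type) VALUES (?, ?)",
        [(f"Person {i}", "Person") for i in range(1, 8)],
    )


def fetch_page(conn, entity_type, after_id=0, page_size=3):
    """Return up to page_size (id, name) rows with id greater than after_id."""
    return conn.execute(
        "SELECT id, name FROM entities "
        "WHERE type = ? AND id > ? ORDER BY id LIMIT ?",
        (entity_type, after_id, page_size),
    ).fetchall()


# Walk through all pages, remembering only the last id seen.
pages = []
last_id = 0
while True:
    page = fetch_page(conn, "Person", after_id=last_id)
    if not page:
        break
    pages.append([name for _, name in page])
    last_id = page[-1][0]

print(pages)
```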
Best Practices
- Create indexes on frequently queried relationships
- Use appropriate vector dimensions (1536 for large models, smaller for faster processing)
- Batch insert relationships to avoid transaction overhead
- Run VACUUM regularly to maintain performance
- Use EXPLAIN to optimize queries
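The indexing and EXPLAIN advice can be checked together: inspect the plan, add an index, and inspect it again. The sketch below does this with SQLite's EXPLAIN QUERY PLAN through Python's built-in sqlite3, purely as a stand-in. The plan output of the GraphRAG database will look different, and the plan_details helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationships ("
    "id INTEGER PRIMARY KEY, source_id INTEGER, "
    "target_id INTEGER, relationship_type TEXT)"
)

QUERY = (
    "SELECT source_id, target_id FROM relationships "
    "WHERE relationship_type = 'works_at'"
)


def plan_details(conn, query):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]


before = plan_details(conn, QUERY)  # full table scan expected
conn.execute(
    "CREATE INDEX idx_relationships_type ON relationships(relationship_type)"
)
after = plan_details(conn, QUERY)   # index lookup expected

print(before)
print(after)
```

Comparing the two plans shows whether the new index is actually picked up for the filter on relationship_type.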
Next Steps
- Review /docs/features/graphrag/USER_GUIDE.md for advanced features
- Check the Neo4j migration guide in /docs/features/graphrag/NEO4J_MIGRATION_GUIDE.md
- Explore the Cypher reference in /docs/features/graphrag/CYPHER_REFERENCE.md
Related Features
- Vector Search: /docs/features/multimodal-vector/
- Full-Text Search: /docs/guides/user/FULL_TEXT_SEARCH_TUNING_GUIDE.md
- Graph Query: /docs/features/packages/12-graph-readme.md
Document Version: 1.0
Last Updated: December 30, 2025
Audience: Data engineers, knowledge graph developers
Reading Time: 8 minutes