Enhanced EXPLAIN PLAN User Guide

Version: 7.0 Last Updated: 2025-11-17 Status: Phase 1 Complete

Introduction
Quick Start
EXPLAIN Basics
Configuration Tracking
Feature Detection
Optimizer Decisions
Why Not Analysis
Output Formats
Query Optimization Workflow
Best Practices
Troubleshooting
Advanced Topics

Introduction

HeliosDB’s Enhanced EXPLAIN PLAN goes far beyond traditional query plan visualization. It provides:

Configuration Snapshots: Capture 45+ configuration parameters affecting query planning
Feature Detection: Automatically detect 8+ active optimization features (partition pruning, SIMD, etc.)
Optimizer Decision Tracking: See why the optimizer chose specific strategies with cost comparisons
“Why Not” Analysis: Understand why certain optimizations weren’t applied
Query Rewrite Suggestions: Get actionable recommendations to improve query performance
Multiple Output Formats: Text, JSON, YAML, XML, HTML, GraphViz, Mermaid

Key Benefits

Deep Visibility: Understand exactly what your queries are doing
Performance Tuning: Identify bottlenecks and optimization opportunities
Debugging: Diagnose unexpected query plans
Education: Learn how the query optimizer works
Production Monitoring: Track query plan changes over time

Quick Start

Basic EXPLAIN

-- Simple EXPLAIN
EXPLAIN SELECT * FROM users WHERE customer_id = 123;

-- Output:
EXPLAIN PLAN (Enhanced)
Estimated Cost: 50.00
Estimated Rows: 1
Planning Time: 2.34ms

Active Features (2):
  ✓ Predicate Pushdown (IMPLICIT)
    Trigger: 1 predicates pushed to storage layer
    Benefit: Filters applied early, reduces data transfer

  ✓ Index Scan on table 1 (IMPLICIT)
    Trigger: Highly selective predicate on indexed column (0.1% selectivity)
    Benefit: 100x faster than sequential scan
    Savings: 99.0% (10000.00 -> 100.00)

Execution Plan:
-> IndexScan(table_id=1, predicates=1)

EXPLAIN ANALYZE with “Why Not”

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM orders
WHERE customer_id = 123;

Different Output Formats

-- JSON format for tools
EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 1;

-- HTML for web viewing
EXPLAIN (FORMAT HTML) SELECT * FROM orders WHERE date > '2024-01-01';

-- GraphViz for diagrams
EXPLAIN (FORMAT GRAPHVIZ) SELECT * FROM complex_join_query;

-- Mermaid for Markdown
EXPLAIN (FORMAT MERMAID) SELECT * FROM sales GROUP BY region;

EXPLAIN Basics

Syntax

EXPLAIN [ ANALYZE ] [ ( option [, ...] ) ] statement

Options:
  FORMAT { TEXT | JSON | YAML | XML | HTML | GRAPHVIZ | MERMAID }
  WHY_NOT { TRUE | FALSE }
  VERBOSE { TRUE | FALSE }
  COSTS { TRUE | FALSE }
  BUFFERS { TRUE | FALSE }
  TIMING { TRUE | FALSE }

Understanding Output

Estimated Cost

The cost is an arbitrary unit representing execution time and resource usage:

Sequential Scan: High cost for large tables (1.0 per page)
Index Scan: Lower cost for selective queries (0.005 per tuple)
Hash Join: Depends on hash table size and probe complexity
Sort: Depends on data volume and available memory

Rule of Thumb:

Cost < 100: Very fast
Cost 100-1000: Fast
Cost 1000-10000: Moderate
Cost > 10000: Slow (consider optimization)

Estimated Rows

Number of rows the optimizer expects this operation to return.

Important: If actual rows differ significantly from estimated rows, your statistics may be stale!

-- Update statistics
ANALYZE users;

Configuration Tracking

Every EXPLAIN captures 45+ configuration parameters affecting the plan:

Memory Settings

work_mem: 256MB
  - Controls memory for sorts and hash tables
  - Increase if seeing "disk-based sort" warnings

shared_buffers: 1024MB
  - Shared memory cache for data pages
  - Typically 25% of system RAM

effective_cache_size: 4096MB
  - Estimate of OS cache available
  - Used for cost calculations

Join Strategy Toggles

enable_hashjoin: true
  - Hash joins for equality conditions
  - Fast for medium-sized tables

enable_mergejoin: true
  - Merge joins for sorted data
  - Efficient for pre-sorted inputs

enable_nestloop: true
  - Nested loop joins for small tables
  - Good when one side is very selective

Scan Strategy Toggles

enable_indexscan: true
  - Use indexes when beneficial

enable_seqscan: true
  - Sequential scans for large portions of table

enable_bitmapscan: true
  - Bitmap scans for multiple index conditions

Parallelism Settings

max_parallel_workers: 8
  - Maximum parallel workers across all queries

max_parallel_workers_per_gather: 4
  - Maximum parallel workers for single operation

min_parallel_table_scan_size_mb: 8
  - Minimum table size for parallelism

Cost Parameters

seq_page_cost: 1.0
  - Cost to read one page sequentially

random_page_cost: 4.0
  - Cost to read one page randomly
  - Higher = optimizer prefers sequential scans
  - Lower = optimizer prefers index scans

cpu_tuple_cost: 0.01
  - Cost to process one tuple

cpu_operator_cost: 0.0025
  - Cost to execute one operator/function

Advanced Features

enable_partition_pruning: true
  - Skip irrelevant partitions based on predicates

enable_simd: true
  - SIMD vectorization for aggregations

enable_jit: false
  - JIT compilation for complex queries (experimental)

Feature Detection

The optimizer automatically detects and reports active features:

Partition Pruning

SELECT * FROM sales
WHERE sale_date BETWEEN '2024-01-01' AND '2024-01-31';

-- Feature Detected:
✓ Partition Pruning (IMPLICIT)
  Trigger: Range predicates on partitioned table (1 predicates)
  Benefit: Skipped 11 of 12 partitions (91.7% I/O savings)
  Savings: 91.7% (12000.00 -> 1000.00)
  Confidence: 70%

When it Activates:

Range predicates on partition key
Equality predicates matching partition boundaries
IN clauses with partition key values

How to Maximize:

Partition on frequently filtered columns (date, region, category)
Use partition-aligned predicates
Keep partition count reasonable (< 1000)

Predicate Pushdown

SELECT * FROM orders
WHERE customer_id = 123 AND status = 'active';

-- Feature Detected:
✓ Predicate Pushdown (IMPLICIT)
  Trigger: 2 predicates pushed to storage layer
  Benefit: Filters applied early, reduces data transfer
  Confidence: 95%

When it Activates:

WHERE clause predicates on base tables
JOIN conditions pushed to table scans
Subquery filters pushed down

Performance Impact:

10-100x reduction in data transfer
Lower memory usage
Faster query execution

SIMD Vectorization

SELECT region, SUM(sales), AVG(price)
FROM sales
GROUP BY region;

-- Feature Detected:
✓ SIMD Vectorization (IMPLICIT)
  Trigger: Numeric aggregation operations (SUM/AVG/COUNT)
  Benefit: 4x speedup on aggregation operations using AVX2/AVX512
  Savings: 75.0% (1000.00 -> 250.00)
  Confidence: 90%

When it Activates:

SUM, AVG, COUNT aggregations
Numeric data types (INT, BIGINT, FLOAT, DOUBLE)
SIMD-capable hardware (AVX2/AVX512)

Performance Impact:

4-8x speedup on supported operations
Automatic - no query changes needed

Index Usage

SELECT * FROM users WHERE email = 'test@example.com';

-- Feature Detected:
✓ Index Scan on table users (IMPLICIT)
  Trigger: Highly selective predicate on indexed column (0.1% selectivity)
  Benefit: 100x faster than sequential scan
  Savings: 99.0% (10000.00 -> 100.00)
  Confidence: 95%

Types of Index Scans:

Index Scan: Read index + fetch table rows
Index-Only Scan: Read index only (covering index)
Bitmap Index Scan: Multiple index conditions combined

Parallelism

SELECT * FROM large_table WHERE category = 'electronics';

-- Feature Detected:
✓ Parallel Sequential Scan (IMPLICIT)
  Trigger: Table size (50GB) exceeds min_parallel_table_scan_size
  Benefit: 4x speedup with 4 parallel workers
  Savings: 75.0% (20000.00 -> 5000.00)
  Workers: 4
  Confidence: 85%

When it Activates:

Table size > min_parallel_table_scan_size_mb
Query cost > parallel_setup_cost
Sufficient parallel workers available

Optimizer Decisions

See exactly why the optimizer chose specific strategies:

Join Method Selection

SELECT * FROM orders o
JOIN customers c ON o.customer_id = c.id;

-- Optimizer Decision:
1. Join Strategy
   CHOSEN: Hash Join (cost: 250.00, rows: 10000)
   Reasoning: Medium-sized tables with equality condition.
              Hash join is optimal.

   REJECTED: Nested Loop Join (cost: 5000.00, 20.0x more expensive)
   Reason: One table too large for nested loop

   REJECTED: Merge Join (cost: 800.00, 3.2x more expensive)
   Reason: Neither input is sorted on join key

Decision Factors:

Table sizes
Join condition types (equality vs range)
Available memory (work_mem)
Pre-sorted data
Index availability

Index Selection

SELECT * FROM users
WHERE email = 'test@example.com' AND status = 'active';

-- Optimizer Decision:
1. Index Selection for users table
   CHOSEN: idx_users_email (cost: 50.00, rows: 1)
   Reasoning: Email is highly selective (0.1%).
              Index lookup is 100x faster than seq scan.

   REJECTED: idx_users_status (cost: 2000.00, 40.0x more expensive)
   Reason: Status has low selectivity (50%). Sequential scan cheaper.

   REJECTED: Sequential Scan (cost: 5000.00, 100.0x more expensive)
   Reason: Table is large (1M rows) and predicate is highly selective

Scan Method Selection

SELECT * FROM products WHERE category = 'electronics';

-- Optimizer Decision:
1. Scan Method for products
   CHOSEN: Sequential Scan (cost: 1000.00, rows: 50000)
   Reasoning: Low selectivity (50% of rows match).
              Sequential scan more efficient.

   REJECTED: Index Scan on idx_category (cost: 3000.00, 3.0x more expensive)
   Reason: High number of matching rows causes excessive random I/O

Selectivity Thresholds:

< 1%: Index scan strongly preferred
1-5%: Index scan usually preferred
5-20%: Depends on index quality and random_page_cost
20%: Sequential scan usually preferred

Why Not Analysis

Understand why certain optimizations weren’t applied:

Index Not Used

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM orders WHERE customer_id = 123;

-- Why Not Analysis:
INDEX USAGE ANALYSIS (2 indexes)
────────────────────────────────────────────────────────

1. Index: idx_orders_date
   Table: orders | Type: BTree | Severity: Low
   Reason: Column 'order_date' not in WHERE clause
    Suggestion: Not applicable to this query
   💰 Cost Impact: Would add cost (unnecessary index scan)

2. Index: idx_orders_customer_date (covering index)
   Table: orders | Type: BTree | Severity: High
   Reason: Statistics outdated (analyzed 45 days ago)
    Suggestion: Run ANALYZE orders; to update statistics
              This covering index could eliminate table lookups.
              Estimated cost reduction: 60.0% (50 -> 20)
   💰 Cost Impact: 2.5x slower without fresh stats (20.00 vs 50.00)

Statistics Warnings

STATISTICS WARNINGS (1 tables)
────────────────────────────────────────────────────────

1. Table: orders
   Last Updated: 2024-09-15 | Severity: High
   Freshness: 45 days old (35.0% of data modified since)
   Impact: Stale statistics cause suboptimal query plans
    Action: ANALYZE orders;

Statistics Freshness Guidelines:

< 7 days: Fresh (Severity: Low)
7-14 days: Aging (Severity: Low)
14-30 days: Stale (Severity: Medium)
30 days: Critical (Severity: High)

Impact of Stale Statistics:

Incorrect cardinality estimates
Wrong join order
Suboptimal index selection
Parallel execution disabled

Cardinality Mismatches

CARDINALITY MISMATCHES (1 detected)
────────────────────────────────────────────────────────

1. Operation: Hash Join
   Estimated: 1000 rows | Actual: 100000 rows | Severity: Critical
   Error Rate: 9900.0% (100.0x underestimate)
   Likely Cause: Outdated statistics on join columns or correlated predicates
    Fix: Run ANALYZE on joined tables
          Consider multi-column statistics for correlated columns

Common Causes:

Outdated statistics
Correlated columns (not modeled)
Skewed data distributions
Complex predicates
Missing histograms

Optimization Suggestions

OPTIMIZATION SUGGESTIONS (5 recommendations)
────────────────────────────────────────────────────────

1. Create covering index for GROUP BY
   Category: Indexing | Priority: Medium | Effort: Medium
   Description: A covering index on GROUP BY columns with INCLUDE
                for aggregate columns could enable index-only scan.
    Expected Benefit: 81.0% improvement (1050.00 -> 200.00)
   Implementation:
     • Identify aggregate and group-by columns
     • CREATE INDEX with INCLUDE clause for aggregate columns
     • Test with EXPLAIN ANALYZE

2. Rewrite OR conditions as UNION
   Category: QueryRewrite | Priority: High | Effort: Medium
   Description: OR conditions prevent index usage. Rewriting as UNION ALL
                allows each branch to use appropriate indexes.
    Expected Benefit: 90.0% improvement (5000.00 -> 500.00)
   Implementation:
     • Identify OR conditions in WHERE clause
     • Rewrite: SELECT * FROM t WHERE a=1 OR b=2
     • As: SELECT * FROM t WHERE a=1 UNION ALL SELECT * FROM t WHERE b=2
     • Add DISTINCT if needed to remove duplicates

Output Formats

TEXT (Default)

Human-readable, terminal-friendly output.

EXPLAIN SELECT * FROM users WHERE id = 1;

Best For: Quick debugging, terminal use

JSON

Machine-readable, structured data for tools.

EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 1;

{
  "plan": {
    "node_type": "IndexScan",
    "table_id": "users",
    "estimated_cost": 50.0,
    "estimated_rows": 1
  },
  "active_features": {
    "implicit": [
      {
        "name": "Index Scan",
        "category": "Indexing",
        "savings": {
          "percent_reduction": 99.0
        }
      }
    ]
  }
}

Best For: Tool integration, automation, logging

YAML

Human-friendly structured format.

EXPLAIN (FORMAT YAML) SELECT * FROM users WHERE id = 1;

plan:
  estimated_cost: 50.0
  estimated_rows: 1
  planning_time_ms: 2.34

active_features:
  total: 2
  implicit: 2
  explicit: 0

Best For: Configuration files, documentation

XML

Enterprise tool integration.

EXPLAIN (FORMAT XML) SELECT * FROM users WHERE id = 1;

<?xml version="1.0" encoding="UTF-8"?>
<explain-plan version="7.0">
  <summary>
    <estimated-cost>50.0</estimated-cost>
    <estimated-rows>1</estimated-rows>
  </summary>
  <active-features>
    <feature type="implicit" category="Indexing">
      <name>Index Scan</name>
      <confidence>0.95</confidence>
    </feature>
  </active-features>
</explain-plan>

Best For: Enterprise ETL tools, SOAP APIs, XML parsers

HTML

Web-based visualization with CSS styling.

EXPLAIN (FORMAT HTML) SELECT * FROM users WHERE id = 1;

Generates complete HTML page with:

Responsive design
Color-coded nodes
Interactive elements
Print-friendly layout

Best For: Dashboards, reports, presentations

GraphViz (DOT)

Generate diagrams with GraphViz.

EXPLAIN (FORMAT GRAPHVIZ) SELECT * FROM users u
JOIN orders o ON u.id = o.customer_id;

digraph ExplainPlan {
  rankdir=TB;
  node [shape=box, style=rounded];

  node_0 [label="Hash Join\nCost: 250.00", fillcolor="#ffccbc", style=filled];
  node_1 [label="TableScan\nTable: users", fillcolor="#bbdefb", style=filled];
  node_2 [label="IndexScan\nTable: orders", fillcolor="#bbdefb", style=filled];

  node_0 -> node_1 [label="left"];
  node_0 -> node_2 [label="right"];
}

Render with: dot -Tpng explain.dot -o plan.png

Best For: Architecture diagrams, documentation, presentations

Mermaid

Markdown-based diagrams.

EXPLAIN (FORMAT MERMAID) SELECT * FROM users u
JOIN orders o ON u.id = o.customer_id;

graph TB
  summary["EXPLAIN PLAN<br/>Cost: 250.00<br/>Rows: 10000"]
  summary:::summaryStyle

  node0["Hash Join"]
  node0:::joinStyle

  node1["TableScan<br/>Table: users"]
  node1:::scanStyle

  node2["IndexScan<br/>Table: orders"]
  node2:::scanStyle

  summary --> node0
  node0 -->|left| node1
  node0 -->|right| node2

  classDef summaryStyle fill:#e8f5e9,stroke:#4caf50
  classDef scanStyle fill:#bbdefb,stroke:#2196f3
  classDef joinStyle fill:#ffccbc,stroke:#ff5722

Best For: GitHub/GitLab documentation, Markdown files

Query Optimization Workflow

Step 1: Identify Slow Queries

-- Find slow queries in production
SELECT query, total_time, calls, mean_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;

Step 2: EXPLAIN the Query

EXPLAIN ANALYZE (WHY_NOT, FORMAT TEXT)
<your slow query>;

Step 3: Check Statistics

STATISTICS WARNINGS:
  Table: orders - Last analyzed 45 days ago
  Action: ANALYZE orders;

Fix: Run ANALYZE

ANALYZE orders;

Step 4: Review Active Features

Active Features (1):
  ✓ Sequential Scan (IMPLICIT)
    Should use index instead!

Fix: Check Why Not Analysis

Step 5: Apply Suggestions

OPTIMIZATION SUGGESTIONS:
  1. Create index on customer_id
     CREATE INDEX idx_orders_customer ON orders(customer_id);

Fix: Create recommended index

CREATE INDEX idx_orders_customer ON orders(customer_id);
ANALYZE orders;  -- Update statistics

Step 6: Verify Improvement

EXPLAIN ANALYZE
<your query>;

Expected: Lower cost, index scan used

Best Practices

1. Regular Statistics Updates

-- Automated statistics collection
CREATE EXTENSION pg_cron;

SELECT cron.schedule('nightly-analyze', '0 2 * * *', $$
  ANALYZE VERBOSE;
$$);

2. Monitor Query Plans

-- Track plan changes over time
CREATE TABLE query_plan_history (
  query_id TEXT,
  plan_hash TEXT,
  estimated_cost NUMERIC,
  captured_at TIMESTAMP DEFAULT NOW()
);

-- Log explain output
INSERT INTO query_plan_history (query_id, plan_hash, estimated_cost)
SELECT
  md5(query),
  md5(explain_plan),
  (explain_json->>'estimated_cost')::numeric
FROM pg_stat_statements;

3. Index Maintenance

-- Find unused indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexname NOT LIKE 'pg_toast%'
ORDER BY pg_relation_size(indexrelid) DESC;

-- Drop if truly unused
DROP INDEX IF EXISTS unused_index_name;

4. Configuration Tuning

-- For OLTP (many small queries)
SET work_mem = '64MB';
SET random_page_cost = 1.1;  -- SSD storage
SET effective_cache_size = '8GB';

-- For OLAP (complex analytical queries)
SET work_mem = '512MB';
SET max_parallel_workers_per_gather = 4;
SET enable_partitionwise_join = ON;

5. Query Patterns to Avoid

Avoid:

-- Function on indexed column
SELECT * FROM users WHERE LOWER(email) = 'test@example.com';

-- Leading wildcard
SELECT * FROM products WHERE name LIKE '%phone%';

-- OR conditions preventing index use
SELECT * FROM orders WHERE status = 'pending' OR status = 'processing';

Instead:

-- Use functional index or normalize data
SELECT * FROM users WHERE email = 'test@example.com';

-- Use full-text search
SELECT * FROM products WHERE to_tsvector(name) @@ to_tsquery('phone');

-- Rewrite as UNION
SELECT * FROM orders WHERE status = 'pending'
UNION ALL
SELECT * FROM orders WHERE status = 'processing';

Troubleshooting

Problem: Index Not Being Used

Symptoms:

EXPLAIN shows Sequential Scan
Query is slow despite having index

Diagnosis:

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM users WHERE email = 'test@example.com';

Common Causes:

Stale Statistics
```
Fix: ANALYZE users;
```

Low Selectivity

Index used only if < 20% of rows match
Fix: Add more selective predicates

Data Type Mismatch

-- Wrong: id is INTEGER
SELECT * FROM users WHERE id = '123';

-- Right:
SELECT * FROM users WHERE id = 123;

Function on Column

-- Wrong:
SELECT * FROM users WHERE LOWER(email) = 'test@example.com';

-- Right: Create functional index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

Problem: Wrong Join Order

Symptoms:

Hash join on large tables first
Cardinality mismatches

Diagnosis:

EXPLAIN ANALYZE
SELECT * FROM a
JOIN b ON a.id = b.a_id
JOIN c ON b.id = c.b_id;

Fix:

-- Update statistics
ANALYZE a, b, c;

-- Force different join order (last resort)
SET join_collapse_limit = 1;
SELECT * FROM a
JOIN (b JOIN c ON b.id = c.b_id) ON a.id = b.a_id;

Problem: Parallel Query Not Activating

Symptoms:

Expected parallel workers: 4
Actual parallel workers: 0

Common Causes:

Table Too Small

-- Check table size
SELECT pg_size_pretty(pg_relation_size('table_name'));

-- Adjust threshold
SET min_parallel_table_scan_size = '1MB';

Insufficient Workers

SHOW max_parallel_workers;
SHOW max_parallel_workers_per_gather;

-- Increase if needed
SET max_parallel_workers = 8;

Query Too Cheap

-- Parallel setup has overhead
SHOW parallel_setup_cost;  -- Default: 1000

-- Lower for testing (not recommended in production)
SET parallel_setup_cost = 100;

Advanced Topics

Custom Statistics

-- Create extended statistics for correlated columns
CREATE STATISTICS orders_customer_date_stats (dependencies)
ON customer_id, order_date FROM orders;

ANALYZE orders;

Query Hints (Experimental)

-- Force index usage (PostgreSQL-style hint)
SELECT /*+ IndexScan(users idx_users_email) */
* FROM users WHERE email = 'test@example.com';

-- Force join order
SELECT /*+ Leading((a b) c) */
* FROM a, b, c WHERE ...;

Plan Stability

-- Freeze a good plan
CREATE EXTENSION pg_plan_freeze;

SELECT pg_plan_freeze('slow_query_md5_hash', 'good_plan_hash');

Cost Model Tuning

-- For SSD storage
SET random_page_cost = 1.1;
SET seq_page_cost = 1.0;

-- For HDD storage
SET random_page_cost = 4.0;
SET seq_page_cost = 1.0;

-- For in-memory data
SET random_page_cost = 0.1;
SET seq_page_cost = 0.1;

Appendix A: Configuration Reference

Complete Parameter List

Parameter	Default	Range	Impact
work_mem	256MB	64KB-2GB	Sort/Hash memory
shared_buffers	1024MB	128MB-40% RAM	Page cache
effective_cache_size	4096MB	1GB-75% RAM	Cost calculations
random_page_cost	4.0	0.1-10.0	Index vs scan choice
cpu_tuple_cost	0.01	0.001-0.1	Processing cost
max_parallel_workers	8	0-32	Parallel queries

Feature Flags

Flag	Default	Description
enable_simd	true	SIMD vectorization
enable_hashjoin	true	Hash joins
enable_mergejoin	true	Merge joins
enable_nestloop	true	Nested loop joins
enable_indexscan	true	Index scans
enable_seqscan	true	Sequential scans
enable_partition_pruning	true	Partition pruning

Appendix B: Error Codes

Code	Message	Solution
E001	Statistics too old	Run ANALYZE
E002	No suitable index	Create index
E003	Cardinality mismatch	Update statistics, check correlations
E004	Out of memory (work_mem)	Increase work_mem
E005	Query too complex	Simplify or break into CTEs

Appendix C: Glossary

Cardinality: Number of rows in a result set
Selectivity: Fraction of rows matching a predicate (0.0 to 1.0)
Cost: Arbitrary unit representing execution time
Predicate: Filter condition (WHERE clause)
Covering Index: Index containing all needed columns
Sequential Scan: Read entire table sequentially
Index Scan: Read via index structure
Hash Join: Join using hash table
Merge Join: Join pre-sorted inputs
Nested Loop: Iterate outer, probe inner for each row

Appendix D: Further Reading

Document Version: 1.0 HeliosDB Version: 7.0 Feature Status: Phase 1 Complete (100%) Last Updated: 2025-11-17

Enhanced EXPLAIN PLAN User Guide

Enhanced EXPLAIN PLAN User Guide

Table of Contents

Introduction

Key Benefits

Quick Start

Basic EXPLAIN

EXPLAIN ANALYZE with “Why Not”

Different Output Formats

EXPLAIN Basics

Syntax

Understanding Output

Estimated Cost

Estimated Rows

Configuration Tracking

Memory Settings

Join Strategy Toggles

Scan Strategy Toggles

Parallelism Settings

Cost Parameters

Advanced Features

Feature Detection

Partition Pruning

Predicate Pushdown

SIMD Vectorization

Index Usage

Parallelism

Optimizer Decisions

Join Method Selection

Index Selection

Scan Method Selection

Why Not Analysis

Index Not Used

Statistics Warnings

Cardinality Mismatches

Optimization Suggestions

Output Formats

TEXT (Default)

JSON

YAML

XML

HTML

GraphViz (DOT)

Mermaid

Query Optimization Workflow

Step 1: Identify Slow Queries

Step 2: EXPLAIN the Query

Step 3: Check Statistics

Step 4: Review Active Features

Step 5: Apply Suggestions

Step 6: Verify Improvement

Best Practices

1. Regular Statistics Updates

2. Monitor Query Plans

3. Index Maintenance

4. Configuration Tuning

5. Query Patterns to Avoid

Troubleshooting

Problem: Index Not Being Used

Problem: Wrong Join Order

Problem: Parallel Query Not Activating

Advanced Topics

Custom Statistics

Query Hints (Experimental)

Plan Stability

Cost Model Tuning

Appendix A: Configuration Reference

Complete Parameter List

Feature Flags

Appendix B: Error Codes

Appendix C: Glossary

Appendix D: Further Reading