Skip to content

Enhanced EXPLAIN PLAN User Guide

Enhanced EXPLAIN PLAN User Guide

Version: 7.0 Last Updated: 2025-11-17 Status: Phase 1 Complete

Table of Contents

  1. Introduction
  2. Quick Start
  3. EXPLAIN Basics
  4. Configuration Tracking
  5. Feature Detection
  6. Optimizer Decisions
  7. Why Not Analysis
  8. Output Formats
  9. Query Optimization Workflow
  10. Best Practices
  11. Troubleshooting
  12. Advanced Topics

Introduction

HeliosDB’s Enhanced EXPLAIN PLAN goes far beyond traditional query plan visualization. It provides:

  • Configuration Snapshots: Capture 45+ configuration parameters affecting query planning
  • Feature Detection: Automatically detect 8+ active optimization features (partition pruning, SIMD, etc.)
  • Optimizer Decision Tracking: See why the optimizer chose specific strategies with cost comparisons
  • “Why Not” Analysis: Understand why certain optimizations weren’t applied
  • Query Rewrite Suggestions: Get actionable recommendations to improve query performance
  • Multiple Output Formats: Text, JSON, YAML, XML, HTML, GraphViz, Mermaid

Key Benefits

  1. Deep Visibility: Understand exactly what your queries are doing
  2. Performance Tuning: Identify bottlenecks and optimization opportunities
  3. Debugging: Diagnose unexpected query plans
  4. Education: Learn how the query optimizer works
  5. Production Monitoring: Track query plan changes over time

Quick Start

Basic EXPLAIN

-- Simple EXPLAIN
EXPLAIN SELECT * FROM users WHERE customer_id = 123;
-- Output:
EXPLAIN PLAN (Enhanced)
Estimated Cost: 50.00
Estimated Rows: 1
Planning Time: 2.34ms
Active Features (2):
Predicate Pushdown (IMPLICIT)
Trigger: 1 predicates pushed to storage layer
Benefit: Filters applied early, reduces data transfer
Index Scan on table 1 (IMPLICIT)
Trigger: Highly selective predicate on indexed column (0.1% selectivity)
Benefit: 100x faster than sequential scan
Savings: 99.0% (10000.00 -> 100.00)
Execution Plan:
-> IndexScan(table_id=1, predicates=1)

EXPLAIN ANALYZE with “Why Not”

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM orders
WHERE customer_id = 123;

Different Output Formats

-- JSON format for tools
EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 1;
-- HTML for web viewing
EXPLAIN (FORMAT HTML) SELECT * FROM orders WHERE date > '2024-01-01';
-- GraphViz for diagrams
EXPLAIN (FORMAT GRAPHVIZ) SELECT * FROM complex_join_query;
-- Mermaid for Markdown
EXPLAIN (FORMAT MERMAID) SELECT * FROM sales GROUP BY region;

EXPLAIN Basics

Syntax

EXPLAIN [ ANALYZE ] [ ( option [, ...] ) ] statement
Options:
FORMAT { TEXT | JSON | YAML | XML | HTML | GRAPHVIZ | MERMAID }
WHY_NOT { TRUE | FALSE }
VERBOSE { TRUE | FALSE }
COSTS { TRUE | FALSE }
BUFFERS { TRUE | FALSE }
TIMING { TRUE | FALSE }

Understanding Output

Estimated Cost

The cost is an arbitrary unit representing execution time and resource usage:

  • Sequential Scan: High cost for large tables (1.0 per page)
  • Index Scan: Lower cost for selective queries (0.005 per tuple)
  • Hash Join: Depends on hash table size and probe complexity
  • Sort: Depends on data volume and available memory

Rule of Thumb:

  • Cost < 100: Very fast
  • Cost 100-1000: Fast
  • Cost 1000-10000: Moderate
  • Cost > 10000: Slow (consider optimization)

Estimated Rows

Number of rows the optimizer expects this operation to return.

Important: If actual rows differ significantly from estimated rows, your statistics may be stale!

-- Update statistics
ANALYZE users;

Configuration Tracking

Every EXPLAIN captures 45+ configuration parameters affecting the plan:

Memory Settings

work_mem: 256MB
- Controls memory for sorts and hash tables
- Increase if seeing "disk-based sort" warnings
shared_buffers: 1024MB
- Shared memory cache for data pages
- Typically 25% of system RAM
effective_cache_size: 4096MB
- Estimate of OS cache available
- Used for cost calculations

Join Strategy Toggles

enable_hashjoin: true
- Hash joins for equality conditions
- Fast for medium-sized tables
enable_mergejoin: true
- Merge joins for sorted data
- Efficient for pre-sorted inputs
enable_nestloop: true
- Nested loop joins for small tables
- Good when one side is very selective

Scan Strategy Toggles

enable_indexscan: true
- Use indexes when beneficial
enable_seqscan: true
- Sequential scans for large portions of table
enable_bitmapscan: true
- Bitmap scans for multiple index conditions

Parallelism Settings

max_parallel_workers: 8
- Maximum parallel workers across all queries
max_parallel_workers_per_gather: 4
- Maximum parallel workers for single operation
min_parallel_table_scan_size_mb: 8
- Minimum table size for parallelism

Cost Parameters

seq_page_cost: 1.0
- Cost to read one page sequentially
random_page_cost: 4.0
- Cost to read one page randomly
- Higher = optimizer prefers sequential scans
- Lower = optimizer prefers index scans
cpu_tuple_cost: 0.01
- Cost to process one tuple
cpu_operator_cost: 0.0025
- Cost to execute one operator/function

Advanced Features

enable_partition_pruning: true
- Skip irrelevant partitions based on predicates
enable_simd: true
- SIMD vectorization for aggregations
enable_jit: false
- JIT compilation for complex queries (experimental)

Feature Detection

The optimizer automatically detects and reports active features:

Partition Pruning

SELECT * FROM sales
WHERE sale_date BETWEEN '2024-01-01' AND '2024-01-31';
-- Feature Detected:
Partition Pruning (IMPLICIT)
Trigger: Range predicates on partitioned table (1 predicates)
Benefit: Skipped 11 of 12 partitions (91.7% I/O savings)
Savings: 91.7% (12000.00 -> 1000.00)
Confidence: 70%

When it Activates:

  • Range predicates on partition key
  • Equality predicates matching partition boundaries
  • IN clauses with partition key values

How to Maximize:

  • Partition on frequently filtered columns (date, region, category)
  • Use partition-aligned predicates
  • Keep partition count reasonable (< 1000)

Predicate Pushdown

SELECT * FROM orders
WHERE customer_id = 123 AND status = 'active';
-- Feature Detected:
Predicate Pushdown (IMPLICIT)
Trigger: 2 predicates pushed to storage layer
Benefit: Filters applied early, reduces data transfer
Confidence: 95%

When it Activates:

  • WHERE clause predicates on base tables
  • JOIN conditions pushed to table scans
  • Subquery filters pushed down

Performance Impact:

  • 10-100x reduction in data transfer
  • Lower memory usage
  • Faster query execution

SIMD Vectorization

SELECT region, SUM(sales), AVG(price)
FROM sales
GROUP BY region;
-- Feature Detected:
✓ SIMD Vectorization (IMPLICIT)
Trigger: Numeric aggregation operations (SUM/AVG/COUNT)
Benefit: 4x speedup on aggregation operations using AVX2/AVX512
Savings: 75.0% (1000.00 -> 250.00)
Confidence: 90%

When it Activates:

  • SUM, AVG, COUNT aggregations
  • Numeric data types (INT, BIGINT, FLOAT, DOUBLE)
  • SIMD-capable hardware (AVX2/AVX512)

Performance Impact:

  • 4-8x speedup on supported operations
  • Automatic - no query changes needed

Index Usage

SELECT * FROM users WHERE email = 'test@example.com';
-- Feature Detected:
Index Scan on table users (IMPLICIT)
Trigger: Highly selective predicate on indexed column (0.1% selectivity)
Benefit: 100x faster than sequential scan
Savings: 99.0% (10000.00 -> 100.00)
Confidence: 95%

Types of Index Scans:

  • Index Scan: Read index + fetch table rows
  • Index-Only Scan: Read index only (covering index)
  • Bitmap Index Scan: Multiple index conditions combined

Parallelism

SELECT * FROM large_table WHERE category = 'electronics';
-- Feature Detected:
✓ Parallel Sequential Scan (IMPLICIT)
Trigger: Table size (50GB) exceeds min_parallel_table_scan_size
Benefit: 4x speedup with 4 parallel workers
Savings: 75.0% (20000.00 -> 5000.00)
Workers: 4
Confidence: 85%

When it Activates:

  • Table size > min_parallel_table_scan_size_mb
  • Query cost > parallel_setup_cost
  • Sufficient parallel workers available

Optimizer Decisions

See exactly why the optimizer chose specific strategies:

Join Method Selection

SELECT * FROM orders o
JOIN customers c ON o.customer_id = c.id;
-- Optimizer Decision:
1. Join Strategy
CHOSEN: Hash Join (cost: 250.00, rows: 10000)
Reasoning: Medium-sized tables with equality condition.
Hash join is optimal.
REJECTED: Nested Loop Join (cost: 5000.00, 20.0x more expensive)
Reason: One table too large for nested loop
REJECTED: Merge Join (cost: 800.00, 3.2x more expensive)
Reason: Neither input is sorted on join key

Decision Factors:

  • Table sizes
  • Join condition types (equality vs range)
  • Available memory (work_mem)
  • Pre-sorted data
  • Index availability

Index Selection

SELECT * FROM users
WHERE email = 'test@example.com' AND status = 'active';
-- Optimizer Decision:
1. Index Selection for users table
CHOSEN: idx_users_email (cost: 50.00, rows: 1)
Reasoning: Email is highly selective (0.1%).
Index lookup is 100x faster than seq scan.
REJECTED: idx_users_status (cost: 2000.00, 40.0x more expensive)
Reason: Status has low selectivity (50%). Sequential scan cheaper.
REJECTED: Sequential Scan (cost: 5000.00, 100.0x more expensive)
Reason: Table is large (1M rows) and predicate is highly selective

Scan Method Selection

SELECT * FROM products WHERE category = 'electronics';
-- Optimizer Decision:
1. Scan Method for products
CHOSEN: Sequential Scan (cost: 1000.00, rows: 50000)
Reasoning: Low selectivity (50% of rows match).
Sequential scan more efficient.
REJECTED: Index Scan on idx_category (cost: 3000.00, 3.0x more expensive)
Reason: High number of matching rows causes excessive random I/O

Selectivity Thresholds:

  • < 1%: Index scan strongly preferred
  • 1-5%: Index scan usually preferred
  • 5-20%: Depends on index quality and random_page_cost
  • 20%: Sequential scan usually preferred


Why Not Analysis

Understand why certain optimizations weren’t applied:

Index Not Used

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM orders WHERE customer_id = 123;
-- Why Not Analysis:
INDEX USAGE ANALYSIS (2 indexes)
────────────────────────────────────────────────────────
1. Index: idx_orders_date
Table: orders | Type: BTree | Severity: Low
Reason: Column 'order_date' not in WHERE clause
Suggestion: Not applicable to this query
💰 Cost Impact: Would add cost (unnecessary index scan)
2. Index: idx_orders_customer_date (covering index)
Table: orders | Type: BTree | Severity: High
Reason: Statistics outdated (analyzed 45 days ago)
Suggestion: Run ANALYZE orders; to update statistics
This covering index could eliminate table lookups.
Estimated cost reduction: 60.0% (50 -> 20)
💰 Cost Impact: 2.5x slower without fresh stats (20.00 vs 50.00)

Statistics Warnings

STATISTICS WARNINGS (1 tables)
────────────────────────────────────────────────────────
1. Table: orders
Last Updated: 2024-09-15 | Severity: High
Freshness: 45 days old (35.0% of data modified since)
Impact: Stale statistics cause suboptimal query plans
Action: ANALYZE orders;

Statistics Freshness Guidelines:

  • < 7 days: Fresh (Severity: Low)
  • 7-14 days: Aging (Severity: Low)
  • 14-30 days: Stale (Severity: Medium)
  • 30 days: Critical (Severity: High)

Impact of Stale Statistics:

  • Incorrect cardinality estimates
  • Wrong join order
  • Suboptimal index selection
  • Parallel execution disabled

Cardinality Mismatches

CARDINALITY MISMATCHES (1 detected)
────────────────────────────────────────────────────────
1. Operation: Hash Join
Estimated: 1000 rows | Actual: 100000 rows | Severity: Critical
Error Rate: 9900.0% (100.0x underestimate)
Likely Cause: Outdated statistics on join columns or correlated predicates
Fix: Run ANALYZE on joined tables
Consider multi-column statistics for correlated columns

Common Causes:

  • Outdated statistics
  • Correlated columns (not modeled)
  • Skewed data distributions
  • Complex predicates
  • Missing histograms

Optimization Suggestions

OPTIMIZATION SUGGESTIONS (5 recommendations)
────────────────────────────────────────────────────────
1. Create covering index for GROUP BY
Category: Indexing | Priority: Medium | Effort: Medium
Description: A covering index on GROUP BY columns with INCLUDE
for aggregate columns could enable index-only scan.
Expected Benefit: 81.0% improvement (1050.00 -> 200.00)
Implementation:
• Identify aggregate and group-by columns
CREATE INDEX with INCLUDE clause for aggregate columns
• Test with EXPLAIN ANALYZE
2. Rewrite OR conditions as UNION
Category: QueryRewrite | Priority: High | Effort: Medium
Description: OR conditions prevent index usage. Rewriting as UNION ALL
allows each branch to use appropriate indexes.
Expected Benefit: 90.0% improvement (5000.00 -> 500.00)
Implementation:
• Identify OR conditions in WHERE clause
• Rewrite: SELECT * FROM t WHERE a=1 OR b=2
As: SELECT * FROM t WHERE a=1 UNION ALL SELECT * FROM t WHERE b=2
Add DISTINCT if needed to remove duplicates

Output Formats

TEXT (Default)

Human-readable, terminal-friendly output.

EXPLAIN SELECT * FROM users WHERE id = 1;

Best For: Quick debugging, terminal use

JSON

Machine-readable, structured data for tools.

EXPLAIN (FORMAT JSON) SELECT * FROM users WHERE id = 1;
{
"plan": {
"node_type": "IndexScan",
"table_id": "users",
"estimated_cost": 50.0,
"estimated_rows": 1
},
"active_features": {
"implicit": [
{
"name": "Index Scan",
"category": "Indexing",
"savings": {
"percent_reduction": 99.0
}
}
]
}
}

Best For: Tool integration, automation, logging

YAML

Human-friendly structured format.

EXPLAIN (FORMAT YAML) SELECT * FROM users WHERE id = 1;
plan:
estimated_cost: 50.0
estimated_rows: 1
planning_time_ms: 2.34
active_features:
total: 2
implicit: 2
explicit: 0

Best For: Configuration files, documentation

XML

Enterprise tool integration.

EXPLAIN (FORMAT XML) SELECT * FROM users WHERE id = 1;
<?xml version="1.0" encoding="UTF-8"?>
<explain-plan version="7.0">
<summary>
<estimated-cost>50.0</estimated-cost>
<estimated-rows>1</estimated-rows>
</summary>
<active-features>
<feature type="implicit" category="Indexing">
<name>Index Scan</name>
<confidence>0.95</confidence>
</feature>
</active-features>
</explain-plan>

Best For: Enterprise ETL tools, SOAP APIs, XML parsers

HTML

Web-based visualization with CSS styling.

EXPLAIN (FORMAT HTML) SELECT * FROM users WHERE id = 1;

Generates complete HTML page with:

  • Responsive design
  • Color-coded nodes
  • Interactive elements
  • Print-friendly layout

Best For: Dashboards, reports, presentations

GraphViz (DOT)

Generate diagrams with GraphViz.

EXPLAIN (FORMAT GRAPHVIZ) SELECT * FROM users u
JOIN orders o ON u.id = o.customer_id;
digraph ExplainPlan {
rankdir=TB;
node [shape=box, style=rounded];
node_0 [label="Hash Join\nCost: 250.00", fillcolor="#ffccbc", style=filled];
node_1 [label="TableScan\nTable: users", fillcolor="#bbdefb", style=filled];
node_2 [label="IndexScan\nTable: orders", fillcolor="#bbdefb", style=filled];
node_0 -> node_1 [label="left"];
node_0 -> node_2 [label="right"];
}

Render with: dot -Tpng explain.dot -o plan.png

Best For: Architecture diagrams, documentation, presentations

Mermaid

Markdown-based diagrams.

EXPLAIN (FORMAT MERMAID) SELECT * FROM users u
JOIN orders o ON u.id = o.customer_id;
graph TB
summary["EXPLAIN PLAN<br/>Cost: 250.00<br/>Rows: 10000"]
summary:::summaryStyle
node0["Hash Join"]
node0:::joinStyle
node1["TableScan<br/>Table: users"]
node1:::scanStyle
node2["IndexScan<br/>Table: orders"]
node2:::scanStyle
summary --> node0
node0 -->|left| node1
node0 -->|right| node2
classDef summaryStyle fill:#e8f5e9,stroke:#4caf50
classDef scanStyle fill:#bbdefb,stroke:#2196f3
classDef joinStyle fill:#ffccbc,stroke:#ff5722

Best For: GitHub/GitLab documentation, Markdown files


Query Optimization Workflow

Step 1: Identify Slow Queries

-- Find slow queries in production
SELECT query, total_time, calls, mean_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;

Step 2: EXPLAIN the Query

EXPLAIN ANALYZE (WHY_NOT, FORMAT TEXT)
<your slow query>;

Step 3: Check Statistics

STATISTICS WARNINGS:
Table: orders - Last analyzed 45 days ago
Action: ANALYZE orders;

Fix: Run ANALYZE

ANALYZE orders;

Step 4: Review Active Features

Active Features (1):
✓ Sequential Scan (IMPLICIT)
Should use index instead!

Fix: Check Why Not Analysis

Step 5: Apply Suggestions

OPTIMIZATION SUGGESTIONS:
1. Create index on customer_id
CREATE INDEX idx_orders_customer ON orders(customer_id);

Fix: Create recommended index

CREATE INDEX idx_orders_customer ON orders(customer_id);
ANALYZE orders; -- Update statistics

Step 6: Verify Improvement

EXPLAIN ANALYZE
<your query>;

Expected: Lower cost, index scan used


Best Practices

1. Regular Statistics Updates

-- Automated statistics collection
CREATE EXTENSION pg_cron;
SELECT cron.schedule('nightly-analyze', '0 2 * * *', $$
ANALYZE VERBOSE;
$$);

2. Monitor Query Plans

-- Track plan changes over time
CREATE TABLE query_plan_history (
query_id TEXT,
plan_hash TEXT,
estimated_cost NUMERIC,
captured_at TIMESTAMP DEFAULT NOW()
);
-- Log explain output
INSERT INTO query_plan_history (query_id, plan_hash, estimated_cost)
SELECT
md5(query),
md5(explain_plan),
(explain_json->>'estimated_cost')::numeric
FROM pg_stat_statements;

3. Index Maintenance

-- Find unused indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexname NOT LIKE 'pg_toast%'
ORDER BY pg_relation_size(indexrelid) DESC;
-- Drop if truly unused
DROP INDEX IF EXISTS unused_index_name;

4. Configuration Tuning

-- For OLTP (many small queries)
SET work_mem = '64MB';
SET random_page_cost = 1.1; -- SSD storage
SET effective_cache_size = '8GB';
-- For OLAP (complex analytical queries)
SET work_mem = '512MB';
SET max_parallel_workers_per_gather = 4;
SET enable_partitionwise_join = ON;

5. Query Patterns to Avoid

Avoid:

-- Function on indexed column
SELECT * FROM users WHERE LOWER(email) = 'test@example.com';
-- Leading wildcard
SELECT * FROM products WHERE name LIKE '%phone%';
-- OR conditions preventing index use
SELECT * FROM orders WHERE status = 'pending' OR status = 'processing';

Instead:

-- Use functional index or normalize data
SELECT * FROM users WHERE email = 'test@example.com';
-- Use full-text search
SELECT * FROM products WHERE to_tsvector(name) @@ to_tsquery('phone');
-- Rewrite as UNION
SELECT * FROM orders WHERE status = 'pending'
UNION ALL
SELECT * FROM orders WHERE status = 'processing';

Troubleshooting

Problem: Index Not Being Used

Symptoms:

  • EXPLAIN shows Sequential Scan
  • Query is slow despite having index

Diagnosis:

EXPLAIN ANALYZE (WHY_NOT)
SELECT * FROM users WHERE email = 'test@example.com';

Common Causes:

  1. Stale Statistics

    Fix: ANALYZE users;
  2. Low Selectivity

    Index used only if < 20% of rows match
    Fix: Add more selective predicates
  3. Data Type Mismatch

    -- Wrong: id is INTEGER
    SELECT * FROM users WHERE id = '123';
    -- Right:
    SELECT * FROM users WHERE id = 123;
  4. Function on Column

    -- Wrong:
    SELECT * FROM users WHERE LOWER(email) = 'test@example.com';
    -- Right: Create functional index
    CREATE INDEX idx_users_email_lower ON users(LOWER(email));

Problem: Wrong Join Order

Symptoms:

  • Hash join on large tables first
  • Cardinality mismatches

Diagnosis:

EXPLAIN ANALYZE
SELECT * FROM a
JOIN b ON a.id = b.a_id
JOIN c ON b.id = c.b_id;

Fix:

-- Update statistics
ANALYZE a, b, c;
-- Force different join order (last resort)
SET join_collapse_limit = 1;
SELECT * FROM a
JOIN (b JOIN c ON b.id = c.b_id) ON a.id = b.a_id;

Problem: Parallel Query Not Activating

Symptoms:

  • Expected parallel workers: 4
  • Actual parallel workers: 0

Common Causes:

  1. Table Too Small

    -- Check table size
    SELECT pg_size_pretty(pg_relation_size('table_name'));
    -- Adjust threshold
    SET min_parallel_table_scan_size = '1MB';
  2. Insufficient Workers

    SHOW max_parallel_workers;
    SHOW max_parallel_workers_per_gather;
    -- Increase if needed
    SET max_parallel_workers = 8;
  3. Query Too Cheap

    -- Parallel setup has overhead
    SHOW parallel_setup_cost; -- Default: 1000
    -- Lower for testing (not recommended in production)
    SET parallel_setup_cost = 100;

Advanced Topics

Custom Statistics

-- Create extended statistics for correlated columns
CREATE STATISTICS orders_customer_date_stats (dependencies)
ON customer_id, order_date FROM orders;
ANALYZE orders;

Query Hints (Experimental)

-- Force index usage (PostgreSQL-style hint)
SELECT /*+ IndexScan(users idx_users_email) */
* FROM users WHERE email = 'test@example.com';
-- Force join order
SELECT /*+ Leading((a b) c) */
* FROM a, b, c WHERE ...;

Plan Stability

-- Freeze a good plan
CREATE EXTENSION pg_plan_freeze;
SELECT pg_plan_freeze('slow_query_md5_hash', 'good_plan_hash');

Cost Model Tuning

-- For SSD storage
SET random_page_cost = 1.1;
SET seq_page_cost = 1.0;
-- For HDD storage
SET random_page_cost = 4.0;
SET seq_page_cost = 1.0;
-- For in-memory data
SET random_page_cost = 0.1;
SET seq_page_cost = 0.1;

Appendix A: Configuration Reference

Complete Parameter List

ParameterDefaultRangeImpact
work_mem256MB64KB-2GBSort/Hash memory
shared_buffers1024MB128MB-40% RAMPage cache
effective_cache_size4096MB1GB-75% RAMCost calculations
random_page_cost4.00.1-10.0Index vs scan choice
cpu_tuple_cost0.010.001-0.1Processing cost
max_parallel_workers80-32Parallel queries

Feature Flags

FlagDefaultDescription
enable_simdtrueSIMD vectorization
enable_hashjointrueHash joins
enable_mergejointrueMerge joins
enable_nestlooptrueNested loop joins
enable_indexscantrueIndex scans
enable_seqscantrueSequential scans
enable_partition_pruningtruePartition pruning

Appendix B: Error Codes

CodeMessageSolution
E001Statistics too oldRun ANALYZE
E002No suitable indexCreate index
E003Cardinality mismatchUpdate statistics, check correlations
E004Out of memory (work_mem)Increase work_mem
E005Query too complexSimplify or break into CTEs

Appendix C: Glossary

  • Cardinality: Number of rows in a result set
  • Selectivity: Fraction of rows matching a predicate (0.0 to 1.0)
  • Cost: Arbitrary unit representing execution time
  • Predicate: Filter condition (WHERE clause)
  • Covering Index: Index containing all needed columns
  • Sequential Scan: Read entire table sequentially
  • Index Scan: Read via index structure
  • Hash Join: Join using hash table
  • Merge Join: Join pre-sorted inputs
  • Nested Loop: Iterate outer, probe inner for each row

Appendix D: Further Reading

  1. PostgreSQL Query Performance Tuning
  2. Use The Index, Luke!
  3. HeliosDB Architecture Documentation
  4. Query Optimizer Internals

Document Version: 1.0 HeliosDB Version: 7.0 Feature Status: Phase 1 Complete (100%) Last Updated: 2025-11-17