Table of Contents
- Introduction
- Why SQL Performance Tuning Matters
- What Is EXPLAIN PLAN?
- Basic Usage of EXPLAIN PLAN (PostgreSQL, MySQL, Oracle)
- Understanding Query Execution Stages
- Interpreting Rows, Cost, Width, Time
- Common Operators in Query Plans
- Seq Scan vs Index Scan
- Join Types and Their Cost Implications
- Sorting, Filtering, Aggregation Overheads
- Using ANALYZE for Actual Execution Times
- Detecting Missing Indexes
- Query Rewriting for Optimization
- Index Selection and Composite Indexes
- Table Statistics and Autovacuum (PostgreSQL)
- Partitioning and Clustering for Performance
- Caching and Materialized Views
- Performance Pitfalls and Anti-Patterns
- Real-World Tuning Example
- Summary and What’s Next
1. Introduction
No matter how well-designed your database is, inefficient SQL queries can lead to performance bottlenecks. EXPLAIN PLAN is a diagnostic tool that helps you understand how a query is executed, and how to improve it.
2. Why SQL Performance Tuning Matters
- Faster query execution
- Reduced cloud/data warehouse costs
- Better user experience for dashboards
- Scalability for growing data volumes
- Helps meet SLAs in production systems
3. What Is EXPLAIN PLAN?
EXPLAIN (or EXPLAIN PLAN in Oracle) is a SQL command that shows the query execution strategy chosen by the database optimizer.
It tells you:
- Which indexes (if any) are used
- How rows are scanned or filtered
- The expected cost and row count
- The execution order of joins and filters
4. Basic Usage of EXPLAIN PLAN
PostgreSQL:
EXPLAIN SELECT * FROM orders WHERE customer_id = 123;
MySQL:
EXPLAIN SELECT * FROM orders WHERE customer_id = 123;
Oracle:
EXPLAIN PLAN FOR SELECT * FROM orders WHERE customer_id = 123;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Use EXPLAIN ANALYZE (PostgreSQL) to run the query and show actual performance.
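In PostgreSQL, EXPLAIN also accepts options in parentheses; for example, adding BUFFERS to a timed run reports buffer (cache) usage alongside the timings:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 123;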
5. Understanding Query Execution Stages
Typical stages include:
- Table Scan (Seq Scan or Index Scan)
- Filter
- Sort
- Join (Nested Loop, Hash Join, Merge Join)
- Aggregate (GROUP BY, COUNT, SUM)
- Output rows
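A single query can pass through most of these stages at once. A minimal sketch (the customers table and its columns are illustrative):
EXPLAIN
SELECT c.region, COUNT(*) AS order_count   -- Aggregate stage
FROM orders o
JOIN customers c ON c.id = o.customer_id   -- Join stage (Hash Join / Nested Loop)
WHERE o.total > 100                        -- Filter stage
GROUP BY c.region
ORDER BY order_count DESC;                 -- Sort stage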
6. Interpreting Rows, Cost, Width, Time
Example (PostgreSQL):
Seq Scan on orders (cost=0.00..431.00 rows=21000 width=64)
| Metric | Meaning |
| --- | --- |
| Cost | Estimated cost (startup..total) in arbitrary planner units |
| Rows | Estimated number of output rows |
| Width | Estimated row size in bytes |
| Actual Time | Real per-node timings, shown only with EXPLAIN ANALYZE |
Use these estimates to evaluate efficiency and selectivity.
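For example, a filter changes the row estimate. The plan below (numbers illustrative) shows the planner expecting only ~700 of 21,000 rows to pass the filter, a selectivity of about 3%:
EXPLAIN SELECT * FROM orders WHERE total > 1000;
Seq Scan on orders (cost=0.00..483.50 rows=700 width=64)
  Filter: (total > 1000)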
7. Common Operators in Query Plans
| Operator | Description |
| --- | --- |
| Seq Scan | Sequential scan (entire table) |
| Index Scan | Scan using an index (ordered access) |
| Index Only Scan | Index contains all required columns; no table access needed |
| Bitmap Index Scan | Efficient for multi-key filtering |
| Hash Join | Fast join for large unordered sets |
| Nested Loop | Best for small joined tables |
8. Seq Scan vs Index Scan
Seq Scan is used when:
- The table is small
- Index is not selective
- The cost of random access is too high
Index Scan is preferred when:
- Filtering on indexed columns
- Returning a few rows
EXPLAIN SELECT * FROM users WHERE id = 5;
This should show an Index Scan using pk_users.
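To check whether the planner would ever pick the index, PostgreSQL lets you discourage sequential scans for the current session; this is a diagnostic knob, not a production setting:
SET enable_seqscan = off;  -- planner now strongly prefers index access
EXPLAIN SELECT * FROM users WHERE id = 5;
RESET enable_seqscan;      -- restore the default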
9. Join Types and Their Cost Implications
| Join Type | Use Case |
| --- | --- |
| Nested Loop | Small inner table, indexed join key |
| Hash Join | Large unsorted tables |
| Merge Join | Both tables sorted on the join key |
Joins can dominate query cost — reordering tables or adding indexes may help.
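To see which strategy the planner chose, run EXPLAIN on the join itself (the customers table and region column are illustrative):
EXPLAIN
SELECT o.*
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE c.region = 'EU';  -- the top join node shows Nested Loop, Hash Join, or Merge Join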
10. Sorting, Filtering, Aggregation Overheads
Sorting is expensive on large datasets:
ORDER BY created_at DESC
Adding an index on created_at DESC can eliminate the sort phase.
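A sketch of that index (the name is illustrative):
CREATE INDEX idx_orders_created_at_desc ON orders (created_at DESC);
-- EXPLAIN should now show an Index Scan feeding the ORDER BY instead of a separate Sort node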
Aggregates like GROUP BY and HAVING may trigger hash or sort operations. Use indexes or pre-aggregated tables for heavy workloads.
11. Using ANALYZE for Actual Execution Times
EXPLAIN ANALYZE SELECT * FROM orders WHERE total > 1000;
Returns actual:
- Row count
- Execution time per step
- Number of loops
Helps detect misestimates (e.g., 10k expected rows but 1 million actual).
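With EXPLAIN ANALYZE, estimate and reality sit side by side on each node line (numbers illustrative):
Seq Scan on orders (cost=0.00..431.00 rows=10000 width=64) (actual time=0.015..320.480 rows=1000000 loops=1)
Here the planner expected 10,000 rows but 1,000,000 came back; a gap that large usually points to stale statistics (see section 15).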
12. Detecting Missing Indexes
Use EXPLAIN to find sequential scans on large tables:
EXPLAIN SELECT * FROM sales WHERE product_id = 100;
If Seq Scan appears, consider:
CREATE INDEX idx_sales_product_id ON sales(product_id);
13. Query Rewriting for Optimization
Instead of:
SELECT * FROM orders WHERE status != 'shipped';
Rewrite to:
SELECT * FROM orders WHERE status IN ('pending', 'cancelled');
Negations are hard to optimize because they rarely use an index; prefer positive filtering when the remaining value set is known (the rewrite above assumes 'pending' and 'cancelled' are the only other statuses).
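Another common rewrite: NOT IN over a subquery is NULL-sensitive and often slow, while NOT EXISTS is usually planned as an efficient anti-join. A sketch, assuming a hypothetical refunds table keyed by order_id:
-- Instead of:
SELECT * FROM orders WHERE id NOT IN (SELECT order_id FROM refunds);
-- Prefer:
SELECT * FROM orders o
WHERE NOT EXISTS (SELECT 1 FROM refunds r WHERE r.order_id = o.id);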
14. Index Selection and Composite Indexes
Composite indexes help multi-column filters:
CREATE INDEX idx_customer_order_date ON orders(customer_id, order_date);
Useful for:
SELECT * FROM orders WHERE customer_id = 123 AND order_date > '2023-01-01';
The index can be used only when the filter constrains a leftmost prefix of its columns (here, customer_id, optionally together with order_date).
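To see the prefix rule in action, compare these two plans:
EXPLAIN SELECT * FROM orders WHERE customer_id = 123;          -- can use the composite index
EXPLAIN SELECT * FROM orders WHERE order_date > '2023-01-01';  -- typically cannot; expect a Seq Scan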
15. Table Statistics and Autovacuum (PostgreSQL)
PostgreSQL’s query planner depends on table stats:
ANALYZE orders;
Autovacuum keeps stats up-to-date, but manual updates can help during large ETL jobs.
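For skewed columns the default sample can be too small; PostgreSQL lets you raise the per-column statistics target and re-analyze (the status column is illustrative):
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;  -- default target is 100
ANALYZE orders;  -- rebuild statistics with the larger sample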
16. Partitioning and Clustering for Performance
Partitioning:
CREATE TABLE sales_y2024 PARTITION OF sales FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');  -- upper bound is exclusive, so this covers all of 2024
Improves filter performance on date ranges.
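The partition above assumes a range-partitioned parent table, roughly:
-- Minimal parent definition (columns are illustrative)
CREATE TABLE sales (
    id         bigint,
    product_id int,
    amount     numeric,
    sale_date  date
) PARTITION BY RANGE (sale_date);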
Clustering:
CLUSTER orders USING idx_orders_customer_id;
Physically organizes data by index — faster reads for ordered scans.
17. Caching and Materialized Views
Use materialized views to cache expensive queries:
CREATE MATERIALIZED VIEW top_products AS
SELECT product_id, SUM(amount) AS total_sales  -- "amount" assumed; SUM(sales) referenced the table name
FROM sales GROUP BY product_id;
Refresh as needed:
REFRESH MATERIALIZED VIEW top_products;
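To refresh without blocking readers, PostgreSQL also supports REFRESH MATERIALIZED VIEW CONCURRENTLY, which requires a unique index on the view:
CREATE UNIQUE INDEX idx_top_products_pid ON top_products (product_id);
REFRESH MATERIALIZED VIEW CONCURRENTLY top_products;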
18. Performance Pitfalls and Anti-Patterns
- SELECT * in large joins
- Functions in the WHERE clause (e.g., DATE(created_at)) prevent use of an index on the raw column
- Using DISTINCT instead of GROUP BY
- Sorting large unsorted tables
- Implicit type casts (e.g., WHERE id = '123' with id as INT; see the sketch after this list)
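A sketch of the implicit-cast pitfall, as seen in MySQL with a hypothetical indexed VARCHAR column phone:
SELECT * FROM customers WHERE phone = 5551234;    -- number vs. string: cast on every row, index unusable
SELECT * FROM customers WHERE phone = '5551234';  -- types match: index is used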
19. Real-World Tuning Example
Before:
SELECT * FROM logs WHERE DATE(timestamp) = CURRENT_DATE;
Problem:
- Disables the index on timestamp
Optimized:
SELECT * FROM logs
WHERE timestamp >= CURRENT_DATE AND timestamp < CURRENT_DATE + INTERVAL '1 day';
Now uses index — dramatically faster.
20. Summary and What’s Next
EXPLAIN PLAN is your window into SQL's execution logic. By interpreting plans, rewriting queries, and applying indexing strategies, you can significantly improve query performance and reduce compute costs.