Home Blog Page 4

Using Constraints in SQL: NOT NULL, UNIQUE, CHECK, and DEFAULT

0
sql course

Table of Contents

  1. Introduction
  2. What Are Constraints in SQL?
  3. Why Use Constraints?
  4. NOT NULL Constraint
  5. UNIQUE Constraint
  6. CHECK Constraint
  7. DEFAULT Constraint
  8. Combining Multiple Constraints
  9. Adding Constraints to Existing Tables
  10. Dropping Constraints
  11. Naming Constraints for Clarity
  12. Column-Level vs Table-Level Constraints
  13. Real-World Example: Enforcing Data Validity
  14. Constraint Violation Errors
  15. Performance Impact of Constraints
  16. Difference Between UNIQUE and PRIMARY KEY
  17. Best Practices for Using Constraints
  18. Using Constraints in Insert Statements
  19. Compatibility in Different SQL Dialects
  20. Summary and What’s Next

1. Introduction

Constraints in SQL are rules applied to columns to enforce the correctness, validity, and integrity of the data stored in tables. They prevent bad data from entering your database and reduce the need for application-level validations.


2. What Are Constraints in SQL?

A constraint is a rule that restricts the values that can be stored in a column or set of columns. They are enforced by the database engine automatically.


3. Why Use Constraints?

  • Prevent invalid or inconsistent data
  • Maintain relational integrity
  • Reduce bugs caused by bad input
  • Ensure business logic enforcement at the database level
  • Improve confidence in data quality

4. NOT NULL Constraint

The NOT NULL constraint ensures that a column must always have a value — it cannot be NULL.

sqlCopyEditCREATE TABLE users (
  id INT PRIMARY KEY,
  name VARCHAR(100) NOT NULL
);

If you try to insert a row without a name, it will throw an error.


5. UNIQUE Constraint

The UNIQUE constraint ensures that all values in a column (or combination of columns) are distinct.

sqlCopyEditCREATE TABLE users (
  email VARCHAR(255) UNIQUE
);

You can also create multi-column uniqueness:

sqlCopyEditCONSTRAINT unique_user_email UNIQUE (user_id, email)

Unlike PRIMARY KEY, a table can have multiple UNIQUE constraints, and NULL values are usually allowed (depends on RDBMS).


6. CHECK Constraint

The CHECK constraint validates that data meets a specific condition.

sqlCopyEditCREATE TABLE employees (
  salary INT,
  CHECK (salary >= 0)
);

If someone tries to insert a negative salary, the operation fails.

You can also use complex conditions:

sqlCopyEditCHECK (status IN ('active', 'inactive', 'suspended'))

7. DEFAULT Constraint

The DEFAULT constraint assigns a default value if no value is specified during insertion.

sqlCopyEditCREATE TABLE users (
  is_active BOOLEAN DEFAULT TRUE
);

Now, if is_active is omitted, it will default to TRUE.


8. Combining Multiple Constraints

You can combine constraints on the same column:

sqlCopyEditCREATE TABLE accounts (
  account_id INT PRIMARY KEY,
  balance DECIMAL(10, 2) NOT NULL CHECK (balance >= 0) DEFAULT 0.00
);

This ensures no nulls, no negative balances, and initializes to zero.


9. Adding Constraints to Existing Tables

Use ALTER TABLE to add constraints later:

sqlCopyEditALTER TABLE users
ADD CONSTRAINT chk_age CHECK (age >= 18);

For NOT NULL:

sqlCopyEditALTER TABLE users
MODIFY age INT NOT NULL; -- MySQL

Or

sqlCopyEditALTER TABLE users
ALTER COLUMN age SET NOT NULL; -- PostgreSQL

10. Dropping Constraints

Constraints can be removed with ALTER TABLE:

sqlCopyEditALTER TABLE users
DROP CONSTRAINT chk_age; -- PostgreSQL, SQL Server

-- MySQL for UNIQUE
ALTER TABLE users
DROP INDEX email;

You must know the constraint name in some RDBMSs.


11. Naming Constraints for Clarity

Give custom names to constraints:

sqlCopyEditCONSTRAINT chk_positive_balance CHECK (balance >= 0)

Helps with debugging and managing schema migrations.


12. Column-Level vs Table-Level Constraints

  • Column-level: Defined right after the column name
  • Table-level: Defined after all columns

Example:

sqlCopyEdit-- Column-level
salary DECIMAL(10,2) CHECK (salary > 0)

-- Table-level
CHECK (salary > 0)

Use table-level for constraints involving multiple columns.


13. Real-World Example: Enforcing Data Validity

sqlCopyEditCREATE TABLE registrations (
  id SERIAL PRIMARY KEY,
  email VARCHAR(255) UNIQUE NOT NULL,
  age INT CHECK (age >= 18),
  registered_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Ensures:

  • Email is unique and not null
  • Age is at least 18
  • Default registration timestamp is set

14. Constraint Violation Errors

If you violate a constraint, the database throws an error like:

  • ERROR: null value in column "name" violates not-null constraint
  • ERROR: duplicate key value violates unique constraint
  • ERROR: new row for relation violates check constraint

Always validate data before insertion or use TRY-CATCH or equivalent in application logic.


15. Performance Impact of Constraints

Constraints have minimal read overhead, but:

  • May slow inserts/updates due to validation
  • Improve query optimization (e.g., using UNIQUE for index)
  • Help databases infer cardinality during planning

16. Difference Between UNIQUE and PRIMARY KEY

FeaturePRIMARY KEYUNIQUE
Null allowed?NoYes (usually, varies)
Count per tableOneMultiple
Index behaviorAutomatically indexedAutomatically indexed

17. Best Practices for Using Constraints

  • Always use NOT NULL unless nulls are required
  • Define DEFAULT values for optional fields
  • Use CHECK to enforce business rules early
  • Name constraints clearly for future reference
  • Document constraint logic as part of schema design

18. Using Constraints in Insert Statements

sqlCopyEditINSERT INTO users (name) VALUES ('Alice'); -- `is_active` defaults to TRUE

Will fail if:

  • name is null and NOT NULL is enforced
  • A duplicate email is inserted under UNIQUE

19. Compatibility in Different SQL Dialects

ConstraintPostgreSQLMySQLSQL ServerOracle
NOT NULL
UNIQUE
CHECK⚠️ (enforced only in InnoDB 8+)
DEFAULT

20. Summary and What’s Next

SQL constraints like NOT NULL, UNIQUE, CHECK, and DEFAULT are essential tools for enforcing data integrity at the schema level. When used correctly, they reduce bugs, improve consistency, and push validation into the database layer.

Transactions in SQL: BEGIN, COMMIT, and ROLLBACK Explained

0
sql course

Table of Contents

  1. Introduction
  2. What Is a Transaction in SQL?
  3. Why Transactions Matter
  4. The ACID Properties
  5. Transaction Control Commands
  6. BEGIN or START TRANSACTION
  7. COMMIT – Finalizing Changes
  8. ROLLBACK – Undoing Changes
  9. Autocommit Behavior
  10. Nested Transactions and Savepoints
  11. Using SAVEPOINT and ROLLBACK TO
  12. Real-World Example: Bank Transfers
  13. Real-World Example: Inventory Management
  14. Transactions in Different RDBMS
  15. Isolation Levels Explained
  16. Common Errors and Misunderstandings
  17. Performance Considerations with Transactions
  18. Best Practices for Transaction Management
  19. When Not to Use Transactions
  20. Summary and What’s Next

1. Introduction

Transactions are the backbone of reliable database operations. They ensure that a series of SQL operations either complete entirely or not at all — preventing partial updates and ensuring consistency.


2. What Is a Transaction in SQL?

A transaction is a group of one or more SQL statements executed as a single unit. It guarantees that either all changes are applied or none are.


3. Why Transactions Matter

  • Prevents data corruption
  • Ensures consistency in business logic
  • Handles failures gracefully
  • Supports safe concurrent access

4. The ACID Properties

PropertyDescription
AtomicityAll or nothing – either full success or full undo
ConsistencyDatabase remains in a valid state
IsolationTransactions don’t interfere with each other
DurabilityCommitted changes persist even after crashes

5. Transaction Control Commands

CommandDescription
BEGIN / START TRANSACTIONStarts a new transaction
COMMITSaves all changes made in the transaction
ROLLBACKUndoes all changes made since last BEGIN

6. BEGIN or START TRANSACTION

Starts a new transaction block:

sqlCopyEditBEGIN; -- or START TRANSACTION;

After this point, changes are not visible to others until COMMIT.


7. COMMIT – Finalizing Changes

sqlCopyEditCOMMIT;

This makes all changes permanent and visible to others. Once committed, changes cannot be rolled back.


8. ROLLBACK – Undoing Changes

sqlCopyEditROLLBACK;

Reverts all changes made since the transaction began. Useful for error handling or unexpected input.


9. Autocommit Behavior

Most RDBMSs have autocommit enabled by default:

  • Every individual SQL statement is treated as a transaction
  • Use SET autocommit = 0; (MySQL) or explicitly call BEGIN; to override

10. Nested Transactions and Savepoints

Some databases support savepoints to simulate nested transactions.

sqlCopyEditBEGIN;

SAVEPOINT sp1;

-- Do something

ROLLBACK TO sp1; -- Rolls back only to the savepoint

COMMIT;

Useful in large or conditional operations.


11. Using SAVEPOINT and ROLLBACK TO

sqlCopyEditSAVEPOINT before_update;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Something went wrong
ROLLBACK TO before_update;

-- Continue other safe actions
COMMIT;

12. Real-World Example: Bank Transfers

sqlCopyEditBEGIN;

UPDATE accounts SET balance = balance - 500 WHERE id = 1;
UPDATE accounts SET balance = balance + 500 WHERE id = 2;

COMMIT;

If one update fails, use ROLLBACK to prevent inconsistent funds transfer.


13. Real-World Example: Inventory Management

sqlCopyEditBEGIN;

UPDATE products SET stock = stock - 1 WHERE id = 101;
INSERT INTO order_items (order_id, product_id, quantity) VALUES (999, 101, 1);

COMMIT;

Prevents deducting stock if insert fails.


14. Transactions in Different RDBMS

RDBMSTransaction Command
MySQLSTART TRANSACTION, COMMIT, ROLLBACK
PostgreSQLBEGIN, COMMIT, ROLLBACK
SQL ServerBEGIN TRAN, COMMIT TRAN, ROLLBACK TRAN
OracleImplicit transactions; uses COMMIT / ROLLBACK

15. Isolation Levels Explained

Control how concurrent transactions interact:

LevelDescription
READ UNCOMMITTEDCan read uncommitted data (dirty reads)
READ COMMITTEDCan only read committed data
REPEATABLE READEnsures same rows can’t change mid-transaction
SERIALIZABLEHighest isolation; fully serial execution
sqlCopyEditSET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

16. Common Errors and Misunderstandings

MistakeIssue
Forgetting COMMITLocks held, changes lost on disconnect
Using ROLLBACK too lateMay not undo already committed changes
Nested BEGINsNot supported in most RDBMS without savepoints

17. Performance Considerations with Transactions

  • Keep transactions short
  • Long transactions hold locks and reduce concurrency
  • Use appropriate isolation level to avoid blocking
  • Don’t open transactions without eventually committing or rolling back

18. Best Practices for Transaction Management

  • Always handle exceptions — use ROLLBACK on error
  • Use transactions for units of work that must succeed together
  • Wrap logic in stored procedures when needed
  • Avoid user interaction inside a transaction block
  • Test rollback paths during development

19. When Not to Use Transactions

  • For read-only SELECTs (unless in complex consistency scenarios)
  • For autonomous logging/auditing actions that must always persist
  • For non-critical operations that can fail independently

20. Summary and What’s Next

Transactions are fundamental to reliable and consistent SQL applications. By using BEGIN, COMMIT, and ROLLBACK wisely, you ensure that your application logic maintains data integrity — even in the face of failures or concurrency.

Performance Tuning in SQL: Indexes and Query Optimization

0
sql course

Table of Contents

  1. Introduction
  2. Why Performance Tuning Matters
  3. What Is Query Optimization?
  4. What Is an Index in SQL?
  5. Types of Indexes
  6. How Indexes Work Internally
  7. Benefits of Using Indexes
  8. When Not to Use Indexes
  9. Creating Indexes
  10. Composite Indexes
  11. Covering Indexes
  12. Using EXPLAIN or EXPLAIN PLAN
  13. Optimizing SELECT Queries
  14. Optimizing Joins
  15. Optimizing WHERE Clauses
  16. Avoiding SELECT *
  17. Using Appropriate Data Types
  18. Understanding Query Execution Plans
  19. Real-World Indexing Examples
  20. Summary and What’s Next

1. Introduction

As your database grows, query performance becomes critical. Poorly written queries and missing indexes can slow down applications. SQL performance tuning ensures that queries run efficiently, reducing load and improving response time.


2. Why Performance Tuning Matters

  • Faster query execution
  • Better user experience
  • Reduced resource consumption
  • Increased database scalability
  • Lower hardware costs

3. What Is Query Optimization?

Query optimization is the process of analyzing SQL queries and modifying them (or the database) to improve execution speed without changing output results.


4. What Is an Index in SQL?

An index is a data structure that speeds up the retrieval of rows from a table, much like a book index helps you find information quickly without reading the whole book.


5. Types of Indexes

Index TypeDescription
Single-columnIndex on one column
CompositeIndex on two or more columns
UniquePrevents duplicate values in the column(s)
Full-textOptimized for searching text
ClusteredAlters the way records are stored (SQL Server)
Non-clusteredStandard indexing without affecting storage order

6. How Indexes Work Internally

Most RDBMSs implement indexes using B-trees or hash tables. When a query uses a column with an index, the database searches the index tree rather than scanning the full table.


7. Benefits of Using Indexes

  • Speed up searches using WHERE, JOIN, ORDER BY
  • Improve sorting performance
  • Enable fast lookups for foreign keys
  • Help in aggregates and filtering

8. When Not to Use Indexes

  • On very small tables — full scan is faster
  • On columns with high update/delete frequency — may slow writes
  • On columns with low selectivity (e.g., boolean flags)

9. Creating Indexes

sqlCopyEdit-- Single-column index
CREATE INDEX idx_lastname ON employees(last_name);

-- Composite index
CREATE INDEX idx_dept_salary ON employees(department_id, salary);

Always name your indexes meaningfully for maintenance.


10. Composite Indexes

Use when queries filter or sort by multiple columns:

sqlCopyEdit-- Helps with queries using department and salary together
CREATE INDEX idx_dept_salary ON employees(department_id, salary);

Note: Order matters. An index on (department_id, salary) will help with:

  • WHERE department_id = ?
  • WHERE department_id = ? AND salary = ?

But not with:

  • WHERE salary = ? alone

11. Covering Indexes

A covering index includes all columns used in a query:

sqlCopyEdit-- Covers both WHERE and SELECT columns
CREATE INDEX idx_cover ON orders(order_date, status, total);

Avoids accessing the actual table — improves performance dramatically.


12. Using EXPLAIN or EXPLAIN PLAN

Use EXPLAIN to analyze how a query will execute:

sqlCopyEditEXPLAIN SELECT * FROM orders WHERE customer_id = 1001;

Key info to look for:

  • type: shows how rows are accessed (e.g., ALL, index, ref)
  • key: which index is used
  • rows: how many rows are examined

In PostgreSQL:

sqlCopyEditEXPLAIN ANALYZE SELECT * FROM products WHERE price > 100;

13. Optimizing SELECT Queries

  • Avoid SELECT * — only select needed columns
  • Use proper WHERE filters
  • Ensure filter columns are indexed
  • Use limits for pagination and test queries

14. Optimizing Joins

  • Ensure join keys are indexed
  • Use INNER JOIN when possible
  • Minimize row counts before joining large tables
  • Use explicit JOIN syntax instead of old-style comma joins

15. Optimizing WHERE Clauses

  • Use indexed columns in WHERE
  • Avoid using functions on columns:
sqlCopyEdit-- BAD: function on column prevents index use
WHERE YEAR(order_date) = 2024

-- GOOD: rewrite using range
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'
  • Avoid OR conditions; use UNION or rewrite with IN

16. Avoiding SELECT *

sqlCopyEdit-- BAD
SELECT * FROM users;

-- GOOD
SELECT id, name, email FROM users;

Benefits:

  • Reduces memory use
  • Improves performance
  • Avoids unnecessary data transfer
  • Ensures index coverage in some queries

17. Using Appropriate Data Types

  • Use smallest possible type (e.g., TINYINT instead of INT)
  • Avoid overusing TEXT, BLOB, VARCHAR(MAX)
  • Normalize repeated data where necessary
  • Ensure consistent types in join/filter conditions

18. Understanding Query Execution Plans

Learn to interpret key parts of EXPLAIN output:

ElementMeaning
type = ALLFull table scan — optimize this!
key = NULLNo index used
rowsEstimated rows scanned
ExtraLook for “Using where”, “Using temporary”, etc.

19. Real-World Indexing Examples

Speeding up a dashboard:

sqlCopyEditCREATE INDEX idx_orders_date ON orders(order_date);

Optimizing user login:

sqlCopyEditCREATE INDEX idx_login ON users(email, password_hash);

Accelerating product filters:

sqlCopyEditCREATE INDEX idx_price_stock ON products(price, in_stock);

20. Summary and What’s Next

SQL performance tuning through indexes and query optimization is essential for building fast, scalable applications. Mastering indexing, writing efficient queries, and using tools like EXPLAIN allows you to reduce latency and improve the overall health of your database.

Using LEAD() and LAG() in SQL for Comparative Queries

0
sql course

Table of Contents

  1. Introduction
  2. What Are LEAD() and LAG() Functions?
  3. Why Use These Functions?
  4. Syntax of LEAD() and LAG()
  5. Basic LAG() Example
  6. Basic LEAD() Example
  7. Comparing Current and Previous/Next Rows
  8. Using PARTITION BY in LEAD()/LAG()
  9. Using ORDER BY Correctly
  10. Calculating Differences (Growth, Change)
  11. Real-World Example: Sales Trends Over Time
  12. Real-World Example: Employee Salary Change
  13. Handling NULLs in Missing Rows
  14. Using LAG()/LEAD() with Multiple Columns
  15. Nesting with CASE for Conditional Differences
  16. Performance Considerations
  17. Common Mistakes and Debugging Tips
  18. Best Practices for Comparative Window Queries
  19. Other Useful Lead-Lag Variants (FIRST_VALUE, LAST_VALUE)
  20. Summary and What’s Next

1. Introduction

When analyzing time series or sequential data, we often need to compare a row with its previous or next row. SQL’s LEAD() and LAG() window functions make such comparisons easy and elegant — without writing complex subqueries or joins.


2. What Are LEAD() and LAG() Functions?

  • LAG() accesses data from a previous row
  • LEAD() accesses data from a following row

Both functions are part of SQL’s window functions category and help track changes, trends, and comparisons across rows.


3. Why Use These Functions?

  • Calculate month-over-month sales change
  • Find salary change between job roles
  • Track movement in rankings
  • Compare consecutive values in logs or events

4. Syntax of LEAD() and LAG()

sqlCopyEditLEAD(column, offset, default) OVER (
  PARTITION BY column
  ORDER BY column
)

LAG(column, offset, default) OVER (
  PARTITION BY column
  ORDER BY column
)
  • offset = how many rows ahead/behind (default: 1)
  • default = value if there’s no row (e.g., first/last row)

5. Basic LAG() Example

sqlCopyEditSELECT name, salary,
  LAG(salary, 1) OVER (ORDER BY hire_date) AS previous_salary
FROM employees;

Adds a column showing the previous employee’s salary by hire date.


6. Basic LEAD() Example

sqlCopyEditSELECT name, salary,
  LEAD(salary, 1) OVER (ORDER BY hire_date) AS next_salary
FROM employees;

Adds a column showing the next employee’s salary.


7. Comparing Current and Previous/Next Rows

sqlCopyEditSELECT name, salary,
  salary - LAG(salary) OVER (ORDER BY hire_date) AS salary_diff
FROM employees;

Calculates difference from the previous row.


8. Using PARTITION BY in LEAD()/LAG()

sqlCopyEditSELECT department, name, salary,
  LAG(salary) OVER (PARTITION BY department ORDER BY salary DESC) AS prev_salary
FROM employees;

Applies comparison within each department.


9. Using ORDER BY Correctly

ORDER BY defines the row sequence for comparison.
Wrong ordering leads to incorrect lead/lag logic.

sqlCopyEdit-- Correct: Compare salaries over time
ORDER BY hire_date

-- Wrong: Alphabetical order may not reflect real timeline
ORDER BY name

10. Calculating Differences (Growth, Change)

sqlCopyEditSELECT product_id, month, sales,
  sales - LAG(sales) OVER (PARTITION BY product_id ORDER BY month) AS change
FROM monthly_sales;

This shows monthly change in sales per product.


11. Real-World Example: Sales Trends Over Time

sqlCopyEditSELECT store_id, sale_date, revenue,
  revenue - LAG(revenue) OVER (PARTITION BY store_id ORDER BY sale_date) AS revenue_diff
FROM store_sales;

Helps in building trend analysis dashboards.


12. Real-World Example: Employee Salary Change

sqlCopyEditSELECT employee_id, salary, title, effective_date,
  LAG(salary) OVER (PARTITION BY employee_id ORDER BY effective_date) AS previous_salary,
  salary - LAG(salary) OVER (PARTITION BY employee_id ORDER BY effective_date) AS raise
FROM employee_history;

Detects salary increments or demotions over time.


13. Handling NULLs in Missing Rows

Use the default argument:

sqlCopyEditLAG(salary, 1, 0) OVER (...) -- Defaults to 0 instead of NULL

This is useful when you want to calculate metrics like growth percentages safely.


14. Using LAG()/LEAD() with Multiple Columns

Each column must have its own LAG()/LEAD():

sqlCopyEditSELECT id, month,
  LAG(sales) OVER (ORDER BY month) AS prev_sales,
  LAG(profit) OVER (ORDER BY month) AS prev_profit
FROM business_data;

15. Nesting with CASE for Conditional Differences

sqlCopyEditSELECT name, department, salary,
  CASE
    WHEN salary > LAG(salary) OVER (PARTITION BY department ORDER BY salary)
    THEN 'Increased'
    WHEN salary < LAG(salary) OVER (PARTITION BY department ORDER BY salary)
    THEN 'Decreased'
    ELSE 'Same'
  END AS trend
FROM employees;

Labels each salary change as an increase, decrease, or same.


16. Performance Considerations

  • Ensure appropriate indexes on ORDER BY columns
  • Use filters or limits with large datasets
  • Avoid using LAG()/LEAD() in WHERE directly — use CTEs or subqueries

17. Common Mistakes and Debugging Tips

MistakeIssue
No ORDER BY in window clauseNo defined sequence — results will be inconsistent
Wrong PARTITION BYComparisons happen across unrelated groups
Forgetting default valueNULLs can skew results in calculations
Using in WHERE directlyWindow functions are evaluated after WHERE

18. Best Practices for Comparative Window Queries

  • Always define ORDER BY explicitly
  • Use PARTITION BY when comparing within groups
  • Alias calculated differences clearly (sales_diff, rank_gap)
  • Wrap complex lead/lag logic inside CTEs for reusability
  • Test edge cases (first/last row, NULLs)

19. Other Useful Lead-Lag Variants

  • FIRST_VALUE() – First value in window
  • LAST_VALUE() – Last value in window
  • NTH_VALUE(column, n) – nth value in the window

These work with OVER(PARTITION BY … ORDER BY …) just like LEAD() and LAG().


20. Summary and What’s Next

LEAD() and LAG() are invaluable for comparing sequential rows in SQL — perfect for detecting trends, changes, and patterns. They simplify logic that would otherwise require self-joins or subqueries, making SQL both cleaner and more powerful.

Window Functions in SQL: ROW_NUMBER(), RANK(), and DENSE_RANK() Explained

0
sql course

Table of Contents

  1. Introduction
  2. What Are Window Functions?
  3. Why Use Window Functions?
  4. Basic Syntax of a Window Function
  5. Introduction to ROW_NUMBER()
  6. Example: Using ROW_NUMBER()
  7. Introduction to RANK()
  8. Example: Using RANK()
  9. Introduction to DENSE_RANK()
  10. Example: Using DENSE_RANK()
  11. Key Differences Between the Three
  12. Using PARTITION BY with Window Functions
  13. Using ORDER BY in Window Functions
  14. Real-World Use Case: Top N Records per Group
  15. Pagination with ROW_NUMBER()
  16. Filtering First Row in Each Partition
  17. Performance Tips for Window Functions
  18. Common Errors and Debugging
  19. Best Practices for Window Function Usage
  20. Summary and What’s Next

1. Introduction

Window functions are a powerful feature in SQL that allow calculations across a set of table rows related to the current row, without collapsing the result set. Unlike aggregates, they maintain row-level granularity.


2. What Are Window Functions?

A window function performs a calculation across a set of rows (called a “window”) related to the current row. The syntax always includes the OVER() clause, which defines the window.


3. Why Use Window Functions?

  • Add ranking, running totals, moving averages
  • Get row numbers per partition
  • Compare current row with previous/next
  • Avoid subqueries for common reporting patterns

4. Basic Syntax of a Window Function

sqlCopyEditfunction_name() OVER (
  PARTITION BY column1
  ORDER BY column2
)

You can use it in the SELECT clause like any other expression.


5. Introduction to ROW_NUMBER()

ROW_NUMBER() assigns a unique sequential number to each row within a partition, ordered by a specified column.


6. Example: Using ROW_NUMBER()

sqlCopyEditSELECT name, department, salary,
  ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees;

This gives each employee a unique rank within their department, sorted by salary.


7. Introduction to RANK()

RANK() assigns a ranking within a partition, but ties get the same rank, and it skips numbers after ties.


8. Example: Using RANK()

sqlCopyEditSELECT name, department, salary,
  RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS rank
FROM employees;

If two employees have the same salary, they get the same rank, and the next rank is skipped.


9. Introduction to DENSE_RANK()

DENSE_RANK() is similar to RANK() but doesn’t skip ranks after ties.


10. Example: Using DENSE_RANK()

sqlCopyEditSELECT name, department, salary,
  DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dense_rank
FROM employees;

If two employees tie, they share the same rank, and the next rank increments by one, not skipping any numbers.


11. Key Differences Between the Three

FunctionUnique?Ties?Gaps in Rank?
ROW_NUMBER()YesNo tiesNo
RANK()NoTies shareYes
DENSE_RANK()NoTies shareNo

12. Using PARTITION BY with Window Functions

Partitioning divides rows into subsets:

sqlCopyEdit... OVER (PARTITION BY department ORDER BY salary DESC)

Calculates rankings within each department.


13. Using ORDER BY in Window Functions

Determines the sequence of numbering:

sqlCopyEdit... OVER (ORDER BY order_date)

Without ORDER BY, row numbering becomes arbitrary and unpredictable.


14. Real-World Use Case: Top N Records per Group

sqlCopyEditWITH ranked_sales AS (
  SELECT employee_id, region, amount,
    RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rk
  FROM sales
)
SELECT * FROM ranked_sales WHERE rk <= 3;

Returns top 3 sales per region.


15. Pagination with ROW_NUMBER()

sqlCopyEditWITH paginated AS (
  SELECT *, ROW_NUMBER() OVER (ORDER BY id) AS row_num
  FROM products
)
SELECT * FROM paginated WHERE row_num BETWEEN 11 AND 20;

Perfect for implementing offset-based pagination.


16. Filtering First Row in Each Partition

sqlCopyEditWITH ranked_employees AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
  FROM employees
)
SELECT * FROM ranked_employees WHERE rn = 1;

Returns highest-paid employee from each department.


17. Performance Tips for Window Functions

  • Use indexes on ORDER BY columns
  • Limit result sets when possible
  • Avoid using window functions in filters directly — wrap in CTEs
  • Window functions are faster than correlated subqueries in most cases

18. Common Errors and Debugging

ErrorCause
Window function without OVER()Syntax error
Using window function in WHERENot allowed — use CTE or subquery instead
Unexpected resultsCheck PARTITION BY and ORDER BY logic

19. Best Practices for Window Function Usage

  • Always use PARTITION BY if ranking within a group
  • Use meaningful aliases (row_num, rank, etc.)
  • Test RANK() vs DENSE_RANK() for ties
  • Wrap window logic in a CTE for reusability and clarity
  • Never assume row order without explicit ORDER BY

20. Summary and What’s Next

Window functions like ROW_NUMBER(), RANK(), and DENSE_RANK() empower SQL developers to write advanced analytics queries without subqueries. They’re indispensable for ranking, pagination, and grouped calculations.