Home Blog Page 6

String Functions in SQL: CONCAT, LENGTH, REPLACE and More

0
sql course

Table of Contents

  1. Introduction
  2. Why String Functions Matter
  3. Overview of SQL String Functions
  4. CONCAT() – Combine Strings
  5. LENGTH() – Get String Length
  6. CHAR_LENGTH() vs LENGTH()
  7. REPLACE() – Substitute Substrings
  8. UPPER() and LOWER() – Case Conversion
  9. TRIM() – Remove Leading/Trailing Spaces
  10. LTRIM() and RTRIM() – Left/Right Trimming
  11. SUBSTRING() – Extract Part of a String
  12. LEFT() and RIGHT() – Characters from Edges
  13. INSTR() or POSITION() – Find Substring Location
  14. LOCATE() in MySQL
  15. REPEAT() – Repeat Strings
  16. CONCAT_WS() – Concatenate with Separator
  17. Real-World Example: Formatting Names
  18. Real-World Example: Cleaning Phone Numbers
  19. Best Practices for String Manipulation
  20. Summary and What’s Next

1. Introduction

Strings (text data) are everywhere in databases — names, addresses, product codes, emails, etc. SQL provides string functions to manipulate, analyze, and transform this data effectively for reporting, cleaning, and business logic.


2. Why String Functions Matter

String functions let you:

  • Combine fields (like first + last name)
  • Format and clean data
  • Extract or substitute text
  • Prepare strings for display or comparison
  • Enable fuzzy matching and reporting

3. Overview of SQL String Functions

Here are some of the most commonly used string functions:

FunctionPurpose
CONCAT()Join multiple strings
LENGTH()Return number of bytes
CHAR_LENGTH()Return number of characters
REPLACE()Replace substring
TRIM()Remove whitespace
UPPER()Convert to uppercase
LOWER()Convert to lowercase
SUBSTRING()Extract part of a string

4. CONCAT() – Combine Strings

sqlCopyEditSELECT CONCAT(first_name, ' ', last_name) AS full_name
FROM employees;

You can also combine more than two strings:

sqlCopyEditSELECT CONCAT(city, ', ', state, ' - ', zip_code) AS full_address;

5. LENGTH() – Get String Length

sqlCopyEditSELECT LENGTH('SQL Rocks'); -- Returns 9 (bytes)

In some systems, LENGTH() counts bytes, not characters. For multi-byte encodings like UTF-8, use CHAR_LENGTH() instead.


6. CHAR_LENGTH() vs LENGTH()

FunctionDescription
CHAR_LENGTH()Returns character count
LENGTH()Returns byte count (may differ in UTF)

Example:

sqlCopyEditSELECT CHAR_LENGTH('नमस्ते'), LENGTH('नमस्ते');

7. REPLACE() – Substitute Substrings

sqlCopyEditSELECT REPLACE('Welcome to 2023', '2023', '2024');
-- Output: Welcome to 2024

Can also be used for cleaning data (e.g., removing dashes in phone numbers).


8. UPPER() and LOWER() – Case Conversion

sqlCopyEditSELECT UPPER(name), LOWER(email)
FROM users;

Useful for standardizing text for comparison or display.


9. TRIM() – Remove Leading/Trailing Spaces

sqlCopyEditSELECT TRIM('   Hello World   '); -- Outputs: 'Hello World'

Helps clean up user-entered data.


10. LTRIM() and RTRIM() – Trim Sides

sqlCopyEditSELECT LTRIM('   SQL');  -- Output: 'SQL'
SELECT RTRIM('SQL   ');  -- Output: 'SQL'

Trim only from the left or right.


11. SUBSTRING() – Extract Part of a String

sqlCopyEditSELECT SUBSTRING('PostgreSQL', 1, 4); -- Output: 'Post'

In MySQL: SUBSTRING(string, position, length)
In SQL Server: SUBSTRING(string, start, length)


12. LEFT() and RIGHT() – Characters from Edges

sqlCopyEditSELECT LEFT('Database', 4);  -- Output: 'Data'
SELECT RIGHT('Database', 4); -- Output: 'base'

Quickly grab a fixed number of characters from either side.


13. INSTR() or POSITION() – Find Substring Location

sqlCopyEditSELECT INSTR('SQL Tutorial', 'Tutorial'); -- Output: 5

Returns position of substring. Returns 0 if not found.


14. LOCATE() in MySQL

sqlCopyEditSELECT LOCATE('lo', 'Hello World'); -- Output: 4

LOCATE(substr, str) is MySQL’s alternative to INSTR().


15. REPEAT() – Repeat Strings

sqlCopyEditSELECT REPEAT('*', 10); -- Output: '**********'

Useful for creating placeholders, testing, or formatting.


16. CONCAT_WS() – Concatenate with Separator

sqlCopyEditSELECT CONCAT_WS('-', '2024', '05', '12'); -- Output: '2024-05-12'

WS = “With Separator”. More elegant than CONCAT() + manual separators.


17. Real-World Example: Formatting Names

sqlCopyEditSELECT CONCAT(UPPER(SUBSTRING(first_name, 1, 1)), LOWER(SUBSTRING(first_name, 2))) AS formatted_name
FROM users;

Capitalizes first letter, lowercase the rest — e.g., ‘john’ → ‘John’.


18. Real-World Example: Cleaning Phone Numbers

sqlCopyEditSELECT REPLACE(REPLACE(phone, '-', ''), ' ', '') AS cleaned_phone
FROM contacts;

Removes dashes and spaces for standardization.


19. Best Practices for String Manipulation

  • Use CHAR_LENGTH() when working with multilingual data
  • Always sanitize input before storing in string columns
  • Normalize text case using LOWER() or UPPER() for comparison
  • Avoid excessive nested string operations — readability matters
  • Consider using stored procedures or views for complex logic

20. Summary and What’s Next

SQL string functions are powerful tools for cleaning, transforming, and presenting text data. Whether you’re formatting names, building display strings, or standardizing input, mastering these functions will help you handle textual data more effectively.

Conditional Logic with CASE Statements in SQL

0
sql course

Table of Contents

  1. Introduction
  2. What is a CASE Statement?
  3. Syntax of CASE Statement
  4. Simple CASE vs Searched CASE
  5. Using CASE in SELECT
  6. Using CASE in ORDER BY
  7. Using CASE in WHERE
  8. Nested CASE Statements
  9. Using CASE with Aggregates
  10. Combining CASE with GROUP BY
  11. Real-World Example: Categorizing Salaries
  12. Real-World Example: Grading System
  13. Formatting Output with CASE
  14. Handling NULLs with CASE
  15. CASE with JOIN Conditions
  16. Limitations of CASE
  17. CASE vs IF vs IIF
  18. Common Errors in CASE Usage
  19. Best Practices for Writing CASE
  20. Summary and What’s Next

1. Introduction

The CASE statement is SQL’s way of implementing conditional logic — similar to if-else statements in programming. It allows you to return different values or apply different logic depending on specific conditions.


2. What is a CASE Statement?

A CASE statement evaluates conditions and returns a value based on the first condition that is true. It’s used to transform values, categorize data, or apply logic directly in queries.


3. Syntax of CASE Statement

Searched CASE:

sqlCopyEditCASE 
  WHEN condition1 THEN result1
  WHEN condition2 THEN result2
  ...
  ELSE default_result
END

Simple CASE:

sqlCopyEditCASE expression
  WHEN value1 THEN result1
  WHEN value2 THEN result2
  ...
  ELSE default_result
END

4. Simple CASE vs Searched CASE

TypeUse When…
Simple CASEComparing one expression to multiple values
Searched CASEUsing complex conditions or comparisons

5. Using CASE in SELECT

sqlCopyEditSELECT name,
  CASE 
    WHEN salary > 100000 THEN 'High'
    WHEN salary BETWEEN 50000 AND 100000 THEN 'Medium'
    ELSE 'Low'
  END AS salary_band
FROM employees;

Categorizes employees based on salary.


6. Using CASE in ORDER BY

sqlCopyEditSELECT name, status
FROM customers
ORDER BY 
  CASE 
    WHEN status = 'Premium' THEN 1
    WHEN status = 'Standard' THEN 2
    ELSE 3
  END;

Custom sort order based on business logic.


7. Using CASE in WHERE

Not common but possible:

sqlCopyEditSELECT * 
FROM orders
WHERE 
  CASE 
    WHEN priority = 'High' THEN shipped = 'Yes'
    ELSE 1=1
  END;

Often better to use standard WHERE + OR clauses for readability.


8. Nested CASE Statements

sqlCopyEditSELECT name,
  CASE 
    WHEN role = 'Manager' THEN 
      CASE 
        WHEN department = 'HR' THEN 'HR Manager'
        ELSE 'Manager'
      END
    ELSE 'Employee'
  END AS title
FROM staff;

Useful for multi-layered logic.


9. Using CASE with Aggregates

sqlCopyEditSELECT 
  COUNT(CASE WHEN gender = 'Male' THEN 1 END) AS males,
  COUNT(CASE WHEN gender = 'Female' THEN 1 END) AS females
FROM users;

Conditional counting using CASE inside aggregate functions.


10. Combining CASE with GROUP BY

sqlCopyEditSELECT 
  CASE 
    WHEN age < 18 THEN 'Minor'
    WHEN age BETWEEN 18 AND 65 THEN 'Adult'
    ELSE 'Senior'
  END AS age_group,
  COUNT(*) AS total
FROM citizens
GROUP BY 
  CASE 
    WHEN age < 18 THEN 'Minor'
    WHEN age BETWEEN 18 AND 65 THEN 'Adult'
    ELSE 'Senior'
  END;

Group data by dynamically derived categories.


11. Real-World Example: Categorizing Salaries

sqlCopyEditSELECT department,
  COUNT(*) AS total,
  SUM(CASE WHEN salary > 100000 THEN 1 ELSE 0 END) AS high_earners
FROM employees
GROUP BY department;

12. Real-World Example: Grading System

sqlCopyEditSELECT name,
  CASE 
    WHEN marks >= 90 THEN 'A'
    WHEN marks >= 75 THEN 'B'
    WHEN marks >= 60 THEN 'C'
    ELSE 'D'
  END AS grade
FROM students;

Transforms numeric scores into letter grades.


13. Formatting Output with CASE

sqlCopyEditSELECT name,
  CASE 
    WHEN status IS NULL THEN 'Unknown'
    ELSE status
  END AS formatted_status
FROM users;

Used to clean up NULLs or inconsistent data.


14. Handling NULLs with CASE

sqlCopyEditSELECT name,
  CASE 
    WHEN email IS NULL THEN 'Email Missing'
    ELSE email
  END AS contact_email
FROM customers;

Always consider NULL explicitly if necessary.


15. CASE with JOIN Conditions

sqlCopyEditSELECT e.name, d.name AS department
FROM employees e
JOIN departments d ON e.dept_id = 
  CASE 
    WHEN e.role = 'Intern' THEN NULL
    ELSE d.id
  END;

Avoid complex logic in JOIN conditions unless needed.


16. Limitations of CASE

  • Cannot return multiple columns
  • Cannot perform actions (e.g., DML like INSERT)
  • Complex nesting can reduce readability
  • Cannot be used outside of SQL statements (not procedural)

17. CASE vs IF vs IIF

FunctionUse InDescription
CASEStandard SQLMost flexible and portable
IFMySQLSimpler, limited to binary logic
IIFSQL ServerShorthand for two-condition CASE

18. Common Errors in CASE Usage

ErrorCause
Mixing data typesInconsistent return types in THEN clauses
Missing ENDEvery CASE must end with END
Using CASE in invalid contextOnly valid in expressions (e.g., not as standalone statements)

19. Best Practices for Writing CASE

  • Use searched CASE for complex logic
  • Always include an ELSE for unexpected values
  • Alias the CASE result with meaningful names
  • Use indentation for readability
  • Avoid nesting more than two levels deep

20. Summary and What’s Next

The CASE statement is SQL’s powerful conditional logic tool, allowing you to transform, categorize, and compute values dynamically. Whether you’re cleaning data, computing derived columns, or customizing output, mastering CASE enhances the flexibility of your queries.

Nested Subqueries and Derived Tables in SQL

0
sql course

Table of Contents

  1. Introduction
  2. What is a Nested Subquery?
  3. What is a Derived Table?
  4. Key Differences Between Nested Subqueries and Derived Tables
  5. When to Use Nested Subqueries
  6. When to Use Derived Tables
  7. Syntax of Nested Subqueries
  8. Syntax of Derived Tables (Subquery in FROM)
  9. Scalar vs Table Subqueries
  10. Nested Subqueries in SELECT
  11. Nested Subqueries in WHERE
  12. Nested Subqueries in HAVING
  13. Derived Tables in FROM Clause
  14. Using Aliases in Derived Tables
  15. Aggregation in Derived Tables
  16. Derived Table vs CTE
  17. Performance Implications
  18. Common Mistakes to Avoid
  19. Best Practices
  20. Summary and What’s Next

1. Introduction

When working with complex datasets, we often need intermediate results within our SQL queries. Nested subqueries and derived tables allow us to write modular, readable, and efficient queries by layering logic within the same statement.


2. What is a Nested Subquery?

A nested subquery is a query inside another query. It can appear in SELECT, FROM, WHERE, or HAVING clauses, and it allows for step-by-step logical evaluations.


3. What is a Derived Table?

A derived table is a subquery used in the FROM clause, and it acts like a temporary table that you can immediately query from, join with, or filter further.


4. Key Differences Between Nested Subqueries and Derived Tables

FeatureNested SubqueryDerived Table
LocationCan be used in SELECT, WHEREUsed in the FROM clause
Alias RequiredNoYes (must assign an alias)
Reusable in same queryNoYes (acts like a table)
ReadabilityLess readable for complex logicMore modular and readable

5. When to Use Nested Subqueries

Use when:

  • Filtering based on aggregates
  • Selecting a single value (scalar subquery)
  • Comparing a value from the outer query against a result from an inner one

6. When to Use Derived Tables

Use when:

  • You need to join with intermediate results
  • You want to reuse aggregated or transformed data
  • You want to improve query structure and readability

7. Syntax of Nested Subqueries

Example:

sqlCopyEditSELECT name
FROM employees
WHERE salary > (
  SELECT AVG(salary) FROM employees
);

This returns employees earning above the average salary.


8. Syntax of Derived Tables (Subquery in FROM)

Example:

sqlCopyEditSELECT dept, avg_salary
FROM (
  SELECT department AS dept, AVG(salary) AS avg_salary
  FROM employees
  GROUP BY department
) AS dept_avg
WHERE avg_salary > 60000;

This returns departments with above-average salaries.


9. Scalar vs Table Subqueries

  • Scalar Subquery: Returns one value (used in SELECT, WHERE)
  • Table Subquery (Derived Table): Returns multiple rows and columns (used in FROM)

10. Nested Subqueries in SELECT

sqlCopyEditSELECT name,
  (SELECT AVG(salary) FROM employees) AS company_avg
FROM employees;

Shows each employee with the overall average salary.


11. Nested Subqueries in WHERE

sqlCopyEditSELECT name
FROM customers
WHERE id IN (
  SELECT customer_id FROM orders WHERE total > 500
);

Filters customers based on matching order records.


12. Nested Subqueries in HAVING

sqlCopyEditSELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > (
  SELECT AVG(salary) FROM employees
);

Filters departments with above-company-average salaries.


13. Derived Tables in FROM Clause

sqlCopyEditSELECT t.category, MAX(t.sales) AS top_sale
FROM (
  SELECT category, SUM(amount) AS sales
  FROM orders
  GROUP BY category
) AS t
GROUP BY t.category;

You can use aggregation, filtering, and joins with derived tables.


14. Using Aliases in Derived Tables

Mandatory: Every derived table must have an alias.

sqlCopyEditFROM (SELECT …) AS alias_name

Without an alias, the query will throw an error.


15. Aggregation in Derived Tables

You can create derived tables to group and summarize data:

sqlCopyEditSELECT d.dept, COUNT(*) AS num_employees
FROM (
  SELECT department AS dept
  FROM employees
  WHERE status = 'active'
) AS d
GROUP BY d.dept;

16. Derived Table vs CTE

FeatureDerived TableCommon Table Expression (CTE)
Defined inFROM clauseWITH clause
ReusabilityLimited to single instanceCan be referenced multiple times
ReadabilityMedium (embedded in FROM)High (defined upfront)
PortabilityUniversalNot supported in all older databases

17. Performance Implications

  • Nested subqueries can be slow if not optimized (especially correlated ones)
  • Derived tables are often more efficient and readable
  • Use indexes wherever possible in subqueries
  • Consider query planner hints for large subqueries in production systems

18. Common Mistakes to Avoid

MistakeImpact
Missing alias in derived tablesSyntax error
Subquery returns multiple rowsFails in scalar contexts (= instead of IN)
Correlated subqueries on large tablesVery slow queries
Unclear nesting logicConfusing or unreadable queries

19. Best Practices

  • Use derived tables for modular and clean queries
  • Alias all subqueries clearly
  • Avoid deeply nested subqueries unless necessary
  • Use scalar subqueries sparingly on large datasets
  • If complexity grows, refactor into CTEs

20. Summary and What’s Next

Nested subqueries and derived tables are invaluable for creating layered, modular queries. Use nested subqueries when filtering or computing one-off values. Use derived tables when creating reusable or joinable sets of intermediate data. Mastering both helps you handle complex SQL logic effectively.

Subqueries vs Joins: When to Use Each in SQL

0
sql course

Table of Contents

  1. Introduction
  2. What is a Subquery?
  3. What is a Join?
  4. Syntax Comparison: Subquery vs Join
  5. When to Use Subqueries
  6. When to Use Joins
  7. Types of Subqueries
  8. Types of Joins
  9. Performance Considerations
  10. Readability and Maintainability
  11. Real-World Example: Orders and Customers
  12. Filtering with Subqueries
  13. Filtering with Joins
  14. Aggregations in Subqueries vs Joins
  15. Correlated Subqueries vs Joins
  16. Nested Joins vs Nested Subqueries
  17. Combining Subqueries with Joins
  18. Common Mistakes
  19. Best Practices
  20. Summary and What’s Next

1. Introduction

In SQL, there are often multiple ways to solve the same problem. Two of the most powerful and flexible techniques are subqueries and joins. Understanding when to use each leads to cleaner, faster, and more maintainable queries.


2. What is a Subquery?

A subquery is a SQL query nested inside another query. It can return a single value, a list of values, or even a complete table.

Example:

sqlCopyEditSELECT name FROM customers
WHERE id IN (SELECT customer_id FROM orders WHERE total > 1000);

3. What is a Join?

A join combines rows from two or more tables based on a related column between them, usually a foreign key.

Example:

sqlCopyEditSELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.total > 1000;

4. Syntax Comparison: Subquery vs Join

OperationSubquery ExampleJoin Example
FilteringWHERE id IN (SELECT …)JOIN … ON … + WHERE
AggregationSELECT … FROM (SELECT AVG() FROM …)JOIN with GROUP BY and aggregate function
ExistenceWHERE EXISTS (SELECT 1 FROM …)LEFT JOIN + IS NOT NULL or COUNT(*) > 0

5. When to Use Subqueries

✅ Use subqueries when:

  • You need to filter based on an aggregated value
  • The logic is better modularized in a separate query
  • The outer query should depend on dynamic values from inner query
  • You want encapsulation for readability

6. When to Use Joins

✅ Use joins when:

  • You want to retrieve data from multiple tables
  • You need columns from more than one table
  • You’re writing reporting, dashboards, or summaries
  • You need high-performance queries (joins are often faster)

7. Types of Subqueries

TypeDescription
Scalar SubqueryReturns a single value
Column SubqueryReturns a single column of multiple rows
Table SubqueryReturns a full table (often used with FROM)
Correlated SubqueryReferences outer query in the subquery

8. Types of Joins

Join TypeDescription
INNER JOINMatching rows from both tables
LEFT JOINAll rows from left + matches from right
RIGHT JOINAll rows from right + matches from left
FULL OUTER JOINAll rows from both, matched or not
CROSS JOINCartesian product (all combinations)

9. Performance Considerations

FeatureSubqueriesJoins
SpeedSlower if not optimizedFaster with proper indexing
ExecutionOften executed row-by-rowUses set-based operations
IndexingLimited influenceIndexes highly effective

Note: Correlated subqueries can be significantly slower on large datasets.


10. Readability and Maintainability

  • Subqueries improve modularity: Ideal for breaking logic into steps
  • Joins improve transparency: Ideal for combining related entities into reports
  • Deeply nested subqueries may reduce readability
  • Multi-join queries may be hard to follow without aliases and comments

11. Real-World Example: Orders and Customers

With JOIN:

sqlCopyEditSELECT c.name, o.total
FROM customers c
JOIN orders o ON c.id = o.customer_id;

With Subquery:

sqlCopyEditSELECT name
FROM customers
WHERE id IN (SELECT customer_id FROM orders);

Choose JOIN if you need customer and order data; use a subquery if you only need to check presence or filter.


12. Filtering with Subqueries

sqlCopyEditSELECT name FROM employees
WHERE salary > (
  SELECT AVG(salary) FROM employees
);

Useful for comparing values to aggregates.


13. Filtering with Joins

sqlCopyEditSELECT e.name
FROM employees e
JOIN (
  SELECT department_id, AVG(salary) AS avg_salary
  FROM employees
  GROUP BY department_id
) dept_avg ON e.department_id = dept_avg.department_id
WHERE e.salary > dept_avg.avg_salary;

More flexible and readable for grouped comparisons.


14. Aggregations in Subqueries vs Joins

Subquery:

sqlCopyEditSELECT name
FROM products
WHERE price > (SELECT AVG(price) FROM products);

Join:

sqlCopyEditSELECT p.name
FROM products p
JOIN (
  SELECT AVG(price) AS avg_price FROM products
) avg_table ON 1=1
WHERE p.price > avg_table.avg_price;

Both return the same — choose based on preference or reusability.


15. Correlated Subqueries vs Joins

sqlCopyEdit-- Correlated subquery
SELECT name
FROM employees e
WHERE salary > (
  SELECT AVG(salary)
  FROM employees
  WHERE department = e.department
);

Join equivalent:

sqlCopyEditSELECT e.name
FROM employees e
JOIN (
  SELECT department, AVG(salary) AS avg_salary
  FROM employees
  GROUP BY department
) dept_avg ON e.department = dept_avg.department
WHERE e.salary > dept_avg.avg_salary;

Joins are usually faster and easier to debug.


16. Nested Joins vs Nested Subqueries

  • Nested joins: Multiple joins between multiple tables — great for flat data retrieval
  • Nested subqueries: Good for step-by-step data transformation, particularly when isolation is required

17. Combining Subqueries with Joins

sqlCopyEditSELECT u.name, o.total
FROM users u
JOIN (
  SELECT * FROM orders WHERE total > 500
) o ON u.id = o.user_id;

This filters orders first using a subquery, then joins to users.


18. Common Mistakes

MistakeExplanation
Using subqueries when join is betterRedundant complexity or poor performance
Forgetting correlation aliasLeads to syntax error or wrong logic
Misaligning return typesScalar subquery returning multiple values
Over-nesting subqueriesDifficult to read and maintain

19. Best Practices

  • Use joins when retrieving fields from multiple tables
  • Use subqueries for filters, aggregates, and modular logic
  • Avoid correlated subqueries for large datasets
  • Always benchmark performance when query speed matters
  • Use descriptive aliases in joins and subqueries

20. Summary and What’s Next

Both subqueries and joins are foundational SQL techniques. Subqueries shine in filtering and encapsulated logic, while joins are better for combining related data. A skilled SQL developer knows when to use which — based on performance, readability, and the task at hand.

Combining Results with UNION and UNION ALL in SQL

0
sql course

Table of Contents

  1. Introduction
  2. What is UNION in SQL?
  3. When to Use UNION
  4. Basic Syntax of UNION
  5. What is UNION ALL?
  6. Difference Between UNION and UNION ALL
  7. Requirements for Using UNION
  8. Matching Columns and Data Types
  9. Using Aliases in UNION Queries
  10. UNION with Literal Data
  11. Ordering the Final Output
  12. Filtering Results in UNION
  13. UNION with Different Table Sources
  14. Real-World Example: Employees and Contractors
  15. Performance Comparison: UNION vs UNION ALL
  16. Using DISTINCT with UNION ALL
  17. Combining UNION with Joins or Subqueries
  18. Common Mistakes with UNION
  19. Best Practices for UNION Usage
  20. Summary and What’s Next

1. Introduction

In SQL, there are situations where you need to combine results from multiple queries. Whether it’s data coming from different tables or different sources, UNION and UNION ALL help you stitch them together into a single result set.


2. What is UNION in SQL?

UNION is a SQL operator used to combine the result sets of two or more SELECT statements, removing duplicate rows.


3. When to Use UNION

  • To merge results from multiple tables with similar structure
  • To show combined data (e.g., full employee list from different departments)
  • To perform reporting or data consolidation tasks
  • To avoid multiple queries in application code

4. Basic Syntax of UNION

sqlCopyEditSELECT column1, column2 FROM table1
UNION
SELECT column1, column2 FROM table2;

5. What is UNION ALL?

UNION ALL works like UNION but does not eliminate duplicate rows. It returns all records, including duplicates.

sqlCopyEditSELECT name FROM employees
UNION ALL
SELECT name FROM contractors;

6. Difference Between UNION and UNION ALL

FeatureUNIONUNION ALL
Duplicate removalYesNo
PerformanceSlower (due to sorting)Faster
Use caseWhen unique values neededWhen preserving duplicates

7. Requirements for Using UNION

  • Number of columns in all SELECT statements must match
  • Data types must be compatible (e.g., INT with INT, VARCHAR with VARCHAR)
  • Order of columns matters

8. Matching Columns and Data Types

Valid Example:

sqlCopyEditSELECT id, name FROM employees
UNION
SELECT id, name FROM contractors;

Invalid Example (throws error):

sqlCopyEditSELECT id, name FROM employees
UNION
SELECT salary, hire_date FROM contractors;

Column meanings must align across queries.


9. Using Aliases in UNION Queries

Only the first SELECT query’s aliases are retained in the final result:

sqlCopyEditSELECT name AS full_name FROM employees
UNION
SELECT name FROM contractors;

Result column will be full_name.


10. UNION with Literal Data

sqlCopyEditSELECT 'employee' AS type, name FROM employees
UNION
SELECT 'contractor', name FROM contractors;

Adds a column to indicate data origin.


11. Ordering the Final Output

You must use ORDER BY after the last SELECT:

sqlCopyEditSELECT name FROM employees
UNION
SELECT name FROM contractors
ORDER BY name;

You cannot use ORDER BY inside each individual SELECT.


12. Filtering Results in UNION

Use WHERE clauses in individual queries:

sqlCopyEditSELECT name FROM employees WHERE status = 'active'
UNION
SELECT name FROM contractors WHERE status = 'active';

Filters are applied before combining results.


13. UNION with Different Table Sources

Even if table structures differ, you can align columns using SELECT:

sqlCopyEditSELECT id, name, NULL AS rate FROM employees
UNION
SELECT id, NULL, hourly_rate FROM freelancers;

Use NULL or constants to fill missing values.


14. Real-World Example: Employees and Contractors

sqlCopyEditSELECT id, name, 'Employee' AS role FROM employees
UNION
SELECT id, name, 'Contractor' FROM contractors;

Combines people from both tables and labels their source.


15. Performance Comparison: UNION vs UNION ALL

OperatorSpeedUse When
UNIONSlowerYou need unique rows
UNION ALLFasterDuplicates are acceptable or desired

16. Using DISTINCT with UNION ALL

You can simulate UNION using DISTINCT:

sqlCopyEditSELECT DISTINCT * FROM (
  SELECT name FROM table1
  UNION ALL
  SELECT name FROM table2
) AS combined;

Sometimes useful when applying DISTINCT after a transformation.


17. Combining UNION with Joins or Subqueries

sqlCopyEditSELECT u.id, u.name FROM users u
JOIN orders o ON u.id = o.user_id
UNION
SELECT id, name FROM archived_users;

You can UNION a join result with another query as long as the columns align.


18. Common Mistakes with UNION

MistakeExplanation
Mismatched columnsNumber or type of columns must match
Using ORDER BY inside both queriesUse it only at the end
Expecting both aliases to persistOnly first SELECT defines column labels

19. Best Practices for UNION Usage

  • Use UNION ALL when performance matters and duplicates don’t
  • Always align column order and data types
  • Use column aliases in the first SELECT
  • Comment each SELECT for readability if complex
  • Use subqueries to preprocess before UNION if needed

20. Summary and What’s Next

UNION and UNION ALL are powerful SQL tools for combining multiple datasets into a single result set. Use UNION for uniqueness and UNION ALL for speed and completeness. Mastering these helps you write efficient and expressive SQL for reporting, analytics, and data integration.