Table of Contents
- Introduction
- What is
GROUP BY
in SQL? - Basic Syntax of
GROUP BY
- When to Use
GROUP BY
- Simple
GROUP BY
Example - Grouping by One Column
- Grouping by Multiple Columns
- Using Aggregate Functions with
GROUP BY
- Using
HAVING
to Filter Groups GROUP BY
vsWHERE
- Ordering Grouped Results
- Grouping NULL Values
- Real-World Example: Sales by Region
- Real-World Example: Employee Count by Department
- Grouping with Joins
- Common Errors with
GROUP BY
- Performance Considerations
- Nested Grouping and Subqueries
- Best Practices for
GROUP BY
- Summary and What’s Next
1. Introduction
The GROUP BY
clause in SQL is one of the most powerful tools for summarizing data. It allows you to organize rows into groups and apply aggregate functions such as COUNT()
, SUM()
, AVG()
, etc., on each group.
2. What is GROUP BY
in SQL?
GROUP BY
groups rows that have the same values in specified columns into summary rows. It’s often used in combination with aggregate functions to perform group-wise computations.
3. Basic Syntax of GROUP BY
sqlCopyEditSELECT column_name, AGGREGATE_FUNCTION(column_name)
FROM table_name
GROUP BY column_name;
Example:
sqlCopyEditSELECT department, COUNT(*)
FROM employees
GROUP BY department;
4. When to Use GROUP BY
Use GROUP BY
when:
- You want to aggregate values based on a specific column (e.g., department, category).
- You need summary statistics per group.
- You’re building reports or dashboards.
5. Simple GROUP BY
Example
sqlCopyEditSELECT category, SUM(price) AS total_sales
FROM products
GROUP BY category;
This shows total sales per product category.
6. Grouping by One Column
sqlCopyEditSELECT city, COUNT(*) AS user_count
FROM users
GROUP BY city;
Returns one row per city, showing how many users are from each.
7. Grouping by Multiple Columns
sqlCopyEditSELECT city, gender, COUNT(*)
FROM users
GROUP BY city, gender;
This groups first by city, then by gender within each city.
8. Using Aggregate Functions with GROUP BY
You can use:
COUNT()
SUM()
AVG()
MIN()
MAX()
Example:
sqlCopyEditSELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;
9. Using HAVING
to Filter Groups
Use HAVING
to filter the grouped results:
sqlCopyEditSELECT department, COUNT(*) AS total
FROM employees
GROUP BY department
HAVING total > 5;
Unlike WHERE
, HAVING
works on aggregated data.
10. GROUP BY
vs WHERE
Clause | When It Works |
---|---|
WHERE | Before grouping (filters rows) |
HAVING | After grouping (filters groups) |
Example:
sqlCopyEditSELECT department, SUM(salary)
FROM employees
WHERE status = 'active'
GROUP BY department
HAVING SUM(salary) > 100000;
11. Ordering Grouped Results
sqlCopyEditSELECT department, COUNT(*) AS total
FROM employees
GROUP BY department
ORDER BY total DESC;
You can sort the output using ORDER BY
, even on aliases.
12. Grouping NULL Values
GROUP BY
treats NULL
as a valid group:
sqlCopyEditSELECT department, COUNT(*)
FROM employees
GROUP BY department;
Rows with NULL
in department
will be grouped together.
13. Real-World Example: Sales by Region
sqlCopyEditSELECT region, SUM(amount) AS total_sales
FROM sales
GROUP BY region;
Use GROUP BY
to break down KPIs by geography.
14. Real-World Example: Employee Count by Department
sqlCopyEditSELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING employee_count >= 10;
15. Grouping with Joins
sqlCopyEditSELECT c.country, COUNT(o.id) AS total_orders
FROM customers c
JOIN orders o ON c.id = o.customer_id
GROUP BY c.country;
This groups joined data — e.g., orders by country.
16. Common Errors with GROUP BY
- Using columns in
SELECT
that aren’t part ofGROUP BY
or aggregates - Using
WHERE
to filter aggregated values (should useHAVING
) - Forgetting to group on all non-aggregated columns
Example of incorrect usage:
sqlCopyEditSELECT department, name, COUNT(*)
FROM employees
GROUP BY department; -- ERROR: 'name' must be aggregated or grouped
17. Performance Considerations
- Use indexes on
GROUP BY
columns for faster grouping - Avoid grouping on columns with high cardinality unless necessary
- Apply
WHERE
filters before grouping to reduce the dataset size
18. Nested Grouping and Subqueries
sqlCopyEditSELECT region, AVG(total_sales)
FROM (
SELECT region, SUM(amount) AS total_sales
FROM sales
GROUP BY region, month
) AS monthly_sales
GROUP BY region;
This calculates average monthly sales by region.
19. Best Practices for GROUP BY
- Use aliases for clarity (
AS
) - Filter with
WHERE
before grouping - Use
HAVING
only when working with aggregates - Combine with
ORDER BY
to make reports readable - Always include only grouped or aggregated columns in
SELECT
20. Summary and What’s Next
The GROUP BY
clause is your gateway to powerful analytical queries in SQL. Whether you’re generating category-wise sales, user stats by country, or monthly trends, GROUP BY
helps convert rows into grouped summaries.