Grouping Data with GROUP BY in SQL


Table of Contents

  1. Introduction
  2. What is GROUP BY in SQL?
  3. Basic Syntax of GROUP BY
  4. When to Use GROUP BY
  5. Simple GROUP BY Example
  6. Grouping by One Column
  7. Grouping by Multiple Columns
  8. Using Aggregate Functions with GROUP BY
  9. Using HAVING to Filter Groups
  10. GROUP BY vs WHERE
  11. Ordering Grouped Results
  12. Grouping NULL Values
  13. Real-World Example: Sales by Region
  14. Real-World Example: Employee Count by Department
  15. Grouping with Joins
  16. Common Errors with GROUP BY
  17. Performance Considerations
  18. Nested Grouping and Subqueries
  19. Best Practices for GROUP BY
  20. Summary and What’s Next

1. Introduction

The GROUP BY clause in SQL is one of the most powerful tools for summarizing data. It allows you to organize rows into groups and apply aggregate functions such as COUNT(), SUM(), AVG(), etc., on each group.


2. What is GROUP BY in SQL?

GROUP BY groups rows that have the same values in specified columns into summary rows. It’s often used in combination with aggregate functions to perform group-wise computations.


3. Basic Syntax of GROUP BY

sqlCopyEditSELECT column_name, AGGREGATE_FUNCTION(column_name)
FROM table_name
GROUP BY column_name;

Example:

sqlCopyEditSELECT department, COUNT(*) 
FROM employees
GROUP BY department;

4. When to Use GROUP BY

Use GROUP BY when:

  • You want to aggregate values based on a specific column (e.g., department, category).
  • You need summary statistics per group.
  • You’re building reports or dashboards.

5. Simple GROUP BY Example

sqlCopyEditSELECT category, SUM(price) AS total_sales
FROM products
GROUP BY category;

This shows total sales per product category.


6. Grouping by One Column

sqlCopyEditSELECT city, COUNT(*) AS user_count
FROM users
GROUP BY city;

Returns one row per city, showing how many users are from each.


7. Grouping by Multiple Columns

sqlCopyEditSELECT city, gender, COUNT(*) 
FROM users
GROUP BY city, gender;

This groups first by city, then by gender within each city.


8. Using Aggregate Functions with GROUP BY

You can use:

  • COUNT()
  • SUM()
  • AVG()
  • MIN()
  • MAX()

Example:

sqlCopyEditSELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;

9. Using HAVING to Filter Groups

Use HAVING to filter the grouped results:

sqlCopyEditSELECT department, COUNT(*) AS total
FROM employees
GROUP BY department
HAVING total > 5;

Unlike WHERE, HAVING works on aggregated data.


10. GROUP BY vs WHERE

ClauseWhen It Works
WHEREBefore grouping (filters rows)
HAVINGAfter grouping (filters groups)

Example:

sqlCopyEditSELECT department, SUM(salary)
FROM employees
WHERE status = 'active'
GROUP BY department
HAVING SUM(salary) > 100000;

11. Ordering Grouped Results

sqlCopyEditSELECT department, COUNT(*) AS total
FROM employees
GROUP BY department
ORDER BY total DESC;

You can sort the output using ORDER BY, even on aliases.


12. Grouping NULL Values

GROUP BY treats NULL as a valid group:

sqlCopyEditSELECT department, COUNT(*) 
FROM employees
GROUP BY department;

Rows with NULL in department will be grouped together.


13. Real-World Example: Sales by Region

sqlCopyEditSELECT region, SUM(amount) AS total_sales
FROM sales
GROUP BY region;

Use GROUP BY to break down KPIs by geography.


14. Real-World Example: Employee Count by Department

sqlCopyEditSELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING employee_count >= 10;

15. Grouping with Joins

sqlCopyEditSELECT c.country, COUNT(o.id) AS total_orders
FROM customers c
JOIN orders o ON c.id = o.customer_id
GROUP BY c.country;

This groups joined data — e.g., orders by country.


16. Common Errors with GROUP BY

  • Using columns in SELECT that aren’t part of GROUP BY or aggregates
  • Using WHERE to filter aggregated values (should use HAVING)
  • Forgetting to group on all non-aggregated columns

Example of incorrect usage:

sqlCopyEditSELECT department, name, COUNT(*) 
FROM employees
GROUP BY department;  -- ERROR: 'name' must be aggregated or grouped

17. Performance Considerations

  • Use indexes on GROUP BY columns for faster grouping
  • Avoid grouping on columns with high cardinality unless necessary
  • Apply WHERE filters before grouping to reduce the dataset size

18. Nested Grouping and Subqueries

sqlCopyEditSELECT region, AVG(total_sales)
FROM (
  SELECT region, SUM(amount) AS total_sales
  FROM sales
  GROUP BY region, month
) AS monthly_sales
GROUP BY region;

This calculates average monthly sales by region.


19. Best Practices for GROUP BY

  • Use aliases for clarity (AS)
  • Filter with WHERE before grouping
  • Use HAVING only when working with aggregates
  • Combine with ORDER BY to make reports readable
  • Always include only grouped or aggregated columns in SELECT

20. Summary and What’s Next

The GROUP BY clause is your gateway to powerful analytical queries in SQL. Whether you’re generating category-wise sales, user stats by country, or monthly trends, GROUP BY helps convert rows into grouped summaries.