Aggregation Stages in MongoDB: $match, $project, $group, $sort, and $limit


Table of Contents

  1. Introduction to Aggregation Stages
  2. $match Stage – Filtering Documents
  3. $project Stage – Reshaping Documents
  4. $group Stage – Grouping and Aggregating
  5. $sort Stage – Ordering the Output
  6. $limit Stage – Reducing the Output Size
  7. Combining Stages in a Real-World Example
  8. Conclusion

Introduction to Aggregation Stages

MongoDB’s Aggregation Pipeline consists of multiple stages, where each stage processes input documents and passes the result to the next stage. These stages allow for powerful transformations and computations directly within the database.
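
To make the stage-by-stage flow concrete, here is a minimal plain-JavaScript sketch that models a pipeline as a list of functions, each receiving the previous stage's output. It is only an analogy: `runPipeline` and the sample data are hypothetical and not part of MongoDB, whose real execution engine is more sophisticated.

```javascript
// Conceptual model only: each "stage" is a function that takes an array of
// documents and returns a new array of documents.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Hypothetical sample documents.
const orders = [
  { customerId: "a", amount: 30, status: "shipped" },
  { customerId: "b", amount: 10, status: "pending" },
  { customerId: "c", amount: 20, status: "shipped" },
];

const result = runPipeline(orders, [
  (docs) => docs.filter((d) => d.status === "shipped"),    // behaves like $match
  (docs) => [...docs].sort((x, y) => y.amount - x.amount), // behaves like $sort
  (docs) => docs.slice(0, 1),                              // behaves like $limit
]);

console.log(result);
```

Each function sees only what the previous one returned, which is why the order of stages matters.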

Five foundational stages in most aggregation pipelines are:

  • $match: Filter documents.
  • $project: Include, exclude, or transform fields.
  • $group: Aggregate data.
  • $sort: Order results.
  • $limit: Restrict the number of results.

Let’s break down each one.


$match Stage – Filtering Documents

The $match stage acts as a filter, similar to the WHERE clause in SQL. It passes only those documents that match the specified criteria.

Syntax:

{ $match: { field: value } }

Example:

db.orders.aggregate([
  { $match: { status: "shipped" } }
])

This passes only the documents whose status is "shipped" to the next stage.

Pro Tip: Place $match as early as possible in the pipeline. Fewer documents then reach the later stages, and a $match at the very start of the pipeline can use an index on the filtered fields, much like a regular find() query.
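
As an illustration of an early filter, the sketch below defines a pipeline whose first stage combines an equality condition with a range condition. The field names follow this article's orders examples; the threshold of 100 is an arbitrary value chosen for illustration.

```javascript
// Pipeline defined as a plain array so it can be inspected or reused.
const pipeline = [
  // $match first: only shipped orders with amount >= 100 reach later stages.
  { $match: { status: "shipped", amount: { $gte: 100 } } },
  // Later stages now work on the reduced set of documents.
  { $project: { customerId: 1, amount: 1, _id: 0 } },
];

// In mongosh this would be run as: db.orders.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```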


$project Stage – Reshaping Documents

The $project stage is used to include, exclude, or transform fields in the result set. It’s often used to:

  • Rename fields.
  • Create new computed fields.
  • Hide sensitive or unnecessary data.

Syntax:

{ $project: { field1: 1, field2: 1, _id: 0 } }

Example:

db.orders.aggregate([
  { $project: { customerId: 1, amount: 1, _id: 0 } }
])

This outputs only customerId and amount, excluding _id.

Transform fields example:

{ $project: { fullName: { $concat: ["$firstName", " ", "$lastName"] } } }
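
The following sketch shows a rename alongside a computed field. It assumes a hypothetical customers collection with customerId, firstName, and lastName fields. Because $concat returns null when any argument is null or missing, each name part is wrapped in $ifNull here; whether an empty string is the right fallback depends on your data.

```javascript
// A $project stage that renames one field and computes another.
const projectStage = {
  $project: {
    _id: 0,
    customer: "$customerId", // rename: output field "customer" takes customerId's value
    fullName: {
      $concat: [
        { $ifNull: ["$firstName", ""] }, // fall back to "" if firstName is absent
        " ",
        { $ifNull: ["$lastName", ""] },  // fall back to "" if lastName is absent
      ],
    },
  },
};

// In mongosh this would be run as: db.customers.aggregate([projectStage])
console.log(JSON.stringify(projectStage, null, 2));
```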

$group Stage – Grouping and Aggregating

The $group stage is one of the most powerful stages in the pipeline. It groups documents by a specified key (the _id expression) and applies accumulator operators to each group, such as:

  • $sum
  • $avg
  • $min / $max
  • $first / $last
  • $push / $addToSet

Syntax:

{ $group: { _id: "$field", total: { $sum: "$amount" } } }

Example:

db.orders.aggregate([
  { $group: { _id: "$customerId", totalSpent: { $sum: "$amount" } } }
])

This groups orders by customerId and calculates the total amount each customer spent.

Grouping by a constant:

{ $group: { _id: null, totalRevenue: { $sum: "$amount" } } }

Because every document gets the same _id, this produces a single output document that aggregates across all input documents.
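
Several accumulators can be combined in one $group stage. The sketch below pairs such a stage with a plain-JavaScript equivalent on hypothetical sample data, purely to show what the stage computes. The in-memory loop is an illustration, not how MongoDB implements grouping, and MongoDB does not guarantee the order of $group output.

```javascript
// The stage as it would appear in a pipeline.
const groupStage = {
  $group: {
    _id: "$customerId",
    totalSpent: { $sum: "$amount" },
    orderCount: { $sum: 1 },           // adds 1 per document, i.e. a count
    averageOrder: { $avg: "$amount" },
  },
};

// Plain-JavaScript equivalent on hypothetical sample data.
const orders = [
  { customerId: "a", amount: 10 },
  { customerId: "b", amount: 40 },
  { customerId: "a", amount: 30 },
];

const groups = new Map();
for (const { customerId, amount } of orders) {
  const g = groups.get(customerId) ?? { _id: customerId, totalSpent: 0, orderCount: 0 };
  g.totalSpent += amount;
  g.orderCount += 1;
  groups.set(customerId, g);
}
const grouped = [...groups.values()].map((g) => ({
  ...g,
  averageOrder: g.totalSpent / g.orderCount,
}));

// In mongosh the stage would be run as: db.orders.aggregate([groupStage])
console.log(grouped);
```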


$sort Stage – Ordering the Output

The $sort stage sorts documents based on specified fields.

Syntax:

{ $sort: { field: 1 } }   // Ascending
{ $sort: { field: -1 } }  // Descending

Example:

db.orders.aggregate([
  { $sort: { amount: -1 } }
])

This sorts orders by amount in descending order.

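Important: $sort can be resource-intensive on large collections. It can generally use an index only when it runs at the start of the pipeline or directly after an initial $match. Otherwise MongoDB has to sort the documents itself, and an in-memory sort is limited to 100 MB per stage unless the operation is allowed to spill to disk.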
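
A sort can use more than one key, and the aggregate() call accepts an options document as its second argument. The sketch below shows both, assuming the orders collection has status and amount fields. Whether allowDiskUse needs to be set explicitly depends on your server version and configuration, so treat the option here as illustrative.

```javascript
// Compound sort: status ascending first, then amount descending within each status.
const pipeline = [
  { $sort: { status: 1, amount: -1 } },
];

// Lets a large sort write temporary files instead of failing on the memory limit.
const options = { allowDiskUse: true };

// In mongosh this would be run as: db.orders.aggregate(pipeline, options)
// An index with the same key order could support this sort:
//   db.orders.createIndex({ status: 1, amount: -1 })
console.log(JSON.stringify({ pipeline, options }, null, 2));
```

The order of keys inside the $sort document matters: the first key is the primary sort key.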


$limit Stage – Reducing the Output Size

The $limit stage restricts the number of documents passed to the next stage or returned to the client.

Syntax:

{ $limit: number }

Example:

db.orders.aggregate([
  { $sort: { amount: -1 } },
  { $limit: 5 }
])

This returns the 5 orders with the highest amount.

This stage is commonly used for leaderboards and, together with $skip, for pagination.
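
For pagination, $limit is usually paired with the $skip stage, which discards a given number of documents before passing the rest on. The helper below is a hypothetical sketch (`pageStages` is not a MongoDB API) that builds the stages for one page of orders. It adds _id as a secondary sort key as an assumption, so that pages stay consistent when several orders share the same amount.

```javascript
// Hypothetical helper: builds the stages for one page of orders,
// largest amount first. `page` is 1-based.
function pageStages(page, pageSize) {
  return [
    // _id as a tiebreaker keeps the order deterministic for equal amounts.
    { $sort: { amount: -1, _id: 1 } },
    // Skip all documents that belong to earlier pages.
    { $skip: (page - 1) * pageSize },
    // Keep only one page worth of documents.
    { $limit: pageSize },
  ];
}

// In mongosh this would be run as: db.orders.aggregate(pageStages(3, 10))
console.log(JSON.stringify(pageStages(3, 10)));
```

Note that $skip still has to walk past the skipped documents, so very deep pages get slower as the offset grows.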


Combining Stages in a Real-World Example

Let’s imagine a sales dashboard where we need to display the top 3 customers by total purchase amount:

db.orders.aggregate([
  { $match: { status: "completed" } },
  { $group: { _id: "$customerId", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } },
  { $limit: 3 },
  { $project: { _id: 0, customerId: "$_id", total: 1 } }
])

Explanation:

  1. Filter only completed orders.
  2. Group by customer and calculate total.
  3. Sort totals in descending order.
  4. Limit to top 3 customers.
  5. Reshape the final output.
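
To check the five steps against concrete numbers, the sketch below replays them in plain JavaScript on hypothetical sample data. It mirrors the logic of the pipeline only. It does not talk to MongoDB, and the field order of the real $project output may differ.

```javascript
// Hypothetical sample documents.
const orders = [
  { customerId: "a", amount: 50, status: "completed" },
  { customerId: "b", amount: 70, status: "completed" },
  { customerId: "a", amount: 40, status: "completed" },
  { customerId: "c", amount: 10, status: "completed" },
  { customerId: "d", amount: 99, status: "pending" },
  { customerId: "e", amount: 5, status: "completed" },
];

// Step 1 ($match): keep only completed orders.
const completed = orders.filter((o) => o.status === "completed");

// Step 2 ($group): sum the amount per customer.
const totals = new Map();
for (const o of completed) {
  totals.set(o.customerId, (totals.get(o.customerId) ?? 0) + o.amount);
}

// Steps 3 to 5 ($sort, $limit, $project): order by total descending,
// keep three entries, and expose the group key as customerId.
const top3 = [...totals.entries()]
  .sort(([, x], [, y]) => y - x)
  .slice(0, 3)
  .map(([customerId, total]) => ({ customerId, total }));

console.log(top3);
// Customer totals are a: 90, b: 70, c: 10, e: 5, so a, b, and c are returned.
```

Customer d has the single largest order but is filtered out in step 1 because that order is still pending.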

Conclusion

The aggregation pipeline stages $match, $project, $group, $sort, and $limit form the backbone of most real-world MongoDB aggregation operations. When used together, they allow you to filter, transform, group, and summarize data efficiently.