Data Modeling Examples in MongoDB

Data modeling in MongoDB is essential for efficient data storage, retrieval, and management. The goal is to structure the database in a way that minimizes performance bottlenecks, data duplication, and operational complexity. Below, we explore three types of applications (Blog, E-Commerce, and Chat App) and look at how to model data for each in MongoDB, highlighting when to use embedded documents and when to use referenced documents.


1. Blog Application Data Model

A Blog application typically involves users, posts, and comments. Depending on the size of the data, relationships can be either embedded or referenced.

Entities:

  • User (Author of Posts)
  • Post (Blog post)
  • Comment (Comments on Posts)

Data Modeling Approach:

  • Posts can be embedded within the user document when each user writes a small, bounded number of posts and the posts are mostly read together with the user’s profile, because a single read then returns everything needed to display the profile. If authors publish many posts, or posts are queried independently of their author, a separate posts collection that references the user is the safer choice.
  • Comments can either be embedded within the post or stored in a separate collection. If comments are small and few per post, they can be embedded. If a post can attract a large or unbounded number of comments, referencing is the better fit.

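Example Schema (ObjectId values in these examples are shortened placeholders; real ObjectIds are 24-character hexadecimal strings):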

// User Collection
{
  "_id": ObjectId("1a2b3c"),
  "username": "john_doe",
  "email": "john_doe@example.com",
  "posts": [
    {
      "post_id": ObjectId("a1b2c3"),
      "title": "MongoDB for Beginners",
      "content": "This is a beginner’s guide to MongoDB...",
      "date_created": ISODate("2025-04-24"),
      "comments": [
        {
          "comment_id": ObjectId("x1y2z3"),
          "user_id": ObjectId("4a5b6c"),
          "comment_text": "Great post! Very helpful.",
          "date_created": ISODate("2025-04-25")
        }
      ]
    }
  ]
}

// Comment Collection (if using referencing)
{
  "_id": ObjectId("x1y2z3"),
  "post_id": ObjectId("a1b2c3"),
  "user_id": ObjectId("4a5b6c"),
  "comment_text": "Great post! Very helpful.",
  "date_created": ISODate("2025-04-25")
}
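
With referencing, the comments for a post are fetched by a second lookup on post_id. The sketch below mimics that lookup in plain JavaScript over an in-memory array so that it runs without a database. The helper name latestComments, the string ids, and the extra sample comments are assumptions made for illustration. Against a real deployment, a roughly equivalent mongosh query would be db.comments.find({ post_id: postId }).sort({ date_created: -1 }).limit(n), ideally supported by an index on post_id.

```javascript
// In-memory stand-in for a separate "comments" collection.
// Plain strings replace ObjectId values to keep the sketch self-contained.
const comments = [
  { _id: "c1", post_id: "p1", user_id: "u2", comment_text: "Great post! Very helpful.", date_created: new Date("2025-04-25") },
  { _id: "c2", post_id: "p1", user_id: "u3", comment_text: "Thanks for sharing.", date_created: new Date("2025-04-26") },
  { _id: "c3", post_id: "p2", user_id: "u2", comment_text: "Looking forward to part two.", date_created: new Date("2025-04-27") },
];

// Hypothetical helper: newest-first page of comments for one post.
function latestComments(allComments, postId, limit) {
  return allComments
    .filter((c) => c.post_id === postId)              // match on the reference
    .sort((a, b) => b.date_created - a.date_created)  // newest first
    .slice(0, limit);                                 // cap the page size
}

const page = latestComments(comments, "p1", 10);
console.log(page.map((c) => c._id)); // [ 'c2', 'c1' ]
```

The post document itself stays small in this variant, at the cost of one extra query whenever comments are displayed.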

Considerations:

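  • If posts and comments grow large (e.g., a post with thousands of comments), store comments in a separate collection to avoid approaching MongoDB’s 16 MB limit on a single document. Because the example nests comments inside posts inside the user document, all of a user’s posts and comments count toward that one limit.
  • In the example above, posts are embedded in the user document for fast profile reads. The embedded comments array and the separate Comment collection are alternatives: use embedding for low-volume posts and referencing for high-volume ones, not both for the same comment.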
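
The size concern can be turned into a rough back-of-the-envelope check. MongoDB limits a single BSON document to 16 MiB, and the sketch below compares a projected comments array against a fraction of that limit. The helper name, the 25% safety budget, and the average comment size are assumptions chosen for illustration, not MongoDB recommendations.

```javascript
// Maximum size of a single BSON document in MongoDB (16 MiB).
const MAX_BSON_BYTES = 16 * 1024 * 1024;

// Hypothetical helper: suggest "embed" or "reference" for a comments array.
// budgetFraction is an assumed safety margin that leaves room for the rest
// of the document (titles, content, other fields).
function suggestCommentStorage(expectedComments, avgCommentBytes, budgetFraction = 0.25) {
  const projectedBytes = expectedComments * avgCommentBytes;
  return projectedBytes <= MAX_BSON_BYTES * budgetFraction ? "embed" : "reference";
}

console.log(suggestCommentStorage(50, 300));     // embed
console.log(suggestCommentStorage(200000, 300)); // reference
```

Size is only one input to the decision. Access patterns, such as whether comments are ever queried without their post, matter just as much.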

2. E-Commerce Application Data Model

An E-Commerce application often involves products, users (customers), orders, and payments. These entities can have one-to-many or many-to-many relationships.

Entities:

  • User (Customer)
  • Product
  • Order
  • Payment

Data Modeling Approach:

  • Orders will reference both Users and Products. Since a user can have multiple orders and an order can include multiple products, referencing is used for these relationships.
  • Products and Users will be stored in separate collections. However, some small product-related information (like price) can be embedded in the Order document to reduce the need for querying the Products collection.
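
The approach above (reference the product, copy the price into the order) can be sketched as a small order-building function. This is a minimal illustration in plain JavaScript, not tied to any driver. The helper name buildOrder, the string ids, the "Pending" status, and the sample catalog entries are assumptions.

```javascript
// In-memory stand-in for the Products collection, keyed by product id.
const productsById = new Map([
  ["prod1", { _id: "prod1", name: "Laptop", price: 500.0 }],
  ["prod2", { _id: "prod2", name: "Mouse", price: 25.0 }],
]);

// Hypothetical helper: build an order document that references products
// by id and copies the unit price as it is at purchase time.
function buildOrder(orderId, userId, cart, catalog) {
  const products = cart.map(({ productId, quantity }) => {
    const product = catalog.get(productId);
    if (!product) throw new Error(`Unknown product: ${productId}`);
    return { product_id: productId, quantity, price: product.price };
  });
  const total_amount = products.reduce((sum, line) => sum + line.price * line.quantity, 0);
  return { _id: orderId, user_id: userId, order_date: new Date(), products, total_amount, status: "Pending" };
}

const order = buildOrder("order1", "user1", [
  { productId: "prod1", quantity: 2 },
  { productId: "prod2", quantity: 1 },
], productsById);

// A later price change in the catalog does not alter the stored order.
productsById.get("prod1").price = 550.0;
console.log(order.total_amount); // 1025
```

Copying the price means that an order can be displayed without reading the Products collection, and that it keeps showing what the customer actually paid.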

Example Schema:

// User Collection
{
  "_id": ObjectId("user1"),
  "name": "Alice",
  "email": "alice@example.com",
  "orders": [
    {
      "order_id": ObjectId("order1"),
      "date": ISODate("2025-04-24"),
      "total_amount": 150.0,
      "products": [
        {
          "product_id": ObjectId("prod1"),
          "quantity": 2,
          "price": 50.0
        },
        {
          "product_id": ObjectId("prod2"),
          "quantity": 1,
          "price": 50.0
        }
      ],
      "status": "Shipped"
    }
  ]
}

// Product Collection
{
  "_id": ObjectId("prod1"),
  "name": "Laptop",
  "category": "Electronics",
  "price": 500.0,
  "stock_quantity": 100
}

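// Order Collection (referencing user and products)
{
  "_id": ObjectId("order1"),
  "user_id": ObjectId("user1"),
  "order_date": ISODate("2025-04-24"),
  "products": [
    { "product_id": ObjectId("prod1"), "quantity": 2, "price": 50.0 },
    { "product_id": ObjectId("prod2"), "quantity": 1, "price": 50.0 }
  ],
  "total_amount": 150.0,
  "payment_status": "Paid",
  "status": "Shipped"
}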

Considerations:

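  • Orders reference their user through user_id in the Order collection, because a user can accumulate many orders. The orders array embedded in the User example is an alternative that is practical only for a small, bounded order history.
  • Products are referenced from orders by product_id to avoid duplicating full product data in every order.
  • Copy into the order only the product fields that must reflect the moment of purchase, such as the unit price paid. Details that are edited over time, such as name and description, are better kept in the Product collection and referenced, so there is a single copy to update.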
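
When products are referenced, displaying an order means resolving those references. The sketch below does the join in application code over in-memory data. In MongoDB the same join can be done server-side with the $lookup aggregation stage or with a second query on the Products collection. The helper name withProductDetails, the string ids, and the sample catalog are assumptions for illustration.

```javascript
// Hypothetical data: an order that references products by id.
const storedOrder = {
  _id: "order1",
  user_id: "user1",
  products: [
    { product_id: "prod1", quantity: 2, price: 50.0 },
    { product_id: "prod2", quantity: 1, price: 50.0 },
  ],
};

// Current product documents keyed by id (stand-in for the Products collection).
const catalog = new Map([
  ["prod1", { _id: "prod1", name: "Laptop", category: "Electronics" }],
  ["prod2", { _id: "prod2", name: "Headphones", category: "Electronics" }],
]);

// Hypothetical helper: attach the current product name to each order line.
// The stored price is left untouched; only descriptive data is looked up.
function withProductDetails(order, products) {
  return {
    ...order,
    products: order.products.map((line) => {
      const product = products.get(line.product_id);
      return { ...line, name: product ? product.name : null };
    }),
  };
}

const detailed = withProductDetails(storedOrder, catalog);
console.log(detailed.products.map((l) => `${l.quantity} x ${l.name}`));
```

A product that has since been deleted resolves to a null name here, which is one of the cases an application has to handle once data is referenced instead of embedded.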

3. Chat Application Data Model

In a Chat Application, the entities typically involve Users, Messages, and Chats. Chats might have many users, and messages are exchanged between these users.

Entities:

  • User
  • Message
  • Chat (e.g., group chat or direct message thread)

Data Modeling Approach:

  • Messages are often stored as embedded documents within Chat documents. However, if the messages are expected to grow rapidly (e.g., high-volume chat apps), referencing messages in a separate collection may be more efficient.
  • Chats may contain references to Users, where each chat can have multiple participants (users). Each message in a chat could reference the User who sent it.
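
When messages live in their own collection, chat history is usually read one page at a time. The following sketch imitates that access pattern over an in-memory array. The helper name messagePage, the string ids, and the third sample message are assumptions. In mongosh, a roughly equivalent query would be db.messages.find({ chat_id: chatId, timestamp: { $lt: before } }).sort({ timestamp: -1 }).limit(pageSize), typically supported by a compound index on { chat_id: 1, timestamp: -1 }.

```javascript
// In-memory stand-in for a separate "messages" collection.
const messages = [
  { _id: "msg1", chat_id: "chat1", user_id: "user1", text: "Hello!", timestamp: new Date("2025-04-24T10:00:00Z") },
  { _id: "msg2", chat_id: "chat1", user_id: "user2", text: "Hi there!", timestamp: new Date("2025-04-24T10:01:00Z") },
  { _id: "msg3", chat_id: "chat2", user_id: "user1", text: "Meeting at noon?", timestamp: new Date("2025-04-24T10:02:00Z") },
];

// Hypothetical helper: messages of one chat older than `before`, newest first.
// Passing the oldest timestamp of the previous page as `before` walks
// backwards through the history.
function messagePage(all, chatId, before, pageSize) {
  return all
    .filter((m) => m.chat_id === chatId && m.timestamp < before)
    .sort((a, b) => b.timestamp - a.timestamp)
    .slice(0, pageSize);
}

const firstPage = messagePage(messages, "chat1", new Date("2025-04-25T00:00:00Z"), 1);
const secondPage = messagePage(messages, "chat1", firstPage[0].timestamp, 1);
console.log(firstPage[0].text, "|", secondPage[0].text); // Hi there! | Hello!
```

Paging by timestamp keeps each read bounded regardless of how long the chat history grows.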

Example Schema:

// User Collection
{
  "_id": ObjectId("user1"),
  "username": "john_doe",
  "email": "john_doe@example.com",
  "chats": [
    {
      "chat_id": ObjectId("chat1"),
      "participants": [
        ObjectId("user1"),
        ObjectId("user2")
      ],
      "messages": [
        {
          "message_id": ObjectId("msg1"),
          "user_id": ObjectId("user1"),
          "text": "Hello!",
          "timestamp": ISODate("2025-04-24T10:00:00Z")
        },
        {
          "message_id": ObjectId("msg2"),
          "user_id": ObjectId("user2"),
          "text": "Hi there!",
          "timestamp": ISODate("2025-04-24T10:01:00Z")
        }
      ]
    }
  ]
}

// Message Collection (if messages are referenced)
{
  "_id": ObjectId("msg1"),
  "chat_id": ObjectId("chat1"),
  "user_id": ObjectId("user1"),
  "text": "Hello!",
  "timestamp": ISODate("2025-04-24T10:00:00Z")
}

Considerations:

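  • Chats can store references to Users and Messages. Embedding Messages inside Chats might be a good option if messages are small and chat history is limited. The example nests each chat inside a user document for brevity; a chat shared by several users is usually better kept as its own document in a Chat collection, so it is not duplicated for every participant.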
  • If messages grow significantly, it’s better to use referencing for Messages and store them in a separate collection to manage the size of each Chat document.
  • Participants in a chat are stored as references to the User documents, as one chat may involve multiple users.
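
One way to keep an embedded history limited is to store only the most recent messages inside the chat document and keep the full history in the separate collection. The sketch below shows the capping logic in plain JavaScript, plus the shape of a MongoDB update document that uses the $push operator with its $each and $slice modifiers to the same effect. The field name recent_messages, the helper names, and the cap of 3 are assumptions for illustration.

```javascript
// Hypothetical helper: append a message and keep only the newest `cap` entries.
// `cap` is expected to be a positive integer.
function pushCapped(recent, message, cap) {
  return [...recent, message].slice(-cap);
}

// Shape of a corresponding MongoDB update document. A negative $slice keeps
// the last `cap` elements of the array after the push.
function cappedPushUpdate(message, cap) {
  return { $push: { recent_messages: { $each: [message], $slice: -cap } } };
}

let recent = [];
for (const text of ["one", "two", "three", "four"]) {
  recent = pushCapped(recent, { text }, 3);
}
console.log(recent.map((m) => m.text)); // [ 'two', 'three', 'four' ]
console.log(JSON.stringify(cappedPushUpdate({ text: "five" }, 3)));
```

With this hybrid, opening a chat list needs only the chat documents, while scrolling further back falls through to the referenced messages.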

Conclusion

Data modeling in MongoDB requires careful thought about the types of relationships between your entities and how your application will interact with the data. Here’s a quick summary:

  • Embedded documents are ideal when data is usually queried together and when related fields must be updated atomically, since a write to a single document is atomic.
  • Referenced documents are ideal for managing large datasets, reducing data duplication, and maintaining scalability.
  • For Blog apps, embedding posts and comments can be effective if the amount of data is small.
  • In an E-Commerce app, references for Orders and Products ensure flexibility and scalability as product data changes.
  • A Chat app might use embedding for small message histories but may switch to referencing for larger datasets.

By selecting the appropriate model (embedded or referenced), you can optimize your MongoDB database for performance and scalability, tailored to your application’s needs.