Introduction
In this module, we will explore the core components of MongoDB’s architecture. MongoDB is a document-oriented NoSQL database: instead of rows in tables, it stores data as JSON-like documents grouped into collections. This model differs significantly from the table-based approach in relational databases.
By the end of this module, you will understand:
- The structure and organization of Collections and Documents.
- The BSON format used to store data in MongoDB.
- How MongoDB organizes and accesses data efficiently.
1. Collections in MongoDB
In MongoDB, a Collection is the equivalent of a table in a relational database. Collections are used to store documents and are part of a Database.
- A Database in MongoDB can contain multiple collections, and by default a collection does not impose a fixed schema on its documents.
- Collections are designed to be flexible, allowing you to store different types of documents within the same collection.
Key Characteristics of Collections:
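- Collections do not enforce referential integrity: there are no foreign-key constraints, and references between collections are resolved by the application or with the $lookup aggregation stage.
- Collections are schema-less by default, meaning documents can have different fields, types, and structures. Optional validation rules (for example, with $jsonSchema) can be attached to a collection when a stricter structure is needed.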
- Collections can store data for various types of applications, such as e-commerce, social media, or logging systems.
Example:
Consider an e-commerce platform storing products and orders.
{
"products": [
{ "name": "Laptop", "price": 999, "category": "Electronics" },
{ "name": "Shoes", "price": 49, "category": "Footwear" }
],
"orders": [
{ "order_id": "123", "customer_id": "456", "status": "Shipped" },
{ "order_id": "124", "customer_id": "457", "status": "Pending" }
]
}
In the above example, products and orders could each be represented as a separate collection in MongoDB.
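To make the database, collection, and document hierarchy concrete, here is a minimal in-memory sketch in Python. It uses plain dicts and lists and does not talk to a MongoDB server. The names shop_db and field_names, and the "Gift Card" document with its expires_in_days field, are hypothetical and exist only for illustration.

```python
# A toy picture of the hierarchy: database -> collections -> documents.
# Plain Python containers stand in for MongoDB; no server is involved.
shop_db = {
    "products": [
        {"name": "Laptop", "price": 999, "category": "Electronics"},
        {"name": "Shoes", "price": 49, "category": "Footwear"},
    ],
    "orders": [
        {"order_id": "123", "customer_id": "456", "status": "Shipped"},
        {"order_id": "124", "customer_id": "457", "status": "Pending"},
    ],
}

# A document with a different shape can join the same collection,
# because no fixed schema is imposed by default (hypothetical data).
shop_db["products"].append(
    {"name": "Gift Card", "price": 25, "expires_in_days": 365}
)


def field_names(collection):
    """Return the sorted union of field names used across a collection."""
    names = set()
    for document in collection:
        names.update(document)  # iterating a dict yields its keys
    return sorted(names)


print(field_names(shop_db["products"]))
# ['category', 'expires_in_days', 'name', 'price']
```

The third product has no category field and adds a new one, yet it sits in the same collection as the other two. That is the flexibility described above.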
2. Documents in MongoDB
In MongoDB, Documents are the fundamental unit of data storage. They are analogous to rows in relational databases. However, unlike rows in relational tables, MongoDB documents are flexible and can have different fields and structures within the same collection.
Key Characteristics of Documents:
- Each document is a JSON-like structure of fields and values, stored internally in BSON (Binary JSON) format.
- Each document has a unique identifier called _id, which is automatically generated by MongoDB unless you specify one.
- Documents in a collection can have different fields, meaning they don’t have to follow a rigid schema.
Structure of a MongoDB Document:
A MongoDB document is a set of key-value pairs (fields and values). The key represents the field name, and the value represents the field’s value. Here’s an example of a document in a products collection:
{
"_id": "12345",
"name": "Laptop",
"price": 999,
"category": "Electronics",
"in_stock": true,
"reviews": [
{ "user": "John", "rating": 5, "comment": "Excellent product!" },
{ "user": "Sara", "rating": 4, "comment": "Good value for money." }
]
}
- _id: Unique identifier for the document. Here the application supplies a string value; if _id is omitted, MongoDB generates an ObjectId.
- name, price, category, in_stock: Fields containing data about the product.
- reviews: An array containing embedded documents that represent customer reviews.
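The nested structure above can be explored with a short Python sketch. It treats the document as a plain dict and walks into the embedded reviews. The helpers get_path and average_rating are hypothetical; get_path only imitates the idea behind MongoDB's dot notation (for example reviews.0.user) and is not a MongoDB API.

```python
product = {
    "_id": "12345",
    "name": "Laptop",
    "price": 999,
    "category": "Electronics",
    "in_stock": True,
    "reviews": [
        {"user": "John", "rating": 5, "comment": "Excellent product!"},
        {"user": "Sara", "rating": 4, "comment": "Good value for money."},
    ],
}


def get_path(document, dotted_path):
    """Follow a dotted path such as 'reviews.0.user' through dicts and lists."""
    current = document
    for part in dotted_path.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # numeric part selects an array element
        else:
            current = current[part]       # otherwise the part is a field name
    return current


def average_rating(document):
    """Average the rating field of the embedded review documents, if any."""
    ratings = [review["rating"] for review in document.get("reviews", [])]
    return sum(ratings) / len(ratings) if ratings else None


print(get_path(product, "reviews.0.user"))  # John
print(average_rating(product))              # 4.5
```

Because the reviews are embedded in the product document, both values are available from a single document, without consulting a second collection.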
3. BSON: Binary JSON
MongoDB stores its data in a binary-encoded format called BSON (Binary JSON). BSON is similar to JSON but has some differences and optimizations for performance and storage efficiency.
Key Characteristics of BSON:
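- Binary format: BSON is a binary serialization of JSON-like documents. JSON is text that must be parsed character by character, whereas BSON prefixes documents and strings with their lengths so that a reader can skip over fields it does not need.
- Supports additional data types: BSON supports more data types than JSON, such as Date, Binary data, ObjectId, and others.
- Size: BSON is designed for fast traversal rather than minimal size. Length prefixes and explicit type tags mean a BSON document can be slightly larger than the equivalent JSON text, although numeric values are often encoded more compactly.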
Common BSON Data Types:
- String: UTF-8 string (e.g., "name": "Laptop")
- Integer: 32-bit integer (e.g., "age": 30)
- Long: 64-bit integer (e.g., "timestamp": 1622125323000)
- Double: 64-bit floating point number (e.g., "price": 999.99)
- Boolean: True/False value (e.g., "in_stock": true)
- Date: Date and time (e.g., "created_at": ISODate("2023-07-01T10:00:00Z"))
- ObjectId: Unique identifier used by MongoDB (e.g., "id": ObjectId("507f191e810c19729de860ea"))
- Array: Arrays of values (e.g., "reviews": [ ... ])
- Embedded Document: A document inside another document (e.g., "reviews": { "user": "John", "rating": 5 })
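To show what "binary-encoded" means in practice, the following Python sketch hand-encodes a flat document using the layout from the public BSON specification: a 32-bit length, then one element per field (type byte, NUL-terminated field name, value), then a closing NUL byte. It is a teaching sketch that covers only four types (string, 32-bit integer, double, boolean), not a replacement for a real BSON library. The function names encode_element and encode_document are hypothetical.

```python
import json
import struct


def encode_element(name, value):
    """Encode one field as: type byte, NUL-terminated name, value bytes."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return b"\x08" + key + (b"\x01" if value else b"\x00")
    if isinstance(value, int):
        # Type 0x10 is a 32-bit little-endian integer. Larger values would
        # need the 64-bit type (0x12), which this sketch leaves out.
        return b"\x10" + key + struct.pack("<i", value)
    if isinstance(value, float):
        return b"\x01" + key + struct.pack("<d", value)  # 64-bit double
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # Strings carry a length prefix that counts the trailing NUL.
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def encode_document(document):
    """Encode a flat dict as: int32 total length, elements, closing NUL."""
    body = b"".join(encode_element(k, v) for k, v in document.items())
    total = 4 + len(body) + 1  # length field + elements + terminator
    return struct.pack("<i", total) + body + b"\x00"


doc = {"hello": "world"}
encoded = encode_document(doc)
print(encoded)
print(len(encoded), len(json.dumps(doc)))  # 22 18
```

For this tiny document the BSON form (22 bytes) is larger than the JSON text (18 characters). The extra bytes are the length prefixes and type tags, which let a reader jump from field to field without parsing text.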
Example of BSON Document:
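When displayed in the MongoDB shell, BSON documents look similar to JSON, but under the hood they are binary-encoded. The ObjectId(...) and ISODate(...) wrappers below are shell notation for BSON types that have no direct JSON equivalent.
{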
"_id": ObjectId("507f191e810c19729de860ea"),
"name": "Laptop",
"price": 999.99,
"category": "Electronics",
"in_stock": true,
"created_at": ISODate("2023-07-01T10:00:00Z")
}
This document would be stored in BSON format, which preserves types such as ObjectId and Date that plain JSON cannot represent directly.
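The ObjectId in this example is more than an opaque string. An ObjectId is 12 bytes, and in the current format the first 4 bytes hold the creation time in seconds since the Unix epoch, followed by a 5-byte random value and a 3-byte counter. The sketch below decodes those parts with the Python standard library only; objectid_parts is a hypothetical helper, not a driver function.

```python
from datetime import datetime, timezone


def objectid_parts(hex_string):
    """Split a 24-character ObjectId hex string into its three components."""
    raw = bytes.fromhex(hex_string)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    seconds = int.from_bytes(raw[0:4], "big")  # big-endian creation timestamp
    return {
        "created_at": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "random": raw[4:9].hex(),                     # 5-byte random value
        "counter": int.from_bytes(raw[9:12], "big"),  # 3-byte counter
    }


parts = objectid_parts("507f191e810c19729de860ea")
print(parts["created_at"].isoformat())  # 2012-10-17T20:46:22+00:00
```

Because the timestamp comes first, ObjectIds generated later compare as larger, so sorting by _id roughly follows insertion order.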
4. How Data is Stored in MongoDB
- Storage Engine: MongoDB uses a storage engine to manage how data is stored and retrieved. The default storage engine is WiredTiger (the default since MongoDB 3.2), which provides document-level concurrency control and compression.
- Indexes: MongoDB allows you to create indexes on specific fields to improve query performance. By default, MongoDB creates an index on the _id field.
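To illustrate why an index improves query performance, here is a toy Python comparison between scanning every document and looking a value up in a sorted index. It is only an analogy: real MongoDB indexes are B-tree structures managed by the storage engine, while this sketch uses a sorted list and the standard bisect module. The sample documents (including "Phone") and all function names are hypothetical.

```python
from bisect import bisect_left

documents = [
    {"_id": 3, "name": "Shoes"},
    {"_id": 1, "name": "Laptop"},
    {"_id": 2, "name": "Phone"},
]


def build_index(docs, field):
    """Return sorted keys plus, for each key, its position in docs."""
    pairs = sorted((doc[field], pos) for pos, doc in enumerate(docs))
    keys = [key for key, _ in pairs]
    positions = [pos for _, pos in pairs]
    return keys, positions


def find_by_scan(docs, field, value):
    """Collection scan: examine documents one by one until one matches."""
    for doc in docs:
        if doc.get(field) == value:
            return doc
    return None


def find_by_index(docs, keys, positions, value):
    """Index lookup: binary search the sorted keys, then fetch one document."""
    i = bisect_left(keys, value)
    if i < len(keys) and keys[i] == value:
        return docs[positions[i]]
    return None


id_keys, id_positions = build_index(documents, "_id")
print(find_by_scan(documents, "_id", 2))                      # {'_id': 2, 'name': 'Phone'}
print(find_by_index(documents, id_keys, id_positions, 2))     # {'_id': 2, 'name': 'Phone'}
```

Both calls return the same document. The scan may touch every document, while the indexed lookup needs only a logarithmic number of comparisons, and that gap grows as the collection grows.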
How MongoDB Uses Collections and Documents:
- When you insert data, MongoDB stores the document in a collection.
- The data is stored in BSON format, which preserves each field's type and lets the server locate fields without parsing text.
- MongoDB organizes the documents in collections, making it simple to query, update, and delete documents.
Conclusion
In this module, we’ve covered the core architectural components of MongoDB:
- Collections: The container for storing documents.
- Documents: The data units that are stored in collections.
- BSON: The binary format used to store documents efficiently in MongoDB.
With this foundational understanding, you are ready to dive deeper into how to interact with MongoDB to perform CRUD operations, implement queries, and work with more advanced features.