Table of Contents
- Introduction
- Why SQL Supports NoSQL-Like Structures
- Arrays in SQL: Definition and Use Cases
- Creating and Querying Array Columns
- Common Array Functions and Operators
- Unnesting Arrays with
unnest()
- Searching Inside Arrays
- Aggregating and Constructing Arrays
- The HSTORE Data Type: Key-Value Storage in SQL
- Creating HSTORE Columns and Inserting Data
- Querying HSTORE Fields
- Updating and Deleting Keys in HSTORE
- Indexing Arrays and HSTORE Columns
- JSON vs HSTORE vs Arrays: When to Use What
- Combining Structured and Semi-Structured Data
- Using Arrays and HSTORE in Joins and Subqueries
- Real-World Example: Product Tags and Attributes
- Best Practices and Anti-Patterns
- Performance Considerations
- Summary and What’s Next
1. Introduction
Traditional relational databases like PostgreSQL are increasingly used for semi-structured and schema-flexible workloads. Features like ARRAY, HSTORE, and JSON/JSONB allow SQL to behave similarly to NoSQL, while preserving the power of relational logic.
2. Why SQL Supports NoSQL-Like Structures
- Reduce over-normalization for simple list/map data
- Flexibly store dynamic or sparse attributes
- Avoid extra lookup tables for small key-value structures
- Improve readability and performance for embedded fields
3. Arrays in SQL: Definition and Use Cases
An array is a data type that can store multiple values in a single column.
Use Cases:
- Tags (
['tech', 'finance']
) - Skills (
['SQL', 'Python']
) - Multi-category products
- Roles/permissions
4. Creating and Querying Array Columns
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT,
interests TEXT[]
);
Inserting data:
INSERT INTO users (name, interests)
VALUES ('Anay', ARRAY['reading', 'coding']);
5. Common Array Functions and Operators
Function/Operator | Description |
---|---|
= ANY(array) | Checks if value is in array |
array_length(array, 1) | Returns length of array |
array_append(array, val) | Adds an element |
array_remove(array, val) | Removes an element |
array_position() | Finds index of a value |
Example:
SELECT * FROM users WHERE 'coding' = ANY(interests);
6. Unnesting Arrays with unnest()
Convert array to rows:
SELECT id, unnest(interests) AS interest
FROM users;
This is useful for filtering, grouping, and aggregating arrays.
7. Searching Inside Arrays
SELECT name FROM users WHERE interests @> ARRAY['reading'];
The @>
operator checks if the array contains all values of the provided array.
8. Aggregating and Constructing Arrays
Build arrays from rows:
SELECT user_id, ARRAY_AGG(skill ORDER BY skill) AS skill_set
FROM user_skills
GROUP BY user_id;
Remove duplicates:
SELECT ARRAY(SELECT DISTINCT unnest(interests)) FROM users;
9. The HSTORE Data Type: Key-Value Storage in SQL
HSTORE
stores sets of key-value pairs within a single column.
Example:
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT,
attributes HSTORE
);
10. Creating HSTORE Columns and Inserting Data
Enable extension (PostgreSQL):
CREATE EXTENSION IF NOT EXISTS hstore;
Insert data:
INSERT INTO products (name, attributes)
VALUES ('Laptop', 'brand => Dell, cpu => i7, ram => 16GB');
11. Querying HSTORE Fields
SELECT name FROM products WHERE attributes -> 'brand' = 'Dell';
Use ->
to get value, ?
to check for existence:
SELECT * FROM products WHERE attributes ? 'cpu';
12. Updating and Deleting Keys in HSTORE
Add a key:
UPDATE products
SET attributes = attributes || 'gpu => RTX 3060'
WHERE name = 'Laptop';
Remove a key:
UPDATE products
SET attributes = delete(attributes, 'ram')
WHERE name = 'Laptop';
13. Indexing Arrays and HSTORE Columns
PostgreSQL supports GIN indexes for both:
CREATE INDEX idx_interests ON users USING GIN (interests);
CREATE INDEX idx_attributes ON products USING GIN (attributes);
This dramatically speeds up @>
and key existence queries.
14. JSON vs HSTORE vs Arrays: When to Use What
Type | Use Case | Pros | Cons |
---|---|---|---|
ARRAY | Simple lists, same type | Lightweight, fast indexing | Limited structure |
HSTORE | Flat key-value pairs (strings only) | Simple, compact | No nesting or arrays |
JSONB | Nested, structured data | Flexible, full JSON support | Heavier, more verbose |
15. Combining Structured and Semi-Structured Data
CREATE TABLE events (
id SERIAL,
user_id INT,
event_type TEXT,
metadata JSONB
);
Example usage:
metadata
: stores page URL, campaign ID, device infoevent_type
: keeps structure queryable
You can combine array, hstore, and json in one schema depending on requirements.
16. Using Arrays and HSTORE in Joins and Subqueries
Join user skills stored as arrays:
SELECT u.name, s.skill
FROM users u, unnest(u.interests) AS s(skill)
JOIN skill_details sd ON sd.name = s.skill;
Join HSTORE values by casting or using ->
as virtual columns.
17. Real-World Example: Product Tags and Attributes
CREATE TABLE catalog (
id SERIAL,
product_name TEXT,
tags TEXT[],
specs HSTORE
);
Use case:
tags
: quick filtering by categoryspecs
: flexible storage of size, color, model, etc.
Query:
SELECT * FROM catalog
WHERE 'electronics' = ANY(tags) AND specs -> 'color' = 'black';
18. Best Practices and Anti-Patterns
✅ Use arrays for short, homogeneous data
✅ Use hstore when keys vary but are flat
✅ Use JSON for nested, structured data
✅ Always index if filtering is frequent
✅ Validate data with CHECK constraints or application logic
x Avoid storing unrelated data in the same JSON or hstore
x Avoid deeply nested arrays (hard to query and maintain)
19. Performance Considerations
- Arrays and HSTORE are faster than JSON for flat structures
- Avoid large arrays in rows — may hit row size limits
- Use GIN indexes but be mindful of write overhead
- Normalize heavily queried data if necessary
- Combine with materialized views for reporting workloads
20. Summary and What’s Next
PostgreSQL’s support for arrays, HSTORE, and other semi-structured types provides SQL with powerful NoSQL-like flexibility — without giving up relational consistency or query optimization. These features let you handle real-world, variable data models efficiently.