Working with NoSQL-like Features in SQL: Arrays, HSTORE, and Semi-Structured Models


Table of Contents

  1. Introduction
  2. Why SQL Supports NoSQL-Like Structures
  3. Arrays in SQL: Definition and Use Cases
  4. Creating and Querying Array Columns
  5. Common Array Functions and Operators
  6. Unnesting Arrays with unnest()
  7. Searching Inside Arrays
  8. Aggregating and Constructing Arrays
  9. The HSTORE Data Type: Key-Value Storage in SQL
  10. Creating HSTORE Columns and Inserting Data
  11. Querying HSTORE Fields
  12. Updating and Deleting Keys in HSTORE
  13. Indexing Arrays and HSTORE Columns
  14. JSON vs HSTORE vs Arrays: When to Use What
  15. Combining Structured and Semi-Structured Data
  16. Using Arrays and HSTORE in Joins and Subqueries
  17. Real-World Example: Product Tags and Attributes
  18. Best Practices and Anti-Patterns
  19. Performance Considerations
  20. Summary and What’s Next

1. Introduction

Traditional relational databases like PostgreSQL are increasingly used for semi-structured and schema-flexible workloads. Features like ARRAY, HSTORE, and JSON/JSONB allow SQL to behave similarly to NoSQL, while preserving the power of relational logic.


2. Why SQL Supports NoSQL-Like Structures

  • Reduce over-normalization for simple list/map data
  • Flexibly store dynamic or sparse attributes
  • Avoid extra lookup tables for small key-value structures
  • Improve readability and performance for embedded fields

3. Arrays in SQL: Definition and Use Cases

An array is a data type that can store multiple values in a single column.

Use Cases:

  • Tags (['tech', 'finance'])
  • Skills (['SQL', 'Python'])
  • Multi-category products
  • Roles/permissions

4. Creating and Querying Array Columns

CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT,
interests TEXT[]
);

Inserting data:

INSERT INTO users (name, interests)
VALUES ('Anay', ARRAY['reading', 'coding']);

5. Common Array Functions and Operators

Function/OperatorDescription
= ANY(array)Checks if value is in array
array_length(array, 1)Returns length of array
array_append(array, val)Adds an element
array_remove(array, val)Removes an element
array_position()Finds index of a value

Example:

SELECT * FROM users WHERE 'coding' = ANY(interests);

6. Unnesting Arrays with unnest()

Convert array to rows:

SELECT id, unnest(interests) AS interest
FROM users;

This is useful for filtering, grouping, and aggregating arrays.


7. Searching Inside Arrays

SELECT name FROM users WHERE interests @> ARRAY['reading'];

The @> operator checks if the array contains all values of the provided array.


8. Aggregating and Constructing Arrays

Build arrays from rows:

SELECT user_id, ARRAY_AGG(skill ORDER BY skill) AS skill_set
FROM user_skills
GROUP BY user_id;

Remove duplicates:

SELECT ARRAY(SELECT DISTINCT unnest(interests)) FROM users;

9. The HSTORE Data Type: Key-Value Storage in SQL

HSTORE stores sets of key-value pairs within a single column.

Example:

CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT,
attributes HSTORE
);

10. Creating HSTORE Columns and Inserting Data

Enable extension (PostgreSQL):

CREATE EXTENSION IF NOT EXISTS hstore;

Insert data:

INSERT INTO products (name, attributes)
VALUES ('Laptop', 'brand => Dell, cpu => i7, ram => 16GB');

11. Querying HSTORE Fields

SELECT name FROM products WHERE attributes -> 'brand' = 'Dell';

Use -> to get value, ? to check for existence:

SELECT * FROM products WHERE attributes ? 'cpu';

12. Updating and Deleting Keys in HSTORE

Add a key:

UPDATE products
SET attributes = attributes || 'gpu => RTX 3060'
WHERE name = 'Laptop';

Remove a key:

UPDATE products
SET attributes = delete(attributes, 'ram')
WHERE name = 'Laptop';

13. Indexing Arrays and HSTORE Columns

PostgreSQL supports GIN indexes for both:

CREATE INDEX idx_interests ON users USING GIN (interests);
CREATE INDEX idx_attributes ON products USING GIN (attributes);

This dramatically speeds up @> and key existence queries.


14. JSON vs HSTORE vs Arrays: When to Use What

TypeUse CaseProsCons
ARRAYSimple lists, same typeLightweight, fast indexingLimited structure
HSTOREFlat key-value pairs (strings only)Simple, compactNo nesting or arrays
JSONBNested, structured dataFlexible, full JSON supportHeavier, more verbose

15. Combining Structured and Semi-Structured Data

CREATE TABLE events (
id SERIAL,
user_id INT,
event_type TEXT,
metadata JSONB
);

Example usage:

  • metadata: stores page URL, campaign ID, device info
  • event_type: keeps structure queryable

You can combine array, hstore, and json in one schema depending on requirements.


16. Using Arrays and HSTORE in Joins and Subqueries

Join user skills stored as arrays:

SELECT u.name, s.skill
FROM users u, unnest(u.interests) AS s(skill)
JOIN skill_details sd ON sd.name = s.skill;

Join HSTORE values by casting or using -> as virtual columns.


17. Real-World Example: Product Tags and Attributes

CREATE TABLE catalog (
id SERIAL,
product_name TEXT,
tags TEXT[],
specs HSTORE
);

Use case:

  • tags: quick filtering by category
  • specs: flexible storage of size, color, model, etc.

Query:

SELECT * FROM catalog
WHERE 'electronics' = ANY(tags) AND specs -> 'color' = 'black';

18. Best Practices and Anti-Patterns

✅ Use arrays for short, homogeneous data
✅ Use hstore when keys vary but are flat
✅ Use JSON for nested, structured data
✅ Always index if filtering is frequent
✅ Validate data with CHECK constraints or application logic

x Avoid storing unrelated data in the same JSON or hstore
x Avoid deeply nested arrays (hard to query and maintain)


19. Performance Considerations

  • Arrays and HSTORE are faster than JSON for flat structures
  • Avoid large arrays in rows — may hit row size limits
  • Use GIN indexes but be mindful of write overhead
  • Normalize heavily queried data if necessary
  • Combine with materialized views for reporting workloads

20. Summary and What’s Next

PostgreSQL’s support for arrays, HSTORE, and other semi-structured types provides SQL with powerful NoSQL-like flexibility — without giving up relational consistency or query optimization. These features let you handle real-world, variable data models efficiently.