Table of Contents
- Introduction
- What Is Time-Series Data?
- Common Use Cases for Time-Based Analysis
- SQL Data Types for Timestamps
- Formatting and Parsing Date/Time Values
- Filtering Data by Time Range
- Truncating Time Granularity (
DATE_TRUNC
,TO_CHAR
) - Grouping by Day, Week, Month
- Calculating Running Totals and Rolling Averages
- Using
LAG()
andLEAD()
for Time Comparisons - Time Gap Detection Between Events
- Handling Missing Time Intervals
- Binning Time into Fixed Buckets
- Time-Based Joins (e.g., Point-in-Time Snapshots)
- Creating Time Windows with
RANGE
orROWS
- Using
INTERVAL
for Dynamic Time Offsets - Indexing Strategies for Time-Series Data
- Real-World Use Case: User Logins Over Time
- Best Practices for Time-Series SQL
- Summary and What’s Next
1. Introduction
Time-series data represents chronological records of events or measurements. SQL provides powerful tools to query, transform, and analyze such data for dashboards, reporting, and forecasting.
2. What Is Time-Series Data?
Time-series data is a sequence of values recorded over time. Examples include:
- Sensor readings (IoT)
- Stock prices
- Web traffic logs
- Temperature records
- Financial transactions
Each row typically contains a timestamp and a measured value.
3. Common Use Cases for Time-Based Analysis
- Trends over time
- Month-over-month comparisons
- User activity heatmaps
- Downtime detection
- Forecasting and seasonal analysis
4. SQL Data Types for Timestamps
Data Type | Description |
---|---|
DATE | Year, month, day (no time) |
TIME | Time only (no date) |
TIMESTAMP | Date and time (with or without zone) |
DATETIME | MySQL-specific full timestamp |
TIMESTAMPTZ | PostgreSQL with time zone |
5. Formatting and Parsing Date/Time Values
PostgreSQL:
sqlCopyEditTO_CHAR(timestamp, 'YYYY-MM-DD')
MySQL:
sqlCopyEditDATE_FORMAT(timestamp, '%Y-%m-%d')
Useful for human-readable outputs or time grouping.
6. Filtering Data by Time Range
sqlCopyEditSELECT * FROM logins
WHERE login_time BETWEEN '2024-01-01' AND '2024-01-31';
Use >=
and <
instead of BETWEEN
for precise range control.
7. Truncating Time Granularity
PostgreSQL:
sqlCopyEditSELECT DATE_TRUNC('day', login_time) AS day, COUNT(*)
FROM logins
GROUP BY 1;
MySQL:
sqlCopyEditSELECT DATE(login_time) AS day, COUNT(*)
FROM logins
GROUP BY day;
8. Grouping by Day, Week, Month
sqlCopyEdit-- Monthly active users
SELECT DATE_TRUNC('month', signup_date) AS month, COUNT(*)
FROM users
GROUP BY month
ORDER BY month;
Use WEEK
, MONTH
, HOUR
, etc. based on granularity needed.
9. Calculating Running Totals and Rolling Averages
sqlCopyEditSELECT date,
SUM(sales) OVER (ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
AVG(sales) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_7day_avg
FROM sales_data;
Window functions make rolling aggregates simple and powerful.
10. Using LAG()
and LEAD()
for Time Comparisons
sqlCopyEditSELECT date, sales,
LAG(sales) OVER (ORDER BY date) AS previous_day_sales,
sales - LAG(sales) OVER (ORDER BY date) AS daily_change
FROM sales_data;
Ideal for day-over-day or period-over-period analysis.
11. Time Gap Detection Between Events
sqlCopyEditSELECT user_id, activity_time,
LAG(activity_time) OVER (PARTITION BY user_id ORDER BY activity_time) AS previous_time,
activity_time - LAG(activity_time) OVER (...) AS time_diff
FROM activity_logs;
Detects inactivity gaps, session timeouts, etc.
12. Handling Missing Time Intervals
Use a calendar or date series table to join:
sqlCopyEdit-- Generate days and LEFT JOIN actual data to fill gaps
SELECT d.day, COUNT(logins.id)
FROM generate_series('2024-01-01'::date, '2024-01-31', interval '1 day') d(day)
LEFT JOIN logins ON DATE(logins.login_time) = d.day
GROUP BY d.day;
13. Binning Time into Fixed Buckets
sqlCopyEditSELECT FLOOR(EXTRACT(EPOCH FROM event_time) / 300) * 300 AS bucket,
COUNT(*)
FROM events
GROUP BY bucket;
This groups data into 5-minute buckets (300s).
14. Time-Based Joins (e.g., Point-in-Time Snapshots)
sqlCopyEditSELECT p.id, p.name, r.region
FROM people p
JOIN regions r ON p.snapshot_time BETWEEN r.start_time AND r.end_time;
Useful for tracking temporal dimensions like active contracts or locations.
15. Creating Time Windows with RANGE
or ROWS
sqlCopyEditSELECT date, SUM(sales) OVER (
ORDER BY date
RANGE BETWEEN INTERVAL '7 days' PRECEDING AND CURRENT ROW
) AS weekly_sales
FROM sales_data;
RANGE
is value-based, while ROWS
is row-count-based.
16. Using INTERVAL
for Dynamic Time Offsets
sqlCopyEditSELECT * FROM purchases
WHERE purchase_time >= NOW() - INTERVAL '30 days';
Dynamic filters for last X days/hours/weeks.
17. Indexing Strategies for Time-Series Data
- Use B-tree indexes on timestamp columns
- Composite indexes:
(timestamp, user_id)
- Consider partitioning large datasets by time range
- Use descending indexes for reverse chronological queries
18. Real-World Use Case: User Logins Over Time
sqlCopyEditSELECT DATE_TRUNC('week', login_time) AS week, COUNT(*) AS logins
FROM user_logins
WHERE login_time >= CURRENT_DATE - INTERVAL '3 months'
GROUP BY week
ORDER BY week;
Used for weekly user activity trends.
19. Best Practices for Time-Series SQL
- Store time in UTC
- Avoid redundant
DATE()
orCAST()
in filters — may prevent index use - Always index time fields in high-volume logs
- Use materialized views for pre-aggregated reports
- Benchmark with real volumes to test query plans
20. Summary and What’s Next
SQL offers powerful techniques for analyzing time-series data — from simple grouping to complex windowing functions. Mastering temporal queries equips you to build advanced analytics on user behavior, trends, forecasts, and system events.