How to Categorize Timestamps into Time Groups Based on Intervals in BigQuery etd_admin, December 13, 2024December 13, 2024 When working with datasets in BigQuery, it is common to encounter timestamps that need to be categorized into specific time intervals. This process is crucial for aggregating data, analyzing patterns, or generating reports. In this article, we will explain how to categorize timestamps into time groups in BigQuery, using simple SQL queries with practical examples. Categorizing timestamps allows you to group events or data points into meaningful intervals, such as hourly, daily, weekly, or custom time ranges. This helps: Summarize large datasets into manageable chunks. Identify trends over specific periods. Simplify reporting and visualization tasks. Steps to Categorize Timestamps into Time Groups BigQuery provides powerful functions to handle timestamps and group them into intervals. The main functions we will use include: TIMESTAMP_TRUNC: Truncate timestamps to a specific part (e.g., hour, day, week). DATETIME_TRUNC: Similar to TIMESTAMP_TRUNC but works with DATETIME data type. TIMESTAMP_DIFF: Calculate the difference between timestamps. Below are examples to demonstrate how to categorize timestamps into time groups in BigQuery. Grouping Timestamps by Hour SELECT TIMESTAMP_TRUNC(event_time, HOUR) AS hour_group, COUNT(*) AS event_count FROM `project_id.dataset_id.event_logs` GROUP BY hour_group ORDER BY hour_group; This query uses TIMESTAMP_TRUNC to truncate the event_time column to the hour level. Each distinct hour is treated as a group, and the number of events in each group is counted. Custom Intervals (e.g., 15-Minute Intervals) SELECT TIMESTAMP_ADD(TIMESTAMP_TRUNC(event_time, HOUR), INTERVAL FLOOR(EXTRACT(MINUTE FROM event_time) / 15) * 15 MINUTE) AS time_group, COUNT(*) AS event_count FROM `project_id.dataset_id.event_logs` GROUP BY time_group ORDER BY time_group; In this query: TIMESTAMP_TRUNC truncates the timestamp to the nearest hour. EXTRACT(MINUTE FROM event_time) retrieves the minute part of the timestamp. FLOOR(... / 15) * 15 calculates the start of the 15-minute interval. TIMESTAMP_ADD adjusts the truncated timestamp to the calculated interval. Grouping by Days of the Week SELECT EXTRACT(DAYOFWEEK FROM event_time) AS day_of_week, COUNT(*) AS event_count FROM `project_id.dataset_id.event_logs` GROUP BY day_of_week ORDER BY day_of_week; Here, EXTRACT(DAYOFWEEK FROM event_time) assigns a numeric value (1 for Sunday, 7 for Saturday) to each day. Custom Interval Ranges (e.g., Shift Hours) SELECT CASE WHEN EXTRACT(HOUR FROM event_time) BETWEEN 6 AND 13 THEN 'Morning Shift' WHEN EXTRACT(HOUR FROM event_time) BETWEEN 14 AND 21 THEN 'Evening Shift' ELSE 'Night Shift' END AS shift_group, COUNT(*) AS event_count FROM `project_id.dataset_id.event_logs` GROUP BY shift_group ORDER BY shift_group; Categorizing timestamps into time groups in BigQuery is straightforward with the use of built-in functions like TIMESTAMP_TRUNC, EXTRACT, and TIMESTAMP_ADD. These functions allow you to create flexible time groupings that suit various analytical needs. Whether you need standard intervals (hours, days) or custom ranges, BigQuery’s SQL capabilities make it easy to handle. BigQuery SQL BigQuerySQLTimestamps