Iterating Through Customers on a 12-Months-Rolling Basis
In this article, we will explore how to iterate through customers on a 12-months-rolling-basis and check if a customer has not ordered in the past 12 months. We’ll examine a few approaches to achieve this goal.
Introduction
To start, let’s define what it means to iterate through customers on a 12-months-rolling basis. This involves selecting each month of the year and checking if the last order from the customer was placed more than 12 months ago.
We’re given an example use case with a table customers
containing order dates and customer IDs.
Existing Query
First, let’s look at the existing query that achieves this for a single month. The query selects the date 12 months after each order date, joins it with the customers table to check if there are any orders in the next 12 months, and filters out the rows where the last order was made more than 12 months ago.
SELECT
(c1.order_date + interval '12 month')::date as churn_date,
c1.customer
FROM customers c1
WHERE c1.order_date BETWEEN '2020-03-01 00:00:00' AND '2020-03-31 23:59:59'
AND NOT EXISTS
(SELECT
c2.customer
FROM customers c2
WHERE c2.order_date BETWEEN '2020-04-01 00:00:00' AND '2021-03-31 23:59:59'
AND c2.customer = c1.customer)
ORDER BY 1,2;
Challenges with Existing Query
The existing query has a major flaw in that it requires running for each month separately. This means we have to write and execute a new query every time the year changes.
Iterating Through Months
To avoid this limitation, we need an approach that allows us to iterate through multiple months at once. In this article, we’ll explore two possible methods:
1. Using a Calendar Table
One way to achieve this is by creating a calendar table with all the dates in each month. We can then join this calendar table with our customers table and filter out the rows where the last order was made more than 12 months ago.
-- Create a calendar table for each month
CREATE TABLE monthly_calendar (
month DATE,
day INT
);
-- Insert data into the calendar table
INSERT INTO monthly_calendar (month, day)
VALUES
('2021-03-01', '31'),
('2021-04-01', '30');
-- Join with customers table and filter out rows where last order was made more than 12 months ago
SELECT
gc.month, c.customer, MAX(c.order_date + interval '12 month')::date as churn_date
FROM customers c CROSS JOIN
monthly_calendar gc
WHERE c.order_date BETWEEN gc.month AND gc.month + INTERVAL '1 month'
GROUP BY gc.month, c.customer
HAVING DATE_TRUNC('month', MAX(c.order_date)) = DATE_TRUNC('month', gc.month) - INTERVAL '12 month';
However, this approach has its own set of limitations and may not be efficient for large datasets.
2. Using Generated Series
Another way to achieve this is by using the GENERATE_SERIES
function, which allows us to generate a series of dates for each month.
-- Use generated series to iterate through months
SELECT
gs.month, c.customer, MAX(c.order_date + interval '12 month')::date as churn_date
FROM customers c CROSS JOIN
GENERATE_SERIES('2021-03-01'::date, '2021-04-01'::date, interval '1 month') gs(month)
WHERE c.order_date < gs.month + interval '1 month'
GROUP BY gs.month, c.customer
HAVING DATE_TRUNC('month', MAX(c.order_date)) = DATE_TRUNC('month', gs.month) - interval '12 month';
This approach is more efficient and can handle large datasets. However, it may not be as intuitive to read as the calendar table method.
Conclusion
In conclusion, we have explored two possible methods for iterating through customers on a 12-months-rolling basis: using a calendar table and using generated series. Both approaches have their own set of limitations and advantages. The choice of which approach to use depends on the specific requirements of your project and the characteristics of your dataset.
Advice
When working with dates and times in SQL, it’s essential to consider the following best practices:
- Always specify the date format when creating tables or inserting data.
- Use
DATE_TRUNC
instead ofMAX
when comparing dates. - Consider using calendar tables or generated series to simplify date-based queries.
By following these best practices, you can write more efficient and maintainable SQL code that takes advantage of the latest features and improvements in your database system.
Last modified on 2023-10-23