Iterating Through Customers on a 12-Months-Rolling Basis: Two Approaches to Simplify Your Queries

Iterating Through Customers on a 12-Months-Rolling Basis

In this article, we will explore how to iterate through customers on a 12-months-rolling-basis and check if a customer has not ordered in the past 12 months. We’ll examine a few approaches to achieve this goal.

Introduction

To start, let’s define what it means to iterate through customers on a 12-months-rolling basis. This involves selecting each month of the year and checking if the last order from the customer was placed more than 12 months ago.

We’re given an example use case with a table customers containing order dates and customer IDs.

Existing Query

First, let’s look at the existing query that achieves this for a single month. The query selects the date 12 months after each order date, joins it with the customers table to check if there are any orders in the next 12 months, and filters out the rows where the last order was made more than 12 months ago.

SELECT 
    (c1.order_date + interval '12 month')::date as churn_date,
    c1.customer
FROM customers c1
WHERE c1.order_date BETWEEN '2020-03-01 00:00:00' AND '2020-03-31 23:59:59'
    
AND NOT EXISTS
    
   (SELECT 
    c2.customer
    FROM customers c2
    WHERE c2.order_date BETWEEN '2020-04-01 00:00:00' AND '2021-03-31 23:59:59'
    AND c2.customer = c1.customer)

ORDER BY 1,2;

Challenges with Existing Query

The existing query has a major flaw in that it requires running for each month separately. This means we have to write and execute a new query every time the year changes.

Iterating Through Months

To avoid this limitation, we need an approach that allows us to iterate through multiple months at once. In this article, we’ll explore two possible methods:

1. Using a Calendar Table

One way to achieve this is by creating a calendar table with all the dates in each month. We can then join this calendar table with our customers table and filter out the rows where the last order was made more than 12 months ago.

-- Create a calendar table for each month
CREATE TABLE monthly_calendar (
    month DATE,
    day INT
);

-- Insert data into the calendar table
INSERT INTO monthly_calendar (month, day)
VALUES 
('2021-03-01', '31'),
('2021-04-01', '30');

-- Join with customers table and filter out rows where last order was made more than 12 months ago
SELECT 
    gc.month, c.customer, MAX(c.order_date + interval '12 month')::date as churn_date
FROM customers c CROSS JOIN
     monthly_calendar gc
WHERE c.order_date BETWEEN gc.month AND gc.month + INTERVAL '1 month'
GROUP BY gc.month, c.customer
HAVING DATE_TRUNC('month', MAX(c.order_date)) = DATE_TRUNC('month', gc.month) - INTERVAL '12 month';

However, this approach has its own set of limitations and may not be efficient for large datasets.

2. Using Generated Series

Another way to achieve this is by using the GENERATE_SERIES function, which allows us to generate a series of dates for each month.

-- Use generated series to iterate through months
SELECT 
    gs.month, c.customer, MAX(c.order_date + interval '12 month')::date as churn_date
FROM customers c CROSS JOIN
     GENERATE_SERIES('2021-03-01'::date, '2021-04-01'::date, interval '1 month') gs(month)
WHERE c.order_date < gs.month + interval '1 month'
GROUP BY gs.month, c.customer
HAVING DATE_TRUNC('month', MAX(c.order_date)) = DATE_TRUNC('month', gs.month) - interval '12 month';

This approach is more efficient and can handle large datasets. However, it may not be as intuitive to read as the calendar table method.

Conclusion

In conclusion, we have explored two possible methods for iterating through customers on a 12-months-rolling basis: using a calendar table and using generated series. Both approaches have their own set of limitations and advantages. The choice of which approach to use depends on the specific requirements of your project and the characteristics of your dataset.

Advice

When working with dates and times in SQL, it’s essential to consider the following best practices:

  • Always specify the date format when creating tables or inserting data.
  • Use DATE_TRUNC instead of MAX when comparing dates.
  • Consider using calendar tables or generated series to simplify date-based queries.

By following these best practices, you can write more efficient and maintainable SQL code that takes advantage of the latest features and improvements in your database system.


Last modified on 2023-10-23