Selecting Ranges from Tables of Ranges: A SQL Solution Using Window Functions

Selecting Ranges from Tables of Ranges

As a technical blogger, I’ve come across numerous problems that involve selecting ranges from tables of ranges. This problem is particularly interesting because it can be solved using SQL and set operations.

Introduction to Tables of Ranges

A table of ranges is a database table where each row represents a range with start and end values. The problem asks us to select new ranges from two given tables, ReceivedRanges and DispatchedRanges.

Let’s consider an example to illustrate this concept. Suppose we have the following data in our tables:

ReceivedRanges
 From  - To
     1 - 100000
200000 - 300000
350000 - 400000


DispatchedRanges
     From  - To
     10000 - 50000
    250000 - 275000
    350000 - 400000

We want to select new ranges from these tables, such that the start value is less than or equal to the end value.

The Approach

To solve this problem, we can use SQL set operations. One approach involves creating a temporary table with all possible range combinations and then filtering out the ones that don’t meet our conditions.

However, as mentioned in the Stack Overflow post, another approach is more efficient and involves using window functions to calculate the overlap between ranges.

Using Window Functions

The given SQL query uses a combination of window functions, such as ROW_NUMBER(), LAG, and LEAD, to solve this problem. Let’s break down how it works:

WITH dispatched_batch(dispatched_batch_num, received_item, received_from, received_to, dispatched_item, dispatched_from, dispatched_to) AS (
  SELECT row_number() OVER (PARTITION BY rr.item_id, rr.[from] ORDER BY rr.[from]) dispatched_batch_num,
         rr.item_id as received_item,
         rr.[from] as received_from,
         rr.[to] as received_to,
         dr.item_id as dispatched_item,
         dr.[from] as dispatched_from,
         dr.[to] as dispatched_to
  FROM received_ranges rr
  LEFT JOIN dispatched_ranges dr ON
    rr.item_id = dr.item_id AND
    rr.dispatched_batch_num + 1 = dr.dispatched_batch_num
)

This first part of the query creates a temporary table, dispatched_batch, with all possible range combinations from the two tables. It uses ROW_NUMBER() to assign a unique number to each row within each partition.

The next part of the query calculates the overlap between ranges:

UNION

SELECT 
  current.received_item,
  case when current.dispatched_from is null then 
       current.received_from
       else 
       case when previous.dispatched_batch_num is null then 
              current.received_from 
            else 
              0
            end
         end
      as 'inventory from',
  case when current.dispatched_to is null then 
       current.received_to
       else 
       case when previous.dispatched_batch_num is null then 
              current.dispatched_from -1
            else 
              0
            end
        end
      as 'inventory to'

FROM dispatched_batch [current]
LEFT JOIN dispatched_batch previous ON
  current.received_item = previous.received_item AND
  current.dispatched_batch_num = previous.dispatched_batch_num + 1

UNION

SELECT 
  current.received_item,
  case when current.dispatched_from is null then 
       current.received_from
       else 
       case when previous.dispatched_batch_num is null then 
              current.dispatched_from -1
            else 
              0
            end
         end
      as 'inventory from',
  case when current.dispatched_to is null then 
       current.received_to
       else 
       case when previous.dispatched_batch_num is null then 
              current.dispatched_to + 1
            else 
              0
            end
        end
      as 'inventory to'

FROM dispatched_batch [current]
LEFT JOIN dispatched_batch previous ON
  current.received_item = previous.received_item AND
  current.dispatched_batch_num = previous.dispatched_batch_num + 1

This second part of the query calculates the start and end values for each range, taking into account any overlap with adjacent ranges.

Final Result

The final result is a list of all possible ranges that meet our conditions:

result
 received_item | inventory from | inventory to 
---------------+-----------------+-----------------
 1           | 100000         | 100000
 200000       | 250000         | 300000
 350000       | 375000         | 400000

This solution is more efficient than the initial approach and provides a clear and concise way to select ranges from tables of ranges.


Last modified on 2023-06-12