Sorting by Given “Rank” of Column Values
Introduction
Sorting data based on specific conditions is a common requirement in many applications. In this article, we will explore how to sort rows by giving a certain “rank” to column values.
We’ll start with a sample table and explain the problem statement. Then, we’ll dive into the SQL query solution provided and analyze it step-by-step. Finally, we’ll discuss additional considerations such as handling many other values for risk and exploring alternative data types like enum
.
Problem Statement
Suppose you have a table like this:
risk | category |
---|---|
Low | A |
Medium | B |
High | C |
Medium | A |
Low | B |
High | A |
Low | C |
Low | E |
Low | D |
High | B |
You need to sort rows by category
, but first, based on the value of risk
. The desired result should look like this:
risk | category |
---|---|
Low | A |
Low | B |
Low | C |
Low | D |
Low | E |
Medium | A |
Medium | B |
High | A |
High | B |
High | C |
SQL Query Solution
The provided SQL query solution is:
SELECT *
FROM some_table
ORDER BY CASE risk
WHEN 'Low' THEN 0
WHEN 'Medium' THEN 1
WHEN 'High' THEN 2
-- rest defaults to NULL and sorts last
END
, category;
A “switched” CASE
is shorter and slightly cheaper. In the absence of an ELSE
branch, all remaining cases default to NULL
, and NULL
sorts last in default ascending sort order. So you don’t need to do anything extra.
Handling Many Other Values
While the provided query seems okay for many other values of risk, it’s not ideal if there are many more values that need to be handled individually.
In this scenario, we can create an additional table riskrank
with a primary key
on the risk
column and a real
value. We can then insert each unique value of risk
into this table along with its corresponding rank.
Here’s how you can do it:
CREATE TABLE riskrank (
risk text PRIMARY KEY
, riskrank real
);
INSERT INTO riskrank VALUES
('Low' , 0)
, ('Medium', 1)
, ('High' , 2)
-- many more?
;
Then, your query becomes:
SELECT s.*
FROM some_table s
LEFT JOIN risk_rank rr USING (risk)
ORDER BY rr.riskrank, s.category;
The LEFT JOIN
ensures that missing entries in the riskrank
table do not eliminate rows from the result set.
Enum Data Type
enum
data type is another possible alternative for handling this problem. In an enum
, values are sorted based on their order of definition (not how they are spelled). This means that enum values only occupy 4 bytes on disk (real
internally), are fast, and enforce valid values implicitly.
However, we should consider the following:
- Enum types are primarily intended for static sets of values.
- Although there is support for adding new values to an existing enum type, and for renaming values (see
ALTER TYPE
), existing values cannot be removed from an enum type, nor can the sort ordering of such values be changed, short of dropping and re-creating the enum type.
Therefore, we should only consider using an enum
if the sort order of your values is immutable.
Last modified on 2025-05-07