Sorting Data by Risk Level: A Comprehensive Guide to SQL Solutions

Sorting by Given “Rank” of Column Values

Introduction

Sorting data based on specific conditions is a common requirement in many applications. In this article, we will explore how to sort rows by giving a certain “rank” to column values.

We’ll start with a sample table and explain the problem statement. Then, we’ll dive into the SQL query solution provided and analyze it step-by-step. Finally, we’ll discuss additional considerations such as handling many other values for risk and exploring alternative data types like enum.

Problem Statement

Suppose you have a table like this:

riskcategory
LowA
MediumB
HighC
MediumA
LowB
HighA
LowC
LowE
LowD
HighB

You need to sort rows by category, but first, based on the value of risk. The desired result should look like this:

riskcategory
LowA
LowB
LowC
LowD
LowE
MediumA
MediumB
HighA
HighB
HighC

SQL Query Solution

The provided SQL query solution is:

SELECT *
FROM   some_table
ORDER  BY CASE risk
            WHEN 'Low'    THEN 0
            WHEN 'Medium' THEN 1
            WHEN 'High'   THEN 2
            -- rest defaults to NULL and sorts last
          END
        , category;

A “switched” CASE is shorter and slightly cheaper. In the absence of an ELSE branch, all remaining cases default to NULL, and NULL sorts last in default ascending sort order. So you don’t need to do anything extra.

Handling Many Other Values

While the provided query seems okay for many other values of risk, it’s not ideal if there are many more values that need to be handled individually.

In this scenario, we can create an additional table riskrank with a primary key on the risk column and a real value. We can then insert each unique value of risk into this table along with its corresponding rank.

Here’s how you can do it:

CREATE TABLE riskrank (
  risk text PRIMARY KEY
, riskrank real
);

INSERT INTO riskrank VALUES
  ('Low'   , 0)
, ('Medium', 1)
, ('High'  , 2)
-- many more?
; 

Then, your query becomes:

SELECT s.*
FROM   some_table s
LEFT   JOIN risk_rank rr USING (risk)
ORDER  BY rr.riskrank, s.category;

The LEFT JOIN ensures that missing entries in the riskrank table do not eliminate rows from the result set.

Enum Data Type

enum data type is another possible alternative for handling this problem. In an enum, values are sorted based on their order of definition (not how they are spelled). This means that enum values only occupy 4 bytes on disk (real internally), are fast, and enforce valid values implicitly.

However, we should consider the following:

  • Enum types are primarily intended for static sets of values.
  • Although there is support for adding new values to an existing enum type, and for renaming values (see ALTER TYPE), existing values cannot be removed from an enum type, nor can the sort ordering of such values be changed, short of dropping and re-creating the enum type.

Therefore, we should only consider using an enum if the sort order of your values is immutable.


Last modified on 2025-05-07