Understanding and Implementing ROW_NUMBER to Select Non-Duplicate Rows from a Table
In this article, we will explore how to use the ROW_NUMBER function in SQL Server to select non-duplicate rows from a table. We will also discuss the error that occurs when the integer result of DATEDIFF is stored in a column declared as DATE.
Introduction
The ROW_NUMBER function assigns a sequential number, starting at 1, to each row within a partition of a result set. Combined with the PARTITION BY clause, it groups rows that share the same values in the chosen columns, so every row after the first in a group can be treated as a duplicate.
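To make the numbering concrete, here is a minimal sketch that runs without any table. The inline rows and the column names category and item are hypothetical and chosen only for illustration.

```sql
-- Hypothetical inline data; not part of the article's tables.
SELECT
    v.category,
    v.item,
    -- Numbering restarts at 1 for every distinct category.
    ROW_NUMBER() OVER (PARTITION BY v.category ORDER BY v.item) AS RowNum
FROM (VALUES
    ('fruit', 'apple'),
    ('fruit', 'banana'),
    ('veg',   'carrot')
) AS v (category, item)
ORDER BY v.category, v.item;
```

Here apple and banana are numbered 1 and 2 within the fruit partition, and carrot is numbered 1 because it starts a new partition.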
In this article, we will use an example table named Emp_demo3, which contains employee information. The goal is to select non-duplicate rows from this table based on specific columns and then calculate the date difference between two dates.
Understanding the Problem
When we try to store the difference in days between two dates, we encounter an error. The cause is not the calculation itself: DATEDIFF accepts two DATE values and returns an integer. The error arises because the column that receives the result was declared as DATE, and SQL Server does not implicitly convert an int to a date.
The solution is to give the target column a type that matches the result, such as INT.
Creating the Table and Data
First, let’s create the Emp_demo3 table and insert some sample data.
CREATE TABLE Emp_demo3 (
emp_ID INT,
emp_Name NVARCHAR (50),
emp_sal_K INT,
emp_manager INT,
joining_date DATE,
last_time DATE)
INSERT INTO Emp_demo3 VALUES (1,'Ali', 200,2,'2010-01-28','2015-05-09')
INSERT INTO Emp_demo3 VALUES (2,'Zaid', 770,4,'2008-01-28','2015-05-09')
INSERT INTO Emp_demo3 VALUES (3,'Mohd', 1140,2,'2007-01-28','2015-05-09')
INSERT INTO Emp_demo3 VALUES (4,'LILY', 770,Null,'2013-01-28','2015-05-09')
INSERT INTO Emp_demo3 VALUES (5,'John', 1240,6,'2016-01-28','2015-05-09')
INSERT INTO Emp_demo3 VALUES (6,'Mike', 1140,4,'2018-01-28','2015-05-09')
INSERT INTO Emp_demo3 VALUES (5,'John', 1240,6,'2017-01-28','2015-05-09')
INSERT INTO Emp_demo3 VALUES (3,'Mohd', 1140,2,'2010-01-28','2015-05-09')
Calculating Date Difference
We will use the DATEDIFF function to calculate the date difference in days between two dates.
ALTER TABLE Emp_demo3
add date_diff DATE
UPDATE Emp_demo3
SET date_diff = DATEDIFF(DAY, joining_date, last_time)
However, when we try to execute this code, we get an error message: “Operand type clash: int is incompatible with date”.
This is because the DATEDIFF function returns an integer value representing the number of days between the two dates, while the date_diff column was declared as DATE.
To solve this problem, the date_diff column should be declared with an integer type such as INT rather than DATE.
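One way to make the types agree is to store the result in an INT column, since DATEDIFF returns an integer. The following is a minimal sketch, assuming the date_diff column was already added as DATE by the statement above and holds nothing worth keeping. GO is the batch separator used by tools such as SSMS and sqlcmd, not a T-SQL statement; it is assumed here so that the UPDATE is compiled only after the new column exists.

```sql
-- Remove the column that was created with the wrong type (assumed to exist).
ALTER TABLE Emp_demo3
DROP COLUMN date_diff;
GO

-- Re-create it as INT, matching the return type of DATEDIFF.
ALTER TABLE Emp_demo3
ADD date_diff INT;
GO

-- The assignment now succeeds because both sides are integers.
UPDATE Emp_demo3
SET date_diff = DATEDIFF(DAY, joining_date, last_time);
```

Note that DATEDIFF counts from its second argument to its third, so rows whose joining_date is later than last_time receive a negative value.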
Using ROW_NUMBER
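We can use the ROW_NUMBER function in combination with the PARTITION BY clause to number the rows that share the same values in certain columns. Keeping only the rows numbered 1 then returns one row per group. The example uses a table variable, @Emp_demo2, that holds the same employees without the date columns.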
Here’s how we can do it:
DECLARE @Emp_demo2 TABLE (
emp_ID INT,
emp_Name NVARCHAR (50),
emp_sal_K INT,
emp_manager INT)
INSERT INTO @Emp_demo2 VALUES (1,'Ali', 200,2)
INSERT INTO @Emp_demo2 VALUES (2,'Zaid', 770,4)
INSERT INTO @Emp_demo2 VALUES (3,'Mohd', 1140,2)
INSERT INTO @Emp_demo2 VALUES (4,'LILY', 770,NULL)
INSERT INTO @Emp_demo2 VALUES (5,'John', 1240,6)
INSERT INTO @Emp_demo2 VALUES (6,'Mike', 1140,4)
INSERT INTO @Emp_demo2 VALUES (5,'John', 1240,6)
INSERT INTO @Emp_demo2 VALUES (3,'Mohd', 1140,2)
-- Number the rows inside each group of identical values, then keep row 1 only
SELECT * FROM
(
SELECT
t.emp_ID
, t.emp_Name
, t.emp_sal_K
, t.emp_manager
, ROW_NUMBER() OVER (PARTITION BY t.emp_Name, t.emp_sal_K, t.emp_manager
ORDER BY t.emp_Name) AS RowNum
FROM @Emp_demo2 AS t
) AS q
WHERE q.RowNum = 1
ORDER BY q.emp_ID
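In this code snippet, the ROW_NUMBER function assigns a sequential number to each row within a partition of the result set. The PARTITION BY clause places rows with the same emp_Name, emp_sal_K, and emp_manager values in the same partition, so the filter RowNum = 1 keeps exactly one row from each group of duplicates. Because the ORDER BY inside OVER uses a partitioning column, which duplicate is numbered 1 is arbitrary; that is harmless here, since the duplicate rows in the sample data are identical in every column.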
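The same pattern can be applied to Emp_demo3, where the duplicate rows differ in joining_date. The sketch below assumes we want to keep the row with the earliest joining_date for each employee; that choice is an illustration, not a rule stated in this article. It also computes the day difference directly in the SELECT, so no stored column is needed. The alias days_between is hypothetical.

```sql
SELECT
    q.emp_ID,
    q.emp_Name,
    q.emp_sal_K,
    q.emp_manager,
    q.joining_date,
    q.last_time,
    -- Computed on the fly; DATEDIFF returns an int.
    DATEDIFF(DAY, q.joining_date, q.last_time) AS days_between
FROM (
    SELECT
        t.emp_ID,
        t.emp_Name,
        t.emp_sal_K,
        t.emp_manager,
        t.joining_date,
        t.last_time,
        -- Ordering by joining_date makes the choice of row 1 deterministic
        -- whenever the duplicates have different joining dates.
        ROW_NUMBER() OVER (
            PARTITION BY t.emp_ID, t.emp_Name, t.emp_sal_K, t.emp_manager
            ORDER BY t.joining_date
        ) AS RowNum
    FROM Emp_demo3 AS t
) AS q
WHERE q.RowNum = 1
ORDER BY q.emp_ID;
```

Rows with a NULL emp_manager fall into the same partition as each other, because PARTITION BY treats NULL values as a single group.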
Last modified on 2024-08-04