Understanding the Problem
The goal is to retrieve an employee's work details (start and end dates, hours worked, and hourly rate) for a specific employee ID. The data is stored in three tables: employees_list, hourlyRates, and workingHours. We need to join these tables on their common columns and filter the results for that employee ID.
Table Creation and Data Insertion
First, let’s create the necessary tables and insert some sample data:
CREATE TABLE employees_list (
    employeeID int identity(1,1),
    employeeName varchar(25)
);
INSERT INTO employees_list VALUES ('Kevin'), ('Charles');

CREATE TABLE hourlyRates (
    employeeID int,
    rate int,
    rateDate date
);
INSERT INTO hourlyRates VALUES
    (1, 28, '2016-01-01'),
    (1, 39, '2016-02-01'),
    (2, 43, '2016-01-01'),
    (2, 57, '2016-02-01');

CREATE TABLE workingHours (
    employeeID int,
    startdate datetime,
    enddate datetime
);
INSERT INTO workingHours VALUES
    (1, '2016-01-01 09:00', '2016-01-01 17:00'),
    (1, '2016-01-02 09:00', '2016-01-02 17:00'),
    (1, '2016-02-01 10:00', '2016-02-01 16:00'),
    (1, '2016-02-02 11:00', '2016-02-02 13:00'),
    (2, '2016-01-01 10:00', '2016-01-01 16:00'),
    (2, '2016-01-02 08:00', '2016-01-02 14:00'),
    (2, '2016-02-01 14:00', '2016-02-01 19:00'),
    (2, '2016-02-02 13:00', '2016-02-02 16:00');
Initial Query
The initial query attempts to join the three tables and retrieve the required data for employee ID 1:
SELECT
    workingHours.employeeID,
    employeeName,
    startdate, enddate,
    DATEDIFF(HOUR, startdate, enddate) AS 'Hours Worked',
    rate AS 'Hourly Rate'
FROM
    hourlyRates, workingHours, employees_list
WHERE
    hourlyRates.employeeID = workingHours.employeeID
    AND employees_list.employeeID = workingHours.employeeID
    AND workingHours.employeeID = 1
Problem with the Initial Query
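The query returns duplicated results because it joins the tables on employeeID only. Every row in workingHours is therefore paired with every row in hourlyRates for the same employee, so each shift appears once per rate. To get one row per shift, the join must also relate rateDate to the shift dates.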
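The duplication is easy to confirm by counting how many rate rows each shift is paired with when only employeeID is matched. A minimal sketch against the sample tables above (the matchingRates alias is illustrative, not part of the original queries):

```sql
-- For each shift of employee 1, count the rate rows it is joined to
-- when the join condition is employeeID alone.
SELECT wh.startdate,
       COUNT(*) AS matchingRates
FROM workingHours wh
JOIN hourlyRates hr ON hr.employeeID = wh.employeeID
WHERE wh.employeeID = 1
GROUP BY wh.startdate;
```

With the sample data, employee 1 has two rate rows, so each of the four shifts is paired with both of them and the initial query returns eight rows instead of four.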
Modified Query with Additional Join Conditions
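To fix the duplication issue, we add a condition that keeps only the rate whose rateDate falls between the first day of the shift's start month and the first day of its end month. This relies on rates taking effect on the first of a month, as they do in the sample data: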
SELECT
    workingHours.employeeID,
    employeeName,
    startdate, enddate,
    DATEDIFF(HOUR, startdate, enddate) AS 'Hours Worked',
    rate AS 'Hourly Rate'
FROM
    hourlyRates, workingHours, employees_list
WHERE
    hourlyRates.employeeID = workingHours.employeeID
    AND employees_list.employeeID = workingHours.employeeID
    AND (hourlyRates.rateDate
        BETWEEN
            DATEFROMPARTS(DATEPART(YEAR, workingHours.startdate), DATEPART(MONTH, workingHours.startdate), 1)
            AND DATEFROMPARTS(DATEPART(YEAR, workingHours.enddate), DATEPART(MONTH, workingHours.enddate), 1))
    AND workingHours.employeeID = 1
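To see what the date condition compares against, it can help to evaluate the month-start expression for a single shift. A small sketch, using a hypothetical variable @shiftStart that holds one of the sample start times:

```sql
-- @shiftStart is an illustrative variable, not part of the original query.
DECLARE @shiftStart datetime = '2016-02-02 11:00';

-- Truncate the shift start to the first day of its month.
SELECT DATEFROMPARTS(DATEPART(YEAR, @shiftStart),
                     DATEPART(MONTH, @shiftStart),
                     1) AS monthStart;  -- returns 2016-02-01
```

For this shift both BETWEEN bounds are 2016-02-01, so only the rate dated 2016-02-01 matches and the January rate is excluded.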
Limitations of the Modified Query
Although the modified query resolves the duplication issue, it has some limitations:
- The DATEFROMPARTS calls and date calculations are evaluated for every workingHours row, which may hurt performance on large tables.
- The date condition is not SARGable with respect to workingHours: startdate and enddate are wrapped in function calls, so an index on those columns cannot be used to seek on that condition.
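One possible alternative, which is not the approach taken in the rest of this article, is to look up the most recent rate on or before each shift with CROSS APPLY and TOP (1). The sketch below assumes that a rate stays in effect from its rateDate until the next rate for the same employee, which the sample data suggests but the schema does not enforce; the alias r is illustrative:

```sql
-- Assumption: a rate applies from rateDate until the employee's next rateDate.
SELECT wh.employeeID,
       el.employeeName,
       wh.startdate,
       wh.enddate,
       DATEDIFF(HOUR, wh.startdate, wh.enddate) AS [Hours Worked],
       r.rate AS [Hourly Rate]
FROM workingHours wh
JOIN employees_list el ON el.employeeID = wh.employeeID
CROSS APPLY (
    -- Latest rate that took effect on or before the day the shift started.
    SELECT TOP (1) hr.rate
    FROM hourlyRates hr
    WHERE hr.employeeID = wh.employeeID
      AND hr.rateDate <= CAST(wh.startdate AS date)
    ORDER BY hr.rateDate DESC
) r
WHERE wh.employeeID = 1;
```

Here rateDate is compared directly, without a function around it, and the query does not depend on rates starting on the first of a month. Shifts that have no earlier rate are dropped by CROSS APPLY; OUTER APPLY would keep them with a NULL rate.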
Optimizing the Query using SARGable Join Conditions
To avoid recomputing the month-start dates inside the join, we can pre-compute them once into a temporary table (optionally indexed) and join on the stored columns instead. The script below creates and populates the temporary table, runs the query against it, and then drops it:
CREATE TABLE #TempWorkingHours (
    employeeID int,
    startDateYearMonth date,
    endDateMonthYearMonth date,
    startdate datetime,
    enddate datetime
)
INSERT INTO #TempWorkingHours (employeeID, startDateYearMonth, endDateMonthYearMonth, startdate, enddate)
SELECT wh.employeeID,
    DATEFROMPARTS(DATEPART(YEAR, wh.startdate), DATEPART(MONTH, wh.startdate), 1) AS startDateYearMonth,
    DATEFROMPARTS(DATEPART(YEAR, wh.enddate), DATEPART(MONTH, wh.enddate), 1) AS endDateMonthYearMonth,
    wh.startdate,
    wh.enddate
FROM workingHours wh
SELECT el.employeeID,
    el.employeeName,
    twh.startDateYearMonth,
    twh.endDateMonthYearMonth,
    twh.startdate,
    twh.enddate,
    DATEDIFF(HOUR, twh.startdate, twh.enddate) AS [Hours Worked],
    hr.rate AS [Hourly Rate]
FROM employees_list el
JOIN hourlyRates hr ON el.employeeID = hr.employeeID
JOIN #TempWorkingHours twh ON el.employeeID = twh.employeeID
    AND (hr.rateDate
        BETWEEN
            twh.startDateYearMonth
            AND twh.endDateMonthYearMonth
    )
WHERE el.employeeID = 1
DROP TABLE #TempWorkingHours
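If the temporary table is large, the pre-computed columns can also be indexed so the join has something to seek on. A hedged sketch; the index name is hypothetical, and the statement would need to run after the INSERT and before the final SELECT in the script above:

```sql
-- Hypothetical index name. Key columns match the join condition;
-- the shift times are included so the final query can be answered from the index.
CREATE INDEX IX_TempWorkingHours_employee_months
    ON #TempWorkingHours (employeeID, startDateYearMonth, endDateMonthYearMonth)
    INCLUDE (startdate, enddate);
```

Whether the optimizer actually uses the index depends on table sizes and statistics, so it is worth checking the execution plan before keeping it.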
Using New Join Syntax and Optimizing for SARGability
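The same logic can also be written with explicit JOIN ... ON syntax (the "new", ANSI-92 join syntax), which keeps each join condition next to the tables it relates and leaves the WHERE clause for filtering: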
SELECT el.employeeID,
    el.employeeName,
    wh.startdate,
    wh.enddate,
    DATEDIFF(HOUR, wh.startdate, wh.enddate) AS [Hours Worked],
    hr.rate AS [Hourly Rate]
FROM employees_list el
JOIN hourlyRates hr ON el.employeeID = hr.employeeID
JOIN workingHours wh ON hr.employeeID = wh.employeeID
    AND (hr.rateDate
        BETWEEN
            DATEFROMPARTS(DATEPART(YEAR, wh.startdate), DATEPART(MONTH, wh.startdate), 1)
            AND DATEFROMPARTS(DATEPART(YEAR, wh.enddate), DATEPART(MONTH, wh.enddate), 1)
    )
WHERE el.employeeID = 1
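This query uses explicit JOIN syntax, which makes the join conditions easier to read but is not expected to change the execution plan by itself. It still computes DATEFROMPARTS for every workingHours row, so to benefit from indexes the month-start values have to be pre-processed, for example in a temporary table as in the previous section or in indexed computed columns.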
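One way to get indexes on pre-processed columns without a temporary table is to add persisted computed columns to workingHours and index them. This is a sketch under assumptions: the column names startMonth and endMonth and the index name are hypothetical, and it assumes SQL Server accepts the expressions as deterministic, which PERSISTED requires. GO is the batch separator used by tools such as SSMS and sqlcmd.

```sql
-- Hypothetical column names: month-start dates stored alongside each shift.
ALTER TABLE workingHours ADD
    startMonth AS DATEFROMPARTS(DATEPART(YEAR, startdate), DATEPART(MONTH, startdate), 1) PERSISTED,
    endMonth   AS DATEFROMPARTS(DATEPART(YEAR, enddate), DATEPART(MONTH, enddate), 1) PERSISTED;
GO

-- Hypothetical index name: supports the employeeID and month-range join.
CREATE INDEX IX_workingHours_employee_months
    ON workingHours (employeeID, startMonth, endMonth)
    INCLUDE (startdate, enddate);
GO

-- The join now compares plain columns only.
SELECT el.employeeID,
       el.employeeName,
       wh.startdate,
       wh.enddate,
       DATEDIFF(HOUR, wh.startdate, wh.enddate) AS [Hours Worked],
       hr.rate AS [Hourly Rate]
FROM employees_list el
JOIN hourlyRates hr ON el.employeeID = hr.employeeID
JOIN workingHours wh ON hr.employeeID = wh.employeeID
    AND hr.rateDate BETWEEN wh.startMonth AND wh.endMonth
WHERE el.employeeID = 1;
```

The trade-off is extra storage and a small cost on every insert or update to workingHours, in exchange for not repeating the date arithmetic in each query.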
Conclusion
In this article, we discussed how to avoid duplicate rows when displaying employee work details (start and end dates, hours worked, and hourly rate) in SQL Server. We explored several approaches: adding a date condition built with DATEFROMPARTS, pre-computing the month-start dates in a temporary table, and rewriting the query with explicit JOIN syntax. By understanding the limitations of each approach and keeping join conditions SARGable where possible, we can improve performance and ensure accurate results.
Last modified on 2025-03-28