Optimizing SQL Queries to Retrieve Employee Work Details

Understanding the Problem

The goal is to retrieve, for a specific employee ID, each shift's start and end time, the hours worked, and the hourly rate that applied. The data is stored in three tables: employees_list, hourlyRates, and workingHours. We need to join these tables on their common columns and filter the results for one employee ID.

Table Creation and Data Insertion

First, let’s create the necessary tables and insert some sample data:

CREATE TABLE employees_list (
    employeeID int identity(1,1),
    employeeName varchar(25)
)

INSERT INTO employees_list VALUES ('Kevin'),('Charles')

CREATE TABLE hourlyRates (
    employeeID int,
    rate int,
    rateDate date
)

INSERT INTO hourlyRates VALUES (1, 28, '2016-01-01'), 
                              (1, 39, '2016-02-01'),  
                              (2, 43, '2016-01-01'),  
                              (2, 57, '2016-02-01')

CREATE TABLE workingHours (
    employeeID int,
    startdate datetime,
    enddate datetime
)

INSERT INTO workingHours VALUES (1, '2016-01-01 09:00', '2016-01-01 17:00'), 
                               (1, '2016-01-02 09:00', '2016-01-02 17:00'), 
                               (1, '2016-02-01 10:00', '2016-02-01 16:00'), 
                               (1, '2016-02-02 11:00', '2016-02-02 13:00'),  
                               (2, '2016-01-01 10:00', '2016-01-01 16:00'), 
                               (2, '2016-01-02 08:00', '2016-01-02 14:00'), 
                               (2, '2016-02-01 14:00', '2016-02-01 19:00'),  
                               (2, '2016-02-02 13:00', '2016-02-02 16:00')

Initial Query

The initial query attempts to join the three tables and retrieve the required data for employee ID 1:

SELECT 
    workingHours.employeeID, 
    employeeName,
    startdate, enddate,
    DATEDIFF(HOUR, startdate, enddate) AS 'Hours Worked',
    rate AS 'Hourly Rate'
FROM 
    hourlyRates, workingHours, employees_list
WHERE 
    hourlyRates.employeeID = workingHours.employeeID
    AND employees_list.employeeID = workingHours.employeeID
    AND workingHours.employeeID = 1

Problem with the Initial Query

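The query returns duplicated results because the only join condition between hourlyRates and workingHours is employeeID. Every shift for an employee is therefore paired with every rate row for that employee: employee 1 has four shifts and two rates, so the query returns eight rows instead of four. The join also needs a date condition that ties each shift to the rate that applied when it was worked.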
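To see the multiplication directly, the following sketch counts how many rate rows each shift is paired with when employeeID is the only join condition. It uses only the sample tables defined above; the alias matchingRates is an illustrative name.

```sql
-- Count the rate rows paired with each shift of employee 1
-- when the join uses employeeID alone.
SELECT wh.startdate,
    COUNT(*) AS matchingRates
FROM workingHours wh
    JOIN hourlyRates hr ON hr.employeeID = wh.employeeID
WHERE wh.employeeID = 1
GROUP BY wh.startdate
ORDER BY wh.startdate
```

With the sample data, each of the four shifts is paired with both of employee 1's rate rows, so matchingRates is 2 on every line.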

Modified Query with Additional Join Conditions

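To fix the duplication issue, we add a join condition on the rate date: rateDate must fall between the first day of the shift's start month and the first day of its end month. For a shift that starts and ends in the same month, this selects the rate dated on the first of that month: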

SELECT 
    workingHours.employeeID, 
    employeeName,
    startdate, enddate,
    DATEDIFF(HOUR, startdate, enddate) AS 'Hours Worked',
    rate AS 'Hourly Rate'
FROM 
    hourlyRates, workingHours, employees_list
WHERE 
    hourlyRates.employeeID = workingHours.employeeID
    AND employees_list.employeeID = workingHours.employeeID
    AND (hourlyRates.rateDate
        BETWEEN 
        DATEFROMPARTS(DATEPART(YEAR, workingHours.startdate), DATEPART(MONTH, workingHours.startdate), 1) 
        AND DATEFROMPARTS(DATEPART(YEAR, workingHours.enddate), DATEPART(MONTH, workingHours.enddate), 1)) 
    AND workingHours.employeeID = 1

Limitations of the Modified Query

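Although the modified query removes the duplicates for the sample data, it has some limitations:

  • It relies on a rate row dated on the first day of every month in which work was done. Shifts in a month without such a row are dropped from the results, and a rate dated mid-month is never matched.
  • A shift that spans a month boundary can match two rate rows and be duplicated again.
  • The DATEFROMPARTS expressions are evaluated for every workingHours row that is joined.
  • BETWEEN is not the problem for SARGability; the function calls are. Because startdate and enddate are wrapped in DATEFROMPARTS, the predicate cannot be satisfied by an index seek on those workingHours columns. The bare hourlyRates.rateDate column can still be searched through an index.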
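One way to sidestep the first-of-the-month dependency, sketched here as an alternative rather than as the approach this article follows, is to look up the most recent rate dated on or before each shift's start. It assumes that rateDate is the date a rate takes effect and that the rate stays in force until the next row for the same employee, which the sample data suggests but the schema does not state. The index name IX_hourlyRates_employee_date is hypothetical.

```sql
-- Hypothetical supporting index so the lookup can seek on (employeeID, rateDate).
CREATE INDEX IX_hourlyRates_employee_date
    ON hourlyRates (employeeID, rateDate DESC)
    INCLUDE (rate)

SELECT wh.employeeID,
    el.employeeName,
    wh.startdate,
    wh.enddate,
    DATEDIFF(HOUR, wh.startdate, wh.enddate) AS [Hours Worked],
    hr.rate AS [Hourly Rate]
FROM workingHours wh
    JOIN employees_list el ON el.employeeID = wh.employeeID
    -- For each shift, pick the single latest rate row dated on or before
    -- the shift's start date (assumption: rateDate is an effective date).
    CROSS APPLY (
        SELECT TOP (1) r.rate
        FROM hourlyRates r
        WHERE r.employeeID = wh.employeeID
            AND r.rateDate <= CAST(wh.startdate AS date)
        ORDER BY r.rateDate DESC
    ) hr
WHERE wh.employeeID = 1
```

For employee 1 this returns four rows: the two January shifts at rate 28 and the two February shifts at rate 39. CROSS APPLY drops a shift that has no earlier rate row; OUTER APPLY would keep it with a NULL rate instead.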

Optimizing the Query using SARGable Join Conditions

To avoid recomputing the month boundaries inside the join, we can compute them once and store them, either in a temporary table or in indexed columns. The join predicate then compares plain columns. Here’s an example using a temporary table:

-- Step 1: create the temporary table that will hold the pre-computed month boundaries.
CREATE TABLE #TempWorkingHours (
    employeeID int,
    startDateYearMonth date,
    endDateMonthYearMonth date,
    startdate datetime,
    enddate datetime
)

-- Step 2: compute the first day of the start and end month once per shift.
INSERT INTO #TempWorkingHours (employeeID, startDateYearMonth, endDateMonthYearMonth, startdate, enddate)
SELECT wh.employeeID,
    DATEFROMPARTS(DATEPART(YEAR, wh.startdate), DATEPART(MONTH, wh.startdate), 1) AS startDateYearMonth,
    DATEFROMPARTS(DATEPART(YEAR, wh.enddate), DATEPART(MONTH, wh.enddate), 1) AS endDateMonthYearMonth,
    wh.startdate,
    wh.enddate
FROM workingHours wh

-- Step 3: join on the pre-computed columns; no function calls remain in the join predicate.
SELECT el.employeeID,
    el.employeeName,
    twh.startdate,
    twh.enddate,
    DATEDIFF(HOUR, twh.startdate, twh.enddate) AS [Hours Worked],
    hr.rate AS [Hourly Rate]
FROM employees_list el 
    JOIN hourlyRates hr ON el.employeeID = hr.employeeID
    JOIN #TempWorkingHours twh ON el.employeeID = twh.employeeID
        AND (hr.rateDate
            BETWEEN 
            twh.startDateYearMonth 
            AND twh.endDateMonthYearMonth
        ) 
WHERE el.employeeID = 1

-- Step 4: clean up.
DROP TABLE #TempWorkingHours

Using New Join Syntax and Optimizing for SARGability

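The comma-separated FROM list used in the earlier queries can be replaced with the newer explicit JOIN ... ON syntax (standard since SQL-92), which keeps each join condition next to the table it applies to and separates join logic from filtering: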

SELECT el.employeeID,
    el.employeeName,
    wh.startdate,
    wh.enddate,
    DATEDIFF(HOUR, wh.startdate, wh.enddate) AS [Hours Worked],
    hr.rate AS [Hourly Rate]
FROM employees_list el 
    JOIN hourlyRates hr ON el.employeeID = hr.employeeID
    JOIN workingHours wh ON hr.employeeID = wh.employeeID 
        AND (hr.rateDate
            BETWEEN 
            DATEFROMPARTS(DATEPART(YEAR, wh.startdate), DATEPART(MONTH, wh.startdate), 1) 
            AND DATEFROMPARTS(DATEPART(YEAR, wh.enddate), DATEPART(MONTH, wh.enddate), 1)
        ) 
WHERE el.employeeID = 1

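This query uses explicit JOIN syntax, which makes it harder to omit a join condition by accident. It returns the same rows as the modified query and still computes the month boundaries inline with DATEFROMPARTS, so the syntax change alone should not be expected to improve performance. The query itself does not create or reference any index; indexing pre-computed columns is a separate step.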
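As a rough sketch of what indexing pre-processed columns could look like, the month boundaries can be stored as persisted computed columns on workingHours and then indexed. The column names startMonth and endMonth and the index name are illustrative, not part of the original schema, and the sketch assumes SQL Server accepts the DATEFROMPARTS expression as deterministic (a requirement for PERSISTED).

```sql
-- Add persisted computed columns holding the first day of each shift's
-- start month and end month (hypothetical column names).
ALTER TABLE workingHours ADD
    startMonth AS DATEFROMPARTS(YEAR(startdate), MONTH(startdate), 1) PERSISTED,
    endMonth AS DATEFROMPARTS(YEAR(enddate), MONTH(enddate), 1) PERSISTED
-- GO is the batch separator in SSMS and sqlcmd; the new columns must exist
-- before the statements that reference them are compiled.
GO

-- Hypothetical index covering the join and the selected columns.
CREATE INDEX IX_workingHours_employee_months
    ON workingHours (employeeID, startMonth, endMonth)
    INCLUDE (startdate, enddate)
GO

-- The join predicate now compares plain columns on both sides.
SELECT el.employeeID,
    el.employeeName,
    wh.startdate,
    wh.enddate,
    DATEDIFF(HOUR, wh.startdate, wh.enddate) AS [Hours Worked],
    hr.rate AS [Hourly Rate]
FROM employees_list el
    JOIN hourlyRates hr ON el.employeeID = hr.employeeID
    JOIN workingHours wh ON hr.employeeID = wh.employeeID
        AND hr.rateDate BETWEEN wh.startMonth AND wh.endMonth
WHERE el.employeeID = 1
```

Unlike the temporary table, the computed columns stay in sync with workingHours automatically, at the cost of extra storage and index maintenance on every insert or update. Whether the optimizer actually uses the index depends on data volume and statistics, so it is worth comparing execution plans before and after.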

Conclusion

In this article, we discussed how to avoid duplicated rows when displaying employee work details (start and end dates, hours worked, and hourly rate) in SQL Server. The duplicates came from joining hourlyRates to workingHours on employeeID alone, and adding a date condition removed them. We then looked at a temporary table that pre-computes the month boundaries, at explicit JOIN syntax, and at the DATEFROMPARTS date calculations the condition depends on. Understanding the limitations of each approach, in particular its assumptions about rate dates and its effect on SARGability, helps keep the results accurate and the query efficient.


Last modified on 2025-03-28