How do you delete duplicate rows while keeping one record in SQL Server?
1 Answer
A common approach in SQL Server is to use a CTE with ROW_NUMBER() to identify duplicates and delete all but one row.
Example: Keep One Record, Delete Duplicates
Suppose you have this table:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(100),
Email VARCHAR(100)
);
If duplicate rows are defined by the same Email, you can do:
-- Identify duplicates
WITH DuplicateRows AS
(
SELECT
EmployeeID,
ROW_NUMBER() OVER (
PARTITION BY Email
ORDER BY EmployeeID
) AS rn
FROM Employees
)
-- Delete duplicates, keep the first row
DELETE FROM DuplicateRows
WHERE rn > 1;
How it works
PARTITION BY Emailgroups rows with the same email.ROW_NUMBER()assigns:1to the row you want to keep.2,3, etc. to duplicate rows.
DELETE WHERE rn > 1removes all duplicates while retaining one record.
Keep the Latest Record Instead
If you want to keep the most recent row based on a date column:
WITH DuplicateRows AS
(
SELECT
EmployeeID,
ROW_NUMBER() OVER (
PARTITION BY Email
ORDER BY CreatedDate DESC
) AS rn
FROM Employees
)
DELETE FROM DuplicateRows
WHERE rn > 1;
This keeps the newest record for each email.
Preview Before Deleting
It's a good practice to inspect the rows first:
WITH DuplicateRows AS
(
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY Email
ORDER BY EmployeeID
) AS rn
FROM Employees
)
SELECT *
FROM DuplicateRows
WHERE rn > 1;
This shows exactly which rows would be deleted before you run the DELETE statement.
Alternative: Remove Completely Identical Rows
If duplicates are based on all columns except the primary key:
WITH DuplicateRows AS
(
SELECT
EmployeeID,
ROW_NUMBER() OVER (
PARTITION BY Name, Email
ORDER BY EmployeeID
) AS rn
FROM Employees
)
DELETE FROM DuplicateRows
WHERE rn > 1;
This keeps one copy of each (Name, Email) combination and deletes the rest.