How do you find duplicate records in a SQL Server table?
1 Answer
To find duplicate records in SQL Server, you typically use GROUP BY with
HAVING COUNT(*) > 1.
1. Find Duplicates Based on a Single Column
For example, to find duplicate emails:
SELECT
Email,
COUNT(*) AS DuplicateCount
FROM Employees
GROUP BY Email
HAVING COUNT(*) > 1;
Result:
| DuplicateCount | |
|---|---|
| john@example.com | 3 |
| jane@example.com | 2 |
2. Find Duplicates Based on Multiple Columns
For example, if duplicates are defined by the combination of FirstName and
LastName:
SELECT
FirstName,
LastName,
COUNT(*) AS DuplicateCount
FROM Employees
GROUP BY FirstName, LastName
HAVING COUNT(*) > 1;
3. View the Actual Duplicate Rows
To see all rows that participate in duplicates:
SELECT *
FROM Employees
WHERE Email IN
(
SELECT Email
FROM Employees
GROUP BY Email
HAVING COUNT(*) > 1
)
ORDER BY Email;
4. Find Duplicates Using ROW_NUMBER()
This method is useful when you later want to delete duplicates:
WITH DuplicateRows AS
(
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY Email
ORDER BY EmployeeID
) AS rn
FROM Employees
)
SELECT *
FROM DuplicateRows
WHERE rn > 1;
Rows with rn > 1 are duplicate records, while rn = 1 is the row that would typically be kept.
5. Find Completely Identical Rows
If you want to identify rows where all relevant columns match:
SELECT
FirstName,
LastName,
Email,
COUNT(*) AS DuplicateCount
FROM Employees
GROUP BY
FirstName,
LastName,
Email
HAVING COUNT(*) > 1;
This returns combinations of values that occur more than once in the table.