2010-01-18

Find and/or Delete Duplicate Rows from SQL server

How do I identify duplicate rows?
Group the rows by CustName, Pincode. Rows having similar CustName and Pincode will have more than one rows in the grouping. So locate them using HAVING COUNT(*) > 1. If duplicate values are encountered, return the maximum ID for each duplicate row. Using the outer query, delete any ID returned by subquery.

1.SQL Server 2005 Solution:
-- Find Duplicate Rows
SELECT MAX(ID) as ID, CustName, Pincode FROM #Customers
GROUP BY CustName, Pincode
HAVING COUNT(*) > 1

-- Delete Duplicate Rows
DELETE FROM #Customers
WHERE ID IN
( SELECT MAX(ID) FROM #Customers
GROUP BY CustName, Pincode
HAVING COUNT(*) > 1)

2.-- Find Duplicate Rows
SELECT MAX(ID) as ID, CustName, Pincode FROM #Customers
GROUP BY CustName, Pincode
HAVING COUNT(*) > 1

-- Delete Duplicate Rows
DELETE FROM #Customers
WHERE ID IN
( SELECT MAX(ID) FROM #Customers
GROUP BY CustName, Pincode
HAVING COUNT(*) > 1)


Check out the full article:
http://code.msdn.microsoft.com/SQLExamples/Wiki/View.aspx?title=DuplicateRows

No comments: