Finding duplicate values in a SQL table
It's easy to find duplicates with one field:
SELECT email, COUNT(email)
FROM users
GROUP BY email
HAVING COUNT(email) > 1
So if we have a table:
ID   NAME   EMAIL
1    John   asd@asd.com
2    Sam    asd@asd.com
3    Tom    asd@asd.com
4    Bob    bob@asd.com
5    Tom    asd@asd.com
This query flags John, Sam, Tom, and Tom, because they all have the same email.
However, what I want is to get the duplicates that share both the same email and the same name.
That is, I want to get "Tom", "Tom".
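The reason I need this: I made a mistake and allowed duplicate name and email values to be inserted. Now I need to remove or change the duplicates, so I need to find them first.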
SELECT name, email, COUNT(*)
FROM users
GROUP BY name, email
HAVING COUNT(*) > 1
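Simply group on both columns.
Note: the ANSI standard is to have all non-aggregated columns in the GROUP BY. MySQL allows you to avoid this, but the results are unpredictable:
- GROUP BY lname ORDER BY showing wrong results
- Which is the least expensive aggregate function in the absence of ANY() (see the comments in the accepted answer)
In MySQL, you need sql_mode=only_full_group_by to have such queries rejected.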
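To make the two-column grouping concrete, here is a minimal sketch that runs the same query against the sample table from the question. It assumes Python's standard-library sqlite3 module and an in-memory database; the function name find_name_email_duplicates is made up for this illustration.

```python
import sqlite3


def find_name_email_duplicates():
    """Return (name, email, count) for each name/email pair that occurs more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    # Sample rows copied from the question.
    conn.executemany(
        "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
        [
            (1, "John", "asd@asd.com"),
            (2, "Sam", "asd@asd.com"),
            (3, "Tom", "asd@asd.com"),
            (4, "Bob", "bob@asd.com"),
            (5, "Tom", "asd@asd.com"),
        ],
    )
    rows = conn.execute(
        """
        SELECT name, email, COUNT(*)
        FROM users
        GROUP BY name, email
        HAVING COUNT(*) > 1
        """
    ).fetchall()
    conn.close()
    return rows


print(find_name_email_duplicates())  # [('Tom', 'asd@asd.com', 2)]
```

Only the Tom rows come back, because they are the only rows where both name and email repeat.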
Try this:
DECLARE @YourTable TABLE (id int, name varchar(10), email varchar(50))

INSERT @YourTable VALUES (1,'John','John-email')
INSERT @YourTable VALUES (2,'John','John-email')
INSERT @YourTable VALUES (3,'fred','John-email')
INSERT @YourTable VALUES (4,'fred','fred-email')
INSERT @YourTable VALUES (5,'sam','sam-email')
INSERT @YourTable VALUES (6,'sam','sam-email')

SELECT name, email, COUNT(*) AS CountOf
FROM @YourTable
GROUP BY name, email
HAVING COUNT(*) > 1
OUTPUT:
name       email       CountOf
---------- ----------- -----------
John       John-email  2
sam        sam-email   2

(2 row(s) affected)
If you want the IDs of the dups, use this:
SELECT y.id, y.name, y.email
FROM @YourTable y
INNER JOIN (
    SELECT name, email, COUNT(*) AS CountOf
    FROM @YourTable
    GROUP BY name, email
    HAVING COUNT(*) > 1
) dt ON y.name = dt.name AND y.email = dt.email
OUTPUT:
id          name       email
----------- ---------- ------------
1           John       John-email
2           John       John-email
5           sam        sam-email
6           sam        sam-email

(4 row(s) affected)
To delete the duplicates, try:
DELETE d
FROM @YourTable d
INNER JOIN (
    SELECT y.id, y.name, y.email,
           ROW_NUMBER() OVER (PARTITION BY y.name, y.email ORDER BY y.name, y.email, y.id) AS RowRank
    FROM @YourTable y
    INNER JOIN (
        SELECT name, email, COUNT(*) AS CountOf
        FROM @YourTable
        GROUP BY name, email
        HAVING COUNT(*) > 1
    ) dt ON y.name = dt.name AND y.email = dt.email
) dt2 ON d.id = dt2.id
WHERE dt2.RowRank != 1

SELECT * FROM @YourTable
OUTPUT:
id          name       email
----------- ---------- --------------
1           John       John-email
3           fred       John-email
4           fred       fred-email
5           sam        sam-email

(4 row(s) affected)
Try this:
SELECT name, email FROM users GROUP BY name, email HAVING ( COUNT(*) > 1 )
If you want to delete the duplicates, here is a much simpler way to do it than finding even/odd rows in a triple sub-select:
SELECT u.id, u.name, u.email
FROM users u, users u2
WHERE u.name = u2.name AND u.email = u2.email AND u.id > u2.id
And so, to delete:
DELETE FROM users
WHERE id IN (
    SELECT u.id /*, u.name, u.email */
    FROM users u, users u2
    WHERE u.name = u2.name AND u.email = u2.email AND u.id > u2.id
)
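Much easier to read and understand, IMHO.
Note: the condition u.id > u2.id matches every row that has a duplicate with a lower id, so a single run should remove all but the lowest-id row in each group. Re-running the statement until no rows are deleted is a harmless way to confirm that.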
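How many passes the self-join DELETE needs is easy to check empirically. The sketch below is an illustration only: it assumes SQLite via Python's sqlite3 module, adds a hypothetical third Tom row (id 6) to the question's data, and repeats the DELETE until a pass removes nothing. The helper names are invented for this example.

```python
import sqlite3

# The self-join DELETE, with the selected column qualified as u.id.
DELETE_SQL = """
DELETE FROM users
WHERE id IN (
    SELECT u.id
    FROM users u, users u2
    WHERE u.name = u2.name AND u.email = u2.email AND u.id > u2.id
)
"""


def build_users():
    """Create an in-memory users table in which Tom appears three times."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
        [
            (1, "John", "asd@asd.com"),
            (2, "Sam", "asd@asd.com"),
            (3, "Tom", "asd@asd.com"),
            (4, "Bob", "bob@asd.com"),
            (5, "Tom", "asd@asd.com"),
            (6, "Tom", "asd@asd.com"),  # hypothetical extra duplicate
        ],
    )
    return conn


def delete_until_clean(conn):
    """Run the DELETE repeatedly; return the number of rows removed by each pass."""
    removed_per_pass = []
    while True:
        removed = conn.execute(DELETE_SQL).rowcount
        removed_per_pass.append(removed)
        if removed == 0:
            return removed_per_pass


conn = build_users()
print(delete_until_clean(conn))  # [2, 0]
print([row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")])  # [1, 2, 3, 4]
```

In this run the first pass removes both extra Tom rows and the second pass finds nothing left to delete. Other database engines may restrict deleting from a table that the subquery also reads, so treat this as a SQLite-specific experiment.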
Try the following:
SELECT *
FROM (
    SELECT Id, Name, Age, Comments,
           Row_Number() OVER (PARTITION BY Name, Age ORDER BY Name) AS Rank
    FROM Customers
) AS B
WHERE Rank > 1
SELECT name, email
FROM users
WHERE email IN (
    SELECT email
    FROM users
    GROUP BY email
    HAVING COUNT(*) > 1
)
A bit late to the party, but I found a really cool workaround for finding all the duplicate IDs:
SELECT GROUP_CONCAT( id ) FROM users GROUP BY email HAVING ( COUNT(email) > 1 )
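GROUP_CONCAT is a MySQL function; SQLite happens to provide an aggregate with the same name, so the idea can be tried locally. This is a rough sketch under that assumption, using the question's sample rows. The order of ids inside the concatenated string is not guaranteed, so the hypothetical helper below sorts them in Python.

```python
import sqlite3


def duplicate_ids_by_email():
    """Map each email that occurs more than once to the sorted list of ids using it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
        [
            (1, "John", "asd@asd.com"),
            (2, "Sam", "asd@asd.com"),
            (3, "Tom", "asd@asd.com"),
            (4, "Bob", "bob@asd.com"),
            (5, "Tom", "asd@asd.com"),
        ],
    )
    rows = conn.execute(
        """
        SELECT email, GROUP_CONCAT(id)
        FROM users
        GROUP BY email
        HAVING COUNT(email) > 1
        """
    ).fetchall()
    conn.close()
    # GROUP_CONCAT returns a comma-separated string such as '1,2,3,5'.
    return {email: sorted(int(i) for i in ids.split(",")) for email, ids in rows}


print(duplicate_ids_by_email())  # {'asd@asd.com': [1, 2, 3, 5]}
```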
If you work with Oracle, this way would be preferable:
create table my_users(id number, name varchar2(100), email varchar2(100));

insert into my_users values (1, 'John', 'asd@asd.com');
insert into my_users values (2, 'Sam', 'asd@asd.com');
insert into my_users values (3, 'Tom', 'asd@asd.com');
insert into my_users values (4, 'Bob', 'bob@asd.com');
insert into my_users values (5, 'Tom', 'asd@asd.com');
commit;

select *
from my_users
where rowid not in (select min(rowid) from my_users group by name, email);
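Try this code:
WITH CTE AS (
    SELECT Id, Name, Age, Comments,
           RN = ROW_NUMBER() OVER (PARTITION BY Name, Age ORDER BY ccn)
    FROM ccnmaster
)
-- A CTE applies to a single statement. To preview the rows first,
-- run the same CTE followed by: SELECT * FROM CTE WHERE RN > 1
DELETE FROM CTE WHERE RN > 1
This selects or deletes all duplicate records except one record from each group of duplicates. So the delete leaves all unique records plus one record from each group of duplicates.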
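The "keep one row per group" behaviour of ROW_NUMBER() can be sketched outside SQL Server as well. The example below assumes Python's sqlite3 module linked against SQLite 3.25 or later (the first release with window functions) and a hypothetical customers table; it orders each partition by id so that the lowest id is the row that survives.

```python
import sqlite3


def ids_ranked_after_first(rows):
    """Return ids of rows numbered 2 or higher within their (name, age) group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO customers (id, name, age) VALUES (?, ?, ?)", rows)
    ids = [
        row[0]
        for row in conn.execute(
            """
            SELECT id
            FROM (
                SELECT id,
                       ROW_NUMBER() OVER (PARTITION BY name, age ORDER BY id) AS rn
                FROM customers
            )
            WHERE rn > 1
            ORDER BY id
            """
        )
    ]
    conn.close()
    return ids


# Hypothetical sample data: three identical Ann/30 rows, one Bob, one Ann aged 31.
sample = [(1, "Ann", 30), (2, "Ann", 30), (3, "Bob", 41), (4, "Ann", 30), (5, "Ann", 31)]
print(ids_ranked_after_first(sample))  # [2, 4]
```

Rows 2 and 4 are the surplus copies; row 1 is numbered 1 in its group and would be kept, as would the unique rows 3 and 5.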
To select the duplicates:
SELECT * FROM table WHERE id NOT IN ( SELECT MIN(id) FROM table GROUP BY column1, column2 );
To delete the duplicates:
DELETE FROM table WHERE id NOT IN ( SELECT MIN(id) FROM table GROUP BY column1, column2 );
Be careful with a large number of records; this can cause performance problems.
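As a quick check of the MIN(id) pattern, the following sketch runs both statements in SQLite through Python's sqlite3 module. The table name items and the sample values are placeholders chosen for this illustration.

```python
import sqlite3

# Rows whose id is not the smallest id of their (column1, column2) group.
NOT_MIN_ID = "id NOT IN (SELECT MIN(id) FROM items GROUP BY column1, column2)"


def select_then_delete_duplicates(rows):
    """Return (ids reported as duplicates, ids left after deleting them)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT)")
    conn.executemany("INSERT INTO items (id, column1, column2) VALUES (?, ?, ?)", rows)
    duplicates = [
        r[0] for r in conn.execute(f"SELECT id FROM items WHERE {NOT_MIN_ID} ORDER BY id")
    ]
    conn.execute(f"DELETE FROM items WHERE {NOT_MIN_ID}")
    remaining = [r[0] for r in conn.execute("SELECT id FROM items ORDER BY id")]
    conn.close()
    return duplicates, remaining


sample = [(1, "a", "x"), (2, "a", "x"), (3, "b", "y"), (4, "a", "x"), (5, "a", "z")]
print(select_then_delete_duplicates(sample))  # ([2, 4], [1, 3, 5])
```

The lowest id in each group is kept, so rows 2 and 4 are reported and removed while 1, 3, and 5 remain.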
If you want to see whether your table contains any duplicate rows, I use the queries below; if Total_Rows is larger than Distinct_Rows, the table has duplicate rows:
create table my_table(id int, name varchar(100), email varchar(100));

insert into my_table values (1, 'shekh', 'shekh@rms.com');
insert into my_table values (1, 'shekh', 'shekh@rms.com');
insert into my_table values (2, 'Aman', 'aman@rms.com');
insert into my_table values (3, 'Tom', 'tom@rms.com');
insert into my_table values (4, 'Raj', 'raj@rms.com');

Select COUNT(1) As Total_Rows from my_table;
Select Count(1) As Distinct_Rows from (Select Distinct * from my_table) abc;
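The comparison of total and distinct row counts can be reproduced with the same five rows. This is only a sketch, assuming SQLite through Python's sqlite3 module; the function name count_total_and_distinct is made up here.

```python
import sqlite3


def count_total_and_distinct():
    """Return (total rows, distinct rows) for the sample my_table data."""
    conn = sqlite3.connect(":memory:")
    # No primary key on id, so fully identical rows can be inserted.
    conn.execute("CREATE TABLE my_table (id INTEGER, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO my_table VALUES (?, ?, ?)",
        [
            (1, "shekh", "shekh@rms.com"),
            (1, "shekh", "shekh@rms.com"),
            (2, "Aman", "aman@rms.com"),
            (3, "Tom", "tom@rms.com"),
            (4, "Raj", "raj@rms.com"),
        ],
    )
    total = conn.execute("SELECT COUNT(1) FROM my_table").fetchone()[0]
    distinct = conn.execute(
        "SELECT COUNT(1) FROM (SELECT DISTINCT * FROM my_table) AS abc"
    ).fetchone()[0]
    conn.close()
    return total, distinct


total, distinct = count_total_and_distinct()
print(total, distinct)  # 5 4
print("duplicates present" if total > distinct else "no duplicates")
```

The difference between the two counts (here 1) is the number of surplus copies in the table.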
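How can we count the duplicated values? Whether a value is repeated twice or more than twice, just count them, not group by group.
As simple as: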
select COUNT(distinct col_01) from Table_01
select id,name,COUNT(*) from India group by Id,Name having COUNT(*)>1
This should also work; maybe give it a try.
Select * from Users a where EXISTS (Select * from Users b where ( a.name = b.name OR a.email = b.email) and a.ID != b.id)
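This is especially good in your case if you are searching for duplicates that differ by some kind of prefix or general change, e.g. a new domain in the email. Then you can use replace() on these columns.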
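One way to read the replace() suggestion is to normalise the changed part of the email on both sides of the comparison. The sketch below is an assumption about what the answer has in mind, not a confirmed recipe: it uses SQLite's REPLACE function through Python's sqlite3 module, and the domains @old.example and @new.example are invented for the illustration.

```python
import sqlite3


def find_loose_duplicates():
    """Return ids of users sharing a name, or an email once the new domain is mapped back."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
        [
            (1, "Tom", "tom@old.example"),
            (2, "Thomas", "tom@new.example"),  # same mailbox, new domain
            (3, "Bob", "bob@old.example"),
        ],
    )
    ids = [
        row[0]
        for row in conn.execute(
            """
            SELECT a.id
            FROM users a
            WHERE EXISTS (
                SELECT 1
                FROM users b
                WHERE (a.name = b.name
                       OR REPLACE(a.email, '@new.example', '@old.example')
                        = REPLACE(b.email, '@new.example', '@old.example'))
                  AND a.id != b.id
            )
            ORDER BY a.id
            """
        )
    ]
    conn.close()
    return ids


print(find_loose_duplicates())  # [1, 2]
```

Note that the OR makes this broader than the question's requirement: a row is reported when either the name or the normalised email matches another row.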
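This is the easy thing I've come up with. It uses a common table expression (CTE) and a partition window (I think these features are in SQL 2008 and later).
This example finds all students with a duplicate name and dob. The fields you want to check for duplication go in the OVER clause. You can include any other fields you want in the projection.
with cte (StudentId, Fname, LName, DOB, RowCnt) as (
    SELECT StudentId, FirstName, LastName, DateOfBirth as DOB,
           SUM(1) OVER (Partition By FirstName, LastName, DateOfBirth) as RowCnt
    FROM tblStudent
)
SELECT *
from CTE
where RowCnt > 1
ORDER BY DOB, LName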
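Unlike ROW_NUMBER(), a windowed SUM(1) attaches the group size to every row, so all members of a duplicated group are returned, including the first. A small sketch of that behaviour follows, assuming SQLite 3.25 or later through Python's sqlite3 module and a hypothetical students table with snake_case column names.

```python
import sqlite3


def students_in_duplicate_groups():
    """Return (student_id, group size) for rows whose name and birth date repeat."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE students ("
        "student_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, date_of_birth TEXT)"
    )
    conn.executemany(
        "INSERT INTO students VALUES (?, ?, ?, ?)",
        [
            (1, "Ann", "Lee", "2001-02-03"),
            (2, "Ann", "Lee", "2001-02-03"),
            (3, "Ann", "Lee", "2002-05-06"),  # same name, different birth date
            (4, "Bo", "Kim", "2000-01-01"),
        ],
    )
    rows = conn.execute(
        """
        WITH cte AS (
            SELECT student_id,
                   SUM(1) OVER (PARTITION BY first_name, last_name, date_of_birth) AS row_cnt
            FROM students
        )
        SELECT student_id, row_cnt
        FROM cte
        WHERE row_cnt > 1
        ORDER BY student_id
        """
    ).fetchall()
    conn.close()
    return rows


print(students_in_duplicate_groups())  # [(1, 2), (2, 2)]
```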
select name, email,
       case when ROW_NUMBER() over (partition by name, email order by name) > 1
            then 'Yes'
            else 'No'
       end "duplicated ?"
from users
select emp.ename, emp.empno, dept.loc
from emp
inner join dept on dept.deptno = emp.deptno
inner join (
    -- names that occur more than once within the same department
    select ename, deptno, count(*) as cnt
    from emp
    group by ename, deptno
    having count(*) > 1
) t on emp.ename = t.ename and emp.deptno = t.deptno
order by emp.ename
/
If you wish to find duplicate data (by one or several criteria) and select the actual rows:
with MYCTE as (
    SELECT DuplicateKey1,
           DuplicateKey2, -- optional
           count(*) X
    FROM MyTable
    group by DuplicateKey1, DuplicateKey2
    having count(*) > 1
)
SELECT E.*
FROM MyTable E
JOIN MYCTE cte
  ON E.DuplicateKey1 = cte.DuplicateKey1
 AND E.DuplicateKey2 = cte.DuplicateKey2
ORDER BY E.DuplicateKey1, E.DuplicateKey2, CreatedAt
http://developer.azurewebsites.net/2014/09/better-sql-group-by-find-duplicate-data/
SELECT id, COUNT(id) FROM table1 GROUP BY id HAVING COUNT(id)>1;
I think this will work properly for finding repeated values in a particular column.
SELECT * FROM users u where rowid = (select max(rowid) from users u1 where u.email=u1.email);
By using a CTE we can also find duplicate values, like this:
with MyCTE as (
    select Name, EmailId,
           ROW_NUMBER() over (PARTITION BY EmailId order by id) as Duplicate
    from [Employees]
)
select *
from MyCTE
where Duplicate > 1
SELECT FirstName, LastName, MobileNo, COUNT(1) as CNT FROM CUSTOMER GROUP BY FirstName, LastName, MobileNo HAVING COUNT(1) > 1;