Find duplicate values in MySQL

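I have a table with a varchar column, and I would like to find all the records that have duplicate values in this column. What is the best query I can use to find the duplicates?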

Do a SELECT with a GROUP BY clause. Let's say name is the column you want to find duplicates in:

 SELECT name, COUNT(*) c FROM table GROUP BY name HAVING c > 1; 

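This returns one row per duplicated value, with the name value in the first column and the number of times that value occurs in the second.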
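
As a rough, self-contained illustration of the query above, here is a minimal Python sketch. It assumes SQLite (through the standard sqlite3 module) as a stand-in for MySQL; the table name people and the sample data are made up for the example.

```python
import sqlite3


def duplicate_names(names):
    """Return (name, count) pairs for every name that occurs more than once."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
    conn.execute("CREATE TABLE people (name TEXT)")  # hypothetical table
    conn.executemany("INSERT INTO people (name) VALUES (?)", [(n,) for n in names])
    rows = conn.execute(
        "SELECT name, COUNT(*) AS c FROM people "
        "GROUP BY name HAVING c > 1 ORDER BY name"
    ).fetchall()
    conn.close()
    return rows


print(duplicate_names(["ann", "bob", "ann", "cy", "bob", "ann"]))
# -> [('ann', 3), ('bob', 2)]
```

The ORDER BY is only there to make the output order predictable; it is not needed to find the duplicates.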

 SELECT varchar_col FROM table GROUP BY varchar_col HAVING count(*) > 1; 

 SELECT * FROM mytable mto WHERE EXISTS ( SELECT 1 FROM mytable mti WHERE mti.varchar_column = mto.varchar_column LIMIT 1, 1 ) 

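This query returns complete records, not just the distinct varchar_column values.

It does not use COUNT(*). If there are many duplicates, COUNT(*) is expensive, and you don't need the whole count; you only need to know whether there are two rows with the same value. LIMIT 1, 1 skips the first matching row, so EXISTS is true only when a second one exists.

Having an index on varchar_column will, of course, speed this query up greatly.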
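
To see the EXISTS approach return whole rows, here is a minimal sketch. It assumes SQLite (via Python's sqlite3 module) in place of MySQL; SQLite accepts the same LIMIT offset, count form. The id column, the index name and the sample rows are hypothetical.

```python
import sqlite3


def rows_with_duplicates(rows):
    """Return every full row whose varchar_column value occurs at least twice."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
    conn.execute(
        "CREATE TABLE mytable (id INTEGER PRIMARY KEY, varchar_column TEXT)"
    )
    # The index suggested in the answer; the index name is made up.
    conn.execute("CREATE INDEX idx_varchar_column ON mytable (varchar_column)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT * FROM mytable mto
        WHERE EXISTS (
            SELECT 1 FROM mytable mti
            WHERE mti.varchar_column = mto.varchar_column
            LIMIT 1, 1  -- skip the first match; succeed only if a second exists
        )
        ORDER BY mto.id
        """
    ).fetchall()
    conn.close()
    return result


print(rows_with_duplicates([(1, "a"), (2, "b"), (3, "a"), (4, "c"), (5, "a")]))
# -> [(1, 'a'), (3, 'a'), (5, 'a')]
```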

Building on levik's answer: to get the IDs of the duplicate rows, you can use GROUP_CONCAT if your server supports it (this returns a comma-separated list of IDs).

 SELECT GROUP_CONCAT(id), name, COUNT(*) c FROM documents GROUP BY name HAVING c > 1; 

Assuming your table is named TableABC, the column you want is Col, and the primary key of TableABC is Key.

 SELECT a.Key, b.Key, a.Col FROM TableABC a, TableABC b WHERE a.Col = b.Col AND a.Key <> b.Key 

The advantage of this approach over the answers above is that it returns the keys.
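
One thing worth knowing about the self-join: with <> every duplicate pair is reported twice, once in each order. A minimal sketch, assuming SQLite (via Python's sqlite3 module) instead of MySQL and hypothetical names table_abc, id and col, shows the difference between <> and <:

```python
import sqlite3


def duplicate_pairs(rows, once=False):
    """Return (id, id, col) for rows sharing a col value.

    With once=False the join uses <>, so each pair appears in both orders.
    With once=True it uses <, so each pair appears a single time.
    """
    op = "<" if once else "<>"
    conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
    conn.execute("CREATE TABLE table_abc (id INTEGER PRIMARY KEY, col TEXT)")
    conn.executemany("INSERT INTO table_abc VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT a.id, b.id, a.col FROM table_abc a "
        "JOIN table_abc b ON a.col = b.col AND a.id " + op + " b.id "
        "ORDER BY a.id, b.id"
    ).fetchall()
    conn.close()
    return result


sample = [(1, "x"), (2, "y"), (3, "x")]
print(duplicate_pairs(sample))             # -> [(1, 3, 'x'), (3, 1, 'x')]
print(duplicate_pairs(sample, once=True))  # -> [(1, 3, 'x')]
```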

To find which values in the name column of the employee table are duplicated, the following query is helpful:

 SELECT name FROM employee GROUP BY name HAVING COUNT(*) > 1; 

 SELECT * FROM `dps` WHERE pid IN (SELECT pid FROM `dps` GROUP BY pid HAVING COUNT(pid)>1) 

The following finds every product_id that is used more than once. You get only a single record for each product_id.

 SELECT product_id FROM oc_product_reward GROUP BY product_id HAVING count( product_id ) >1 

Code taken from: http://chandreshrana.blogspot.in/2014/12/find-duplicate-records-based-on-any.html

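I looked at the results above, and those queries work fine if you need to check a single column for duplicate values, for example email.

But if you need to check more columns and want to find duplicates of the combination of their values, this query will work: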

 SELECT COUNT(*) AS tot, name, email FROM users GROUP BY name, email HAVING tot > 1; -- lists each name/email combination that occurs more than once, with its count 
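
One caveat when checking several columns: grouping on a concatenated string such as CONCAT(name, email) can merge different combinations that happen to concatenate to the same text, whereas grouping on the columns separately cannot. A minimal sketch, assuming SQLite (via Python's sqlite3 module) in place of MySQL, with SQLite's || operator standing in for CONCAT and made-up sample data:

```python
import sqlite3


def duplicate_group_count(users, by_concat=False):
    """Count the groups reported as duplicated.

    by_concat=True groups on the concatenated string (name || email),
    by_concat=False groups on the two columns separately.
    """
    key = "name || email" if by_concat else "name, email"
    conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", users)
    groups = conn.execute(
        "SELECT COUNT(*) AS tot FROM users GROUP BY " + key + " HAVING tot > 1"
    ).fetchall()
    conn.close()
    return len(groups)


users = [
    ("ab", "c@x.org"),     # concatenates to "abc@x.org"
    ("a", "bc@x.org"),     # also concatenates to "abc@x.org"
    ("ann", "ann@x.org"),
    ("ann", "ann@x.org"),  # the only genuine duplicate
]
print(duplicate_group_count(users))                  # -> 1
print(duplicate_group_count(users, by_concat=True))  # -> 2
```

In MySQL, CONCAT also returns NULL when any argument is NULL, which is another reason to group on the columns themselves.
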
 SELECT t.*,(select count(*) from city as tt where tt.name=t.name) as count FROM `city` as t where ( select count(*) from city as tt where tt.name=t.name ) > 1 order by count desc 

Replace city with your table name, and replace name with your field name.

 SELECT DISTINCT a.email FROM `users` a LEFT JOIN `users` b ON a.email = b.email WHERE a.id != b.id; 

 CREATE TABLE tbl_master (`id` int, `email` varchar(15)); INSERT INTO tbl_master (`id`, `email`) VALUES (1, 'test1@gmail.com'), (2, 'test2@gmail.com'), (3, 'test1@gmail.com'), (4, 'test2@gmail.com'), (5, 'test5@gmail.com'); 

Query:

 SELECT id, email FROM tbl_master WHERE email IN (SELECT email FROM tbl_master GROUP BY email HAVING COUNT(id) > 1); 

My final query incorporated a few of the answers here that helped, combining GROUP BY, COUNT and GROUP_CONCAT.

 SELECT GROUP_CONCAT(id), `magento_simple`, COUNT(*) c FROM product_variant GROUP BY `magento_simple` HAVING c > 1; 

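This gives the ids of both examples (comma separated), the barcode I needed, and how many duplicates there are.

Change the table and columns accordingly.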

 SELECT ColumnA, COUNT(*) FROM Table GROUP BY ColumnA HAVING COUNT(*) > 1 

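To remove duplicate rows across multiple fields, first concatenate those fields into a new key that is unique to each distinct row, then use GROUP BY to drop the duplicate rows that share the same new key: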

 Create TEMPORARY table tmp select concat(f1,f2) as cfs,t1.* from mytable as t1; Create index x_tmp_cfs on tmp(cfs); Create table unduptable select f1,f2,... from tmp group by cfs; 
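
The same idea can be sketched without the concatenated key by grouping on the fields directly. The example below is a minimal illustration rather than the answer's exact method: it assumes SQLite (via Python's sqlite3 module) instead of MySQL, a hypothetical id column, only two fields f1 and f2, and it keeps the lowest id of each combination.

```python
import sqlite3


def deduplicate(rows):
    """Copy mytable into unduptable, keeping one row per (f1, f2) combination."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, f1 TEXT, f2 TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    # Grouping on the fields themselves replaces the concatenated key.
    conn.execute(
        "CREATE TABLE unduptable AS "
        "SELECT MIN(id) AS id, f1, f2 FROM mytable GROUP BY f1, f2"
    )
    result = conn.execute("SELECT id, f1, f2 FROM unduptable ORDER BY id").fetchall()
    conn.close()
    return result


rows = [(1, "a", "x"), (2, "a", "x"), (3, "a", "y"), (4, "b", "x"), (5, "a", "y")]
print(deduplicate(rows))
# -> [(1, 'a', 'x'), (3, 'a', 'y'), (4, 'b', 'x')]
```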

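A very late contribution, in case it helps anyone down the line. I had a task to find matching pairs of transactions in a banking application (actually both sides of account-to-account transfers), to identify which ones were the "from" and the "to" of each transfer, so we ended up with this: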

 SELECT LEAST(primaryid, secondaryid) AS transactionid1, GREATEST(primaryid, secondaryid) AS transactionid2 FROM ( SELECT table1.transactionid AS primaryid, table2.transactionid AS secondaryid FROM financial_transactions table1 INNER JOIN financial_transactions table2 ON table1.accountid = table2.accountid AND table1.transactionid <> table2.transactionid AND table1.transactiondate = table2.transactiondate AND table1.sourceref = table2.destinationref AND table1.amount = (0 - table2.amount) ) AS DuplicateResultsTable GROUP BY transactionid1 ORDER BY transactionid1; 

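The result is that DuplicateResultsTable provides rows containing the matching (i.e. duplicate) transactions, but it also returns the same transaction ids a second time when it matches the same pair from the other side. The outer SELECT is there to group by the first transaction ID. LEAST and GREATEST make sure the two transactionids always appear in the same order in the results, which makes it safe to GROUP by the first one and so eliminates all the duplicate matches. It ran through nearly a million records and identified more than 12,000 matches in just under 2 seconds. Of course, transactionid is the primary index, which really helped.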
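
The pair-ordering step can be shown in isolation. This is a minimal sketch under stated assumptions: SQLite (via Python's sqlite3 module) stands in for MySQL, its two-argument scalar MIN and MAX play the role of LEAST and GREATEST, and the matches table with its sample ids is hypothetical. Unlike the query above, it groups on both ordered ids.

```python
import sqlite3


def unique_pairs(pairs):
    """Collapse mirrored pairs such as (1, 2) and (2, 1) into a single row."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
    conn.execute("CREATE TABLE matches (primaryid INTEGER, secondaryid INTEGER)")
    conn.executemany("INSERT INTO matches VALUES (?, ?)", pairs)
    rows = conn.execute(
        # Two-argument MIN/MAX are scalar in SQLite, like LEAST/GREATEST in MySQL.
        "SELECT MIN(primaryid, secondaryid) AS transactionid1, "
        "MAX(primaryid, secondaryid) AS transactionid2 "
        "FROM matches "
        "GROUP BY transactionid1, transactionid2 "
        "ORDER BY transactionid1"
    ).fetchall()
    conn.close()
    return rows


print(unique_pairs([(1, 2), (2, 1), (5, 3), (3, 5)]))
# -> [(1, 2), (3, 5)]
```

Grouping on both ids keeps the query valid even if one transaction could match more than one partner, which grouping on the first id alone would hide.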

 Select column_name, column_name1,column_name2, count(1) as temp from table_name group by column_name having temp > 1 

Taking @maxyfc's answer further, I needed to find all of the rows that have the duplicate values, so I could edit them in MySQL Workbench:

 SELECT * FROM table WHERE field IN ( SELECT field FROM table GROUP BY field HAVING count(*) > 1 ) ORDER BY field 

 SELECT name, COUNT(name) AS times FROM yourtable GROUP BY name HAVING times > 1 