SQL query to return the differences between two tables
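I am trying to compare two tables in SQL Server to validate some data. I want to return all the rows from both tables where the data is in one table but not the other. In essence, I want to show all the discrepancies. I need to check three pieces of data: first name, last name, and product.
I'm fairly new to SQL, and many of the solutions I have found seem overly complicated. I don't have to worry about NULLs.
I started by trying something like this: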
SELECT DISTINCT [First Name], [Last Name], [Product Name] FROM [Temp Test Data] WHERE ([First Name] NOT IN (SELECT [First Name] FROM [Real Data]))
I'm having trouble taking this further, though.
Thanks!
EDIT:
Based on the answer by @treaschf, I have been trying variations of the following query:
SELECT td.[First Name], td.[Last Name], td.[Product Name] FROM [Temp Test Data] td FULL OUTER JOIN [Data] AS d ON td.[First Name] = d.[First Name] AND td.[Last Name] = d.[Last Name] WHERE (d.[First Name] = NULL) AND (d.[Last Name] = NULL)
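But I keep getting 0 results back, even though I know there is at least one row in td that is not in d.
EDIT:
OK, I think I figured it out. At least in my few minutes of testing, it seems to work well enough.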
SELECT [First Name], [Last Name] FROM [Temp Test Data] AS td WHERE (NOT EXISTS (SELECT [First Name], [Last Name] FROM [Data] AS d WHERE ([First Name] = td.[First Name]) OR ([Last Name] = td.[Last Name])))
This basically tells me what is in my test data that is not in my real data, which is completely fine for what I need to do.
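For anyone reusing the NOT EXISTS pattern, it is worth checking how the two column comparisons are combined. The sketch below is not the asker's actual data: it uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, and the tables `test_data` and `real_data`, their columns, and the helper `missing_rows` are all hypothetical. It suggests that joining the comparisons with OR hides a row as soon as either name matches some real row, while AND only hides rows where both names match the same real row.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_data (first_name TEXT, last_name TEXT);
CREATE TABLE real_data (first_name TEXT, last_name TEXT);
INSERT INTO test_data VALUES ('Ann', 'Lee'), ('Bob', 'Lee'), ('Cy', 'Ng');
INSERT INTO real_data VALUES ('Ann', 'Lee');
""")

def missing_rows(connector):
    """Return test rows with no match in real_data.

    connector is 'AND' or 'OR' and joins the two column comparisons.
    """
    sql = f"""
        SELECT first_name, last_name
        FROM test_data AS td
        WHERE NOT EXISTS (
            SELECT 1 FROM real_data AS d
            WHERE d.first_name = td.first_name
              {connector} d.last_name = td.last_name
        )
        ORDER BY first_name
    """
    return conn.execute(sql).fetchall()

rows_and = missing_rows("AND")  # rows whose (first, last) pair is absent
rows_or = missing_rows("OR")    # rows where neither name appears at all
```

With this sample data, ('Bob', 'Lee') is reported as missing by the AND version but not by the OR version, because the last name 'Lee' alone matches an existing row.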
If you have tables A and B, both with column C, here are the records that are present in table A but not in B:
SELECT A.* FROM A LEFT JOIN B ON (A.C = B.C) WHERE B.C IS NULL
To get all the differences with a single query, a full join must be used, like this:
SELECT A.*, B.* FROM A FULL JOIN B ON (A.C = B.C) WHERE A.C IS NULL OR B.C IS NULL
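What you need to know in this case is that when a record can be found in A but not in B, the columns that come from B will be NULL. Similarly, for records that are present in B but not in A, the columns from A will be NULL.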
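The NULL behaviour described above can be tried on a tiny data set. This is a minimal sketch, assuming SQLite (via Python's sqlite3) as a stand-in for SQL Server and made-up values in tables A and B. Because older SQLite versions lack FULL JOIN, it runs the left join once in each direction instead. It also shows why a filter written as `= NULL` finds nothing under standard NULL semantics, while `IS NULL` does.

```python
import sqlite3

# Hypothetical data: 'x' exists only in A, 'z' only in B, 'y' in both.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (C TEXT);
CREATE TABLE B (C TEXT);
INSERT INTO A VALUES ('x'), ('y');
INSERT INTO B VALUES ('y'), ('z');
""")

# Rows of A with no partner in B: the B side of the join is NULL.
only_in_a = conn.execute(
    "SELECT A.C FROM A LEFT JOIN B ON A.C = B.C WHERE B.C IS NULL"
).fetchall()

# Same idea in the other direction.
only_in_b = conn.execute(
    "SELECT B.C FROM B LEFT JOIN A ON A.C = B.C WHERE A.C IS NULL"
).fetchall()

# Comparing with '= NULL' never evaluates to true, so no rows come back.
equals_null = conn.execute(
    "SELECT A.C FROM A LEFT JOIN B ON A.C = B.C WHERE B.C = NULL"
).fetchall()
```

Here `only_in_a` holds the 'x' row, `only_in_b` holds the 'z' row, and `equals_null` is empty, which matches the symptom of a query that returns 0 results even though unmatched rows exist.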
( SELECT * FROM table1 EXCEPT SELECT * FROM table2) UNION ALL ( SELECT * FROM table2 EXCEPT SELECT * FROM table1)
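I know this may not be a popular answer, but I agree with @Randy Minder about using a third-party tool when a more complex comparison is needed.
The specific case here is simple and does not need such a tool, but things get complicated quickly once you introduce more columns, databases on two servers, more complex comparison criteria, and so on.
There are many such tools, for example ApexSQL Data Diff or Quest Toad, and you can always use them in trial mode to get the job done.
If you want to find out which column values are different, you could use the Entity-Attribute-Value model: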
declare @Data1 xml, @Data2 xml select @Data1 = ( select * from (select * from Test1 except select * from Test2) as a for xml raw('Data') ) select @Data2 = ( select * from (select * from Test2 except select * from Test1) as a for xml raw('Data') ) ;with CTE1 as ( select T.C.value('../@ID', 'bigint') as ID, T.C.value('local-name(.)', 'nvarchar(128)') as Name, T.C.value('.', 'nvarchar(max)') as Value from @Data1.nodes('Data/@*') as T(C) ), CTE2 as ( select T.C.value('../@ID', 'bigint') as ID, T.C.value('local-name(.)', 'nvarchar(128)') as Name, T.C.value('.', 'nvarchar(max)') as Value from @Data2.nodes('Data/@*') as T(C) ) select isnull(C1.ID, C2.ID) as ID, isnull(C1.Name, C2.Name) as Name, C1.Value as Value1, C2.Value as Value2 from CTE1 as C1 full outer join CTE2 as C2 on C2.ID = C1.ID and C2.Name = C1.Name where not ( C1.Value is null and C2.Value is null or C1.Value is not null and C2.Value is not null and C1.Value = C2.Value )
SQL Fiddle example
To get all the differences between two tables, you can use this SQL query, as I do:
SELECT 'TABLE1-ONLY' AS SRC, T1.* FROM ( SELECT * FROM Table1 EXCEPT SELECT * FROM Table2 ) AS T1 UNION ALL SELECT 'TABLE2-ONLY' AS SRC, T2.* FROM ( SELECT * FROM Table2 EXCEPT SELECT * FROM Table1 ) AS T2 ;
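To see what the labelled EXCEPT query returns, here is a small sketch that runs the same statement against made-up data. It assumes SQLite through Python's sqlite3 module rather than SQL Server, and the column names and rows are hypothetical. The two tables share one identical row and differ in the product of another.

```python
import sqlite3

# Hypothetical rows: Ann's row is identical in both tables, Bob's differs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (first_name TEXT, last_name TEXT, product TEXT);
CREATE TABLE Table2 (first_name TEXT, last_name TEXT, product TEXT);
INSERT INTO Table1 VALUES ('Ann', 'Lee', 'Widget'), ('Bob', 'Ray', 'Gadget');
INSERT INTO Table2 VALUES ('Ann', 'Lee', 'Widget'), ('Bob', 'Ray', 'Gizmo');
""")

# Each EXCEPT keeps the rows of one table that have no exact copy in the
# other; the literal SRC column records which side the row came from.
diff = sorted(conn.execute("""
    SELECT 'TABLE1-ONLY' AS SRC, T1.*
    FROM (SELECT * FROM Table1 EXCEPT SELECT * FROM Table2) AS T1
    UNION ALL
    SELECT 'TABLE2-ONLY' AS SRC, T2.*
    FROM (SELECT * FROM Table2 EXCEPT SELECT * FROM Table1) AS T2
""").fetchall())
```

A row that changed in any column shows up twice, once per side, so a modified record appears as a TABLE1-ONLY and TABLE2-ONLY pair. Note that EXCEPT also removes duplicates within each side.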
Try this:
SELECT [First Name], [Last Name] FROM [Temp Test Data] AS td EXCEPTION JOIN [Data] AS d ON (d.[First Name] = td.[First Name] OR d.[Last Name] = td.[Last Name])
Simpler to read.
This will do the trick, similar to Tiago's solution, and it also returns the "source" table.
select [First Name], [Last Name], max(_tabloc) as _tabloc from ( select [First Name], [Last Name], 't1' as _tabloc from table1 union all select [First Name], [Last Name], 't2' as _tabloc from table2 ) v group by [First Name], [Last Name] having count(1)=1
The result will contain the differences between the tables, and the _tabloc column tells you which table each row came from.
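One thing to keep in mind with the count-based approach is that it assumes each name pair appears at most once per table. The sketch below uses hypothetical data and SQLite via Python's sqlite3 in place of SQL Server. It runs the same UNION ALL and GROUP BY query and includes one pair that is duplicated inside table1 to show the effect.

```python
import sqlite3

# Hypothetical data: ('Dee', 'Fox') appears twice in table1 and never in table2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (first_name TEXT, last_name TEXT);
CREATE TABLE table2 (first_name TEXT, last_name TEXT);
INSERT INTO table1 VALUES ('Ann', 'Lee'), ('Bob', 'Ray'),
                          ('Dee', 'Fox'), ('Dee', 'Fox');
INSERT INTO table2 VALUES ('Ann', 'Lee'), ('Cy', 'Ng');
""")

# Stack both tables, tag each row with its origin, and keep the name pairs
# that occur exactly once across the combined set.
diff = sorted(conn.execute("""
    SELECT first_name, last_name, MAX(_tabloc) AS _tabloc
    FROM (
        SELECT first_name, last_name, 't1' AS _tabloc FROM table1
        UNION ALL
        SELECT first_name, last_name, 't2' AS _tabloc FROM table2
    ) AS v
    GROUP BY first_name, last_name
    HAVING COUNT(*) = 1
""").fetchall())
```

Bob and Cy are reported with their source table, but the duplicated ('Dee', 'Fox') pair is silently dropped because its count is 2, even though it exists only in table1. If duplicates are possible, deduplicating each side first (for example with SELECT DISTINCT) is one way around this.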
A simple variation on @erikkallen's answer that shows which table each row is present in:
SELECT 'table1' AS source, * FROM (SELECT * FROM table1 EXCEPT SELECT * FROM table2) AS t1 UNION ALL SELECT 'table2' AS source, * FROM (SELECT * FROM table2 EXCEPT SELECT * FROM table1) AS t2
For a simple smoke test where you are trying to make sure two tables match, without worrying about column names:
--ensure tables have matching records
Select count (*) from tbl_A
Select count (*) from tbl_B
--create temp table of all records in both tables
Select * into #demo from tbl_A Union All Select * from tbl_B
--Distinct #demo records = Total #demo records/2 = Total tbl_A records = total tbl_B records
Select distinct * from #demo
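You can easily write a stored procedure to compare a batch of tables.
There are performance issues with the left join, as well as with the full join, on large data sets.
In my opinion this is the best solution: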
select [First Name], count(1) as e from (select * from [Temp Test Data] union all select * from [Temp Test Data 2]) as a group by [First Name] having count(1) = 1