Turning a comma-separated string into individual rows
I have a SQL table like this:
| SomeID       | OtherID     | Data
+--------------+-------------+-------------------
| abcdef-..... | cdef123-... | 18,20,22
| abcdef-..... | 4554a24-... | 17,19
| 987654-..... | 12324a2-... | 13,19,20
Is there a query, something like SELECT OtherID, SplitData WHERE SomeID = 'abcdef-.......', that returns individual rows, like this:
| OtherID     | SplitData
+-------------+-------------------
| cdef123-... | 18
| cdef123-... | 20
| cdef123-... | 22
| 4554a24-... | 17
| 4554a24-... | 19
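Basically, I want to split my data at the commas into individual rows.
I am aware that storing a comma-separated string in a relational database sounds dumb, but it is really helpful for the normal use case in the consumer application.
I don't want to do the split in the application because I need paging, so I wanted to explore the options before refactoring the whole app.
This is SQL Server 2008 (non-R2).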
You can use the wonderful recursive functions from SQL Server:
Sample table
    create table Testdata(SomeID int, OtherID int, Data varchar(max))
    insert Testdata select 1, 9, '18,20,22'
    insert Testdata select 2, 8, '17,19'
    insert Testdata select 3, 7, '13,19,20'
    insert Testdata select 4, 6, ''
The query
    ;with tmp(SomeID, OtherID, DataItem, Data) as (
        select SomeID, OtherID,
               LEFT(Data, CHARINDEX(',', Data + ',') - 1),
               STUFF(Data, 1, CHARINDEX(',', Data + ','), '')
        from Testdata
        union all
        select SomeID, OtherID,
               LEFT(Data, CHARINDEX(',', Data + ',') - 1),
               STUFF(Data, 1, CHARINDEX(',', Data + ','), '')
        from tmp
        where Data > ''
    )
    select SomeID, OtherID, DataItem
    from tmp
    order by SomeID
    -- OPTION (maxrecursion 0)
    -- normally recursion is limited to 100. If you know you have very long
    -- strings, uncomment the option
Output
    SomeID | OtherID | DataItem
    -------+---------+---------
    1      | 9       | 18
    1      | 9       | 20
    1      | 9       | 22
    2      | 8       | 17
    2      | 8       | 19
    3      | 7       | 13
    3      | 7       | 19
    3      | 7       | 20
    4      | 6       |
    9      | 11      | 1
    9      | 11      | 2
    9      | 11      | 3
    9      | 11      | 4
    etc.
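To see how the recursion peels off one item per step, the same two expressions can be run against a single literal. This is only a sketch: the literal '18,20,22' and the column alias Remaining are made up for illustration, and the literal is cast to varchar(max) so that the anchor and recursive members have matching types.

```sql
;WITH tmp(DataItem, Data) AS (
    -- Anchor: the first item, plus whatever follows the first comma
    SELECT LEFT(s.Data, CHARINDEX(',', s.Data + ',') - 1),
           STUFF(s.Data, 1, CHARINDEX(',', s.Data + ','), '')
    FROM (SELECT CAST('18,20,22' AS varchar(max)) AS Data) AS s
    UNION ALL
    -- Recursive step: repeat on the remainder until nothing is left
    SELECT LEFT(Data, CHARINDEX(',', Data + ',') - 1),
           STUFF(Data, 1, CHARINDEX(',', Data + ','), '')
    FROM tmp
    WHERE Data > ''
)
SELECT DataItem, Data AS Remaining
FROM tmp;
```

Each row shows the item that was split off and the text still waiting to be processed. Appending a comma inside CHARINDEX is what lets the last item, which has no trailing comma, be handled by the same expression.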
Check this:
    SELECT A.OtherID,
           Split.a.value('.', 'VARCHAR(100)') AS Data
    FROM (
        SELECT OtherID,
               CAST('<M>' + REPLACE(Data, ',', '</M><M>') + '</M>' AS XML) AS Data
        FROM Table1
    ) AS A
    CROSS APPLY Data.nodes('/M') AS Split(a);
Finally, the wait is over with SQL Server 2016. They have introduced the split string function, STRING_SPLIT:
    select OtherID, cs.Value --SplitData
    from yourtable
    cross apply STRING_SPLIT (Data, ',') cs
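All the other methods of splitting a string, such as XML, tally tables, WHILE loops and so on, have been blown away by this STRING_SPLIT function.
Here is an excellent article with a performance comparison: Performance Surprises and Assumptions: STRING_SPLIT.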
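Since the question asks for filtering and paging, here is a rough sketch of how STRING_SPLIT might be combined with both. The table variable @Sample and its rows are hypothetical stand-ins for the question's table, the page size of 3 is arbitrary, and STRING_SPLIT is only available when the database compatibility level is 130 or higher.

```sql
-- Hypothetical sample data shaped like the table in the question
DECLARE @Sample TABLE (SomeID int, OtherID int, Data varchar(100));
INSERT INTO @Sample VALUES
    (1, 9, '18,20,22'),
    (1, 8, '17,19'),
    (2, 7, '13,19,20');

SELECT s.OtherID, cs.value AS SplitData
FROM @Sample AS s
CROSS APPLY STRING_SPLIT(s.Data, ',') AS cs
WHERE s.SomeID = 1
ORDER BY s.OtherID, cs.value                 -- paging needs a deterministic order
OFFSET 0 ROWS FETCH NEXT 3 ROWS ONLY;        -- first page of 3 rows
```

Note that cs.value is a string, so the ORDER BY above sorts it as text; cast it to int first if numeric ordering matters.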
For older versions, here is a split string function that uses a tally table (the best possible approach):
    CREATE FUNCTION [dbo].[DelimitedSplit8K]
            (@pString VARCHAR(8000), @pDelimiter CHAR(1))
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
    --===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
         -- enough to cover VARCHAR(8000)
      WITH E1(N) AS (
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
                    ),                          --10E+1 or 10 rows
           E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
           E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
     cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                         -- for both a performance gain and prevention of accidental "overruns"
                     SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
                    ),
    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
                     SELECT 1 UNION ALL
                     SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
                    ),
    cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
                     SELECT s.N1,
                            ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
                       FROM cteStart s
                    )
    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
     SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
            Item       = SUBSTRING(@pString, l.N1, l.L1)
       FROM cteLen l
    ;
Referred from Tally OH! An Improved SQL 8K "CSV Splitter" Function.
    -- assumes a dbo.Split function whose output column is named Code
    select t.OtherID, x.Code
    from testData t
    cross apply (select Code from dbo.Split(t.Data, ',')) x
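As of Feb 2016 - see the TALLY table example - which is very likely to outperform my TVF below, from Feb 2014. Keeping the original post below for posterity:
There is too much repeated code for my liking in the examples above, and I dislike the performance of CTEs and XML. Also, the function returns an explicit Id so that consumers who care about item order can specify an ORDER BY clause.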
    CREATE FUNCTION dbo.Split
    (
        @Line nvarchar(MAX),
        @SplitOn nvarchar(5) = ','
    )
    RETURNS @RtnValue table
    (
        Id INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED,
        Data nvarchar(100) NOT NULL
    )
    AS
    BEGIN
        IF @Line IS NULL RETURN

        DECLARE @split_on_len INT = LEN(@SplitOn)
        DECLARE @start_at INT = 1
        DECLARE @end_at INT
        DECLARE @data_len INT

        WHILE 1=1
        BEGIN
            -- Find the next delimiter; 0 means this is the last item
            SET @end_at = CHARINDEX(@SplitOn,@Line,@start_at)
            SET @data_len = CASE @end_at WHEN 0 THEN LEN(@Line) ELSE @end_at-@start_at END
            INSERT INTO @RtnValue (data) VALUES( SUBSTRING(@Line,@start_at,@data_len) );
            IF @end_at = 0 BREAK;
            -- Continue right after the delimiter just found
            SET @start_at = @end_at + @split_on_len
        END

        RETURN
    END
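A short usage sketch, assuming the dbo.Split function from this post has already been created; the input literal is made up for illustration. Because rows are inserted in the order they are found, the identity column Id tracks each item's position in the input, so the caller can ask for that order explicitly.

```sql
-- Assumes dbo.Split (defined in the post above) exists in the current database
SELECT s.Id, s.Data
FROM dbo.Split(N'18,20,22', N',') AS s
ORDER BY s.Id;   -- Id follows the position of each item in the input string
```

Without the ORDER BY, SQL Server makes no promise about the order in which the rows come back, which is the reason for exposing Id in the first place.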
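Nice to see that this has been solved in the 2016 version, but for everyone who is not on that version, here are two generalized and simplified versions of the methods above.
The XML method is shorter, but of course it requires a string that the XML trick can handle (no "bad" characters).
XML method: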
    create function dbo.splitString(@input Varchar(max), @Splitter VarChar(99))
    returns table as
    Return
        SELECT Split.a.value('.', 'VARCHAR(max)') AS Data
        FROM (
            SELECT CAST('<M>' + REPLACE(@input, @Splitter, '</M><M>') + '</M>' AS XML) AS Data
        ) AS A
        CROSS APPLY Data.nodes('/M') AS Split(a);
Recursive method:
    create function dbo.splitString(@input Varchar(max), @Splitter Varchar(99))
    returns table as
    Return
        with tmp (DataItem, ix) as
        (
            select @input, CHARINDEX('', @Input)  --Recu. start, ignored val to get the types right
            union all
            select Substring(@input, ix+1, ix2-ix-1), ix2
            from (Select *, CHARINDEX(@Splitter, @Input+@Splitter, ix+1) ix2 from tmp) x
            where ix2 <> 0
        )
        select DataItem from tmp where ix <> 0
The function in action:
    Create table TEST_X (A int, CSV Varchar(100));
    Insert into test_x select 1, 'A,B';
    Insert into test_x select 2, 'C,D';

    Select A, data from TEST_X x cross apply dbo.splitString(x.CSV, ',') Y;

    Drop table TEST_X
XML method 2: Unicode friendly 😀 (addition courtesy of Max Hodges)

    create function dbo.splitString(@input nVarchar(max), @Splitter nVarchar(99))
    returns table as
    Return
        SELECT Split.a.value('.', 'NVARCHAR(max)') AS Data
        FROM (
            SELECT CAST('<M>' + REPLACE(@input, @Splitter, '</M><M>') + '</M>' AS XML) AS Data
        ) AS A
        CROSS APPLY Data.nodes('/M') AS Split(a);
    DECLARE @id_list VARCHAR(MAX) = '1234,23,56,576,1231,567,122,87876,57553,1216'
    DECLARE @table TABLE ( id VARCHAR(50) )
    DECLARE @x INT = 0
    DECLARE @firstcomma INT = 0
    DECLARE @nextcomma INT = 0

    SET @x = LEN(@id_list) - LEN(REPLACE(@id_list, ',', '')) + 1 -- number of ids in id_list

    WHILE @x > 0
    BEGIN
        SET @nextcomma = CASE WHEN CHARINDEX(',', @id_list, @firstcomma + 1) = 0
                              THEN LEN(@id_list) + 1
                              ELSE CHARINDEX(',', @id_list, @firstcomma + 1)
                         END
        INSERT INTO @table
        VALUES ( SUBSTRING(@id_list, @firstcomma + 1, (@nextcomma - @firstcomma) - 1) )
        SET @firstcomma = CHARINDEX(',', @id_list, @firstcomma + 1)
        SET @x = @x - 1
    END

    SELECT *
    FROM @table
When using this approach, you have to make sure that none of your values contains anything that would be illegal XML. – user1151923
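As a minimal sketch of what that precaution could look like, the markup characters can be escaped inline before the CAST. The input literal here is invented for illustration, and only `&`, `<` and `>` are handled; other characters that XML rejects, such as most control characters, would still make the CAST fail.

```sql
DECLARE @input varchar(max) = 'R&D,a<b,plain';

SELECT Split.a.value('.', 'varchar(max)') AS Data
FROM (
    SELECT CAST('<M>'
         + REPLACE(
               -- escape '&' first so the entities added next are not escaped twice
               REPLACE(REPLACE(REPLACE(@input, '&', '&amp;'), '<', '&lt;'), '>', '&gt;'),
               ',', '</M><M>')
         + '</M>' AS XML) AS Data
) AS A
CROSS APPLY Data.nodes('/M') AS Split(a);
```

The `value()` call returns the text content of each node, so the entities are turned back into the original characters without any extra step.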
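I always use the XML method. Make sure you use VALID XML. I have two functions to convert between valid XML and text. (I tend to strip out the carriage returns, as I don't usually need them.)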
    CREATE FUNCTION dbo.udf_ConvertTextToXML (@Text varchar(MAX))
    RETURNS varchar(MAX)
    AS
    BEGIN
        -- Strip line feeds and carriage returns
        SET @Text = REPLACE(@Text, CHAR(10), '')
        SET @Text = REPLACE(@Text, CHAR(13), '')
        -- Escape '&' first so the entities added below are not escaped again
        SET @Text = REPLACE(@Text, '&', '&amp;')
        SET @Text = REPLACE(@Text, '<', '&lt;')
        SET @Text = REPLACE(@Text, '>', '&gt;')
        SET @Text = REPLACE(@Text, '''', '&apos;')
        SET @Text = REPLACE(@Text, '"', '&quot;')
        RETURN @Text
    END
    GO

    CREATE FUNCTION dbo.udf_ConvertTextFromXML (@Text VARCHAR(MAX))
    RETURNS VARCHAR(max)
    AS
    BEGIN
        SET @Text = REPLACE(@Text, '&lt;', '<')
        SET @Text = REPLACE(@Text, '&gt;', '>')
        SET @Text = REPLACE(@Text, '&apos;', '''')
        SET @Text = REPLACE(@Text, '&quot;', '"')
        -- Unescape '&amp;' last so it cannot create new entities to decode
        SET @Text = REPLACE(@Text, '&amp;', '&')
        RETURN @Text
    END
The following works in SQL Server 2008:
    select *, ROW_NUMBER() OVER(order by items) as row#
    from (
        select 134 myColumn1, 34 myColumn2, 'd,c,k,e,f,g,h,a' comaSeperatedColumn
    ) myTable
    cross apply SPLIT (rtrim(comaSeperatedColumn), ',') splitedTable -- gives 'items' column
This returns every combination of the original table's columns with the "items" column of the split table.
    ;WITH tmp(SomeID, OtherID, DataItem, Data) as (
        SELECT SomeID, OtherID,
               LEFT(Data, CHARINDEX(',', Data + ',') - 1),
               STUFF(Data, 1, CHARINDEX(',', Data + ','), '')
        FROM Testdata
        WHERE Data > ''
    )
    SELECT SomeID, OtherID, Data
    FROM tmp
    ORDER BY SomeID
With only a tiny modification to the query above…