在oracle中使用LISTAGG来返回不同的值

我正在尝试使用Oracle中的LISTAGG函数。 我想只获得该列的不同值。 有没有一种方法可以在不创build函数或过程的情况下获得唯一的值?

   col1 col2 Created_by
    1 2史密斯 
    1 2约翰 
    1 3 Ajay 
    1 4 Ram 
    1 5杰克 

我需要selectcol1和col2的LISTAGG (不考虑第3列)。 当我这样做,我得到这样的结果LISTAGG[2,2,3,4,5]

我需要在这里删除重复的“2” 我只需要col1与col1不同的值。

你的意思是这样的:

 select listagg(the_column, ',') within group (order by the_column) from ( select distinct the_column from the_table ) t 

如果你需要更多的列,像这样的东西可能是你正在寻找的东西:

 select col1, listagg(col2, ',') within group (order by col2) from ( select col1, col2, row_number() over (partition by col1, col2 order by col1) as rn from foo order by col1,col2 ) where rn = 1 group by col1; 

以下是如何解决您的问题。

 select regexp_replace( '2,2,2.1,3,3,3,3,4,4' ,'([^,]+)(,\1)*(,|$)', '\1\3') from dual 

回报

2,2.1,3,4

答案(见下面的注释):

 select col1, regexp_replace( listagg( col2 , ',') within group (order by col2) -- sorted ,'([^,]+)(,\1)*(,|$)', '\1\3') ) from tableX where rn = 1 group by col1; 

注意:上面的方法在大多数情况下都是有效的 – 列表应该sorting,您可能需要根据数据修剪所有尾随和前导空间。

如果你的组中有大量的项目大于20或者大的string大小,你可能会遇到oraclestring大小限制'string连接的结果太长'因此,在每个组中的成员放置一个最大数量。 这只有在只列出第一个成员时才有效。 如果你有很长的variablesstring,这可能无法正常工作。 你将不得不尝试。

 select col1, case when count(col2) < 100 then regexp_replace( listagg(col2, ',') within group (order by col2) ,'([^,]+)(,\1)*(,|$)', '\1\3') else 'Too many entries to list...' end from sometable where rn = 1 group by col1; 

另一个解决scheme(不是那么简单)希望避免oraclestring的大小限制 – string大小限制为4000.感谢这篇文章在这里由user3465996

 select col1 , dbms_xmlgen.convert( -- HTML decode dbms_lob.substr( -- limit size to 4000 chars ltrim( -- remove leading commas REGEXP_REPLACE(REPLACE( REPLACE( XMLAGG( XMLELEMENT("A",col2 ) ORDER BY col2).getClobVal(), '<A>',','), '</A>',''),'([^,]+)(,\1)*(,|$)', '\1\3'), ','), -- remove leading XML commas ltrim 4000,1) -- limit to 4000 string size , 1) -- HTML.decode as col2 from sometable where rn = 1 group by col1; 

一些testing案例 – FYI

 regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)+', '\1') -> 2.1,3,4 Fail regexp_replace('2 ,2 ,2.1,3 ,3 ,4 ,4 ','([^,]+)(,\1)+', '\1') -> 2 ,2.1,3,4 Success - fixed length items 

项目中包含的项目,例如。 2,21

 regexp_replace('2.1,1','([^,]+)(,\1)+', '\1') -> 2.1 Fail regexp_replace('2 ,2 ,2.1,1 ,3 ,4 ,4 ','(^|,)(.+)(,\2)+', '\1\2') -> 2 ,2.1,1 ,3 ,4 -- success - NEW regex regexp_replace('a,b,b,b,b,c','(^|,)(.+)(,\2)+', '\1\2') -> a,b,b,c fail! 

v3 – 正则expression式感谢伊戈尔! 工作的所有情况。

 select regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)*(,|$)', '\1\3') , ---> 2,2.1,3,4 works regexp_replace('2.1,1','([^,]+)(,\1)*(,|$)', '\1\3'), --> 2.1,1 works regexp_replace('a,b,b,b,b,c','([^,]+)(,\1)*(,|$)', '\1\3') ---> a,b,c works from dual 

你可以使用无证的wm_concat函数。

 select col1, wm_concat(distinct col2) col2_list from tab1 group by col1; 

这个函数返回clob列,如果你愿意,你可以使用dbms_lob.substr将clob转换为varchar2。

如果意图是将这个转换应用到多列,我已经扩展了a_horse_with_no_name的解决scheme:

 SELECT * FROM (SELECT LISTAGG(GRADE_LEVEL, ',') within group(order by GRADE_LEVEL) "Grade Levels" FROM (select distinct GRADE_LEVEL FROM Students) t) t1, (SELECT LISTAGG(ENROLL_STATUS, ',') within group(order by ENROLL_STATUS) "Enrollment Status" FROM (select distinct ENROLL_STATUS FROM Students) t) t2, (SELECT LISTAGG(GENDER, ',') within group(order by GENDER) "Legal Gender Code" FROM (select distinct GENDER FROM Students) t) t3, (SELECT LISTAGG(CITY, ',') within group(order by CITY) "City" FROM (select distinct CITY FROM Students) t) t4, (SELECT LISTAGG(ENTRYCODE, ',') within group(order by ENTRYCODE) "Entry Code" FROM (select distinct ENTRYCODE FROM Students) t) t5, (SELECT LISTAGG(EXITCODE, ',') within group(order by EXITCODE) "Exit Code" FROM (select distinct EXITCODE FROM Students) t) t6, (SELECT LISTAGG(LUNCHSTATUS, ',') within group(order by LUNCHSTATUS) "Lunch Status" FROM (select distinct LUNCHSTATUS FROM Students) t) t7, (SELECT LISTAGG(ETHNICITY, ',') within group(order by ETHNICITY) "Race Code" FROM (select distinct ETHNICITY FROM Students) t) t8, (SELECT LISTAGG(CLASSOF, ',') within group(order by CLASSOF) "Expected Graduation Year" FROM (select distinct CLASSOF FROM Students) t) t9, (SELECT LISTAGG(TRACK, ',') within group(order by TRACK) "Track Code" FROM (select distinct TRACK FROM Students) t) t10, (SELECT LISTAGG(GRADREQSETID, ',') within group(order by GRADREQSETID) "Graduation ID" FROM (select distinct GRADREQSETID FROM Students) t) t11, (SELECT LISTAGG(ENROLLMENT_SCHOOLID, ',') within group(order by ENROLLMENT_SCHOOLID) "School Key" FROM (select distinct ENROLLMENT_SCHOOLID FROM Students) t) t12, (SELECT LISTAGG(FEDETHNICITY, ',') within group(order by FEDETHNICITY) "Federal Race Code" FROM (select distinct FEDETHNICITY FROM Students) t) t13, (SELECT LISTAGG(SUMMERSCHOOLID, ',') within group(order by SUMMERSCHOOLID) "Summer School Key" FROM (select distinct SUMMERSCHOOLID FROM Students) t) t14, (SELECT LISTAGG(FEDRACEDECLINE, ',') within group(order by FEDRACEDECLINE) "Student Decl to Prov Race Code" FROM (select distinct FEDRACEDECLINE FROM Students) t) t15 

这是Oracle数据库11g企业版版本11.2.0.2.0 – 64位生产。
我无法使用STRAGG,因为没有办法DISTINCT和ORDER。

性能线性增长,这是很好的,因为我添加了所有感兴趣的列。 上面的77K行花了3秒钟。 只有一个汇总,.172秒。 我做的是有一种方法来分隔一个表格中的多个列。

我首先通过对值进行分组来克服这个问题,然后用listagg做另一个聚合。 像这样的东西:

 select a,b,listagg(c,',') within group(order by c) c, avg(d) from (select a,b,c,avg(d) from table group by (a,b,c)) group by (a,b) 

只有一个全表访问,相对容易扩展到更复杂的查询

为了解决string长度问题,您可以使用与listagg相似的XMLAGG ,但是它会返回一个clob。

您可以使用regexp_replaceparsing并获取唯一值,然后使用dbms_lob.substr()将其重新转换为string。 如果你有大量不同的值,你仍然会用这种方式,但是对于很多情况,下面的代码应该可以工作。

您也可以更改您使用的分隔符。 在我的情况下,我想' – '而不是','但是你应该能够代替我的代码中的破折号,如果你想要的话可​​以使用逗号。

 select col1, dbms_lob.substr(ltrim(REGEXP_REPLACE(REPLACE( REPLACE( XMLAGG( XMLELEMENT("A",col2) ORDER BY col2).getClobVal(), '<A>','-'), '</A>',''),'([^-]*)(-\1)+($|-)', '\1\3'),'-'), 4000,1) as platform_mix from table 

如果您希望跨MULTIPLE列的不同值,需要控制sorting顺序,不想使用可能会消失的未公开的函数,并且不希望进行多个全表扫描,则可能会发现此构造有用:

 with test_data as ( select 'A' as col1, 'T_a1' as col2, '123' as col3 from dual union select 'A', 'T_a1', '456' from dual union select 'A', 'T_a1', '789' from dual union select 'A', 'T_a2', '123' from dual union select 'A', 'T_a2', '456' from dual union select 'A', 'T_a2', '111' from dual union select 'A', 'T_a3', '999' from dual union select 'B', 'T_a1', '123' from dual union select 'B', 'T_b1', '740' from dual union select 'B', 'T_b1', '846' from dual ) select col1 , (select listagg(column_value, ',') within group (order by column_value desc) from table(collect_col2)) as col2s , (select listagg(column_value, ',') within group (order by column_value desc) from table(collect_col3)) as col3s from ( select col1 , collect(distinct col2) as collect_col2 , collect(distinct col3) as collect_col3 from test_data group by col1 ); 

如何创build一个专门的function,将使“独特”的部分:

 create or replace function listagg_distinct (t in str_t, sep IN VARCHAR2 DEFAULT ',') return VARCHAR2 as l_rc VARCHAR2(4096) := ''; begin SELECT listagg(val, sep) WITHIN GROUP (ORDER BY 1) INTO l_rc FROM (SELECT DISTINCT column_value val FROM table(t)); RETURN l_rc; end; / 

然后用它来做聚合:

 SELECT col1, listagg_distinct(cast(collect(col_2) as str_t ), ', ') FROM your_table GROUP BY col_1; 

有没有人想过使用PARTITION BY子句? 它在这个查询中为我工作得到应用程序服务和访问列表。

 SELECT DISTINCT T.APP_SVC_ID, LISTAGG(RTRIM(T.ACCESS_MODE), ',') WITHIN GROUP(ORDER BY T.ACCESS_MODE) OVER(PARTITION BY T.APP_SVC_ID) AS ACCESS_MODE FROM APP_SVC_ACCESS_CNTL T GROUP BY T.ACCESS_MODE, T.APP_SVC_ID 

我不得不删除我的NDA的where子句,但你明白了。

我认为这可以帮助 – CASE的列值为NULL,如果它是重复的 – 那么它不附加到LISTAGGstring:

 with test_data as ( select 1 as col1, 2 as col2, 'Smith' as created_by from dual union select 1, 2, 'John' from dual union select 1, 3, 'Ajay' from dual union select 1, 4, 'Ram' from dual union select 1, 5, 'Jack' from dual union select 2, 5, 'Smith' from dual union select 2, 6, 'John' from dual union select 2, 6, 'Ajay' from dual union select 2, 6, 'Ram' from dual union select 2, 7, 'Jack' from dual ) SELECT col1 , listagg(col2 , ',') within group (order by col2 ASC) AS orig_value, listagg(CASE WHEN rwn=1 THEN col2 END , ',') within group (order by col2 ASC) AS distinct_value from ( select row_number() over (partition by col1,col2 order by 1) as rwn, a.* from test_data a ) a GROUP BY col1 

结果是:

 COL1 ORIG DISTINCT 1 2,2,3,4,5 2,3,4,5 2 5,6,6,6,7 5,6,7 

listagg()忽略NULL值,因此在第一步中,您可以使用lag()函数分析前一个logging是否具有相同的值,如果是,则为NULL,否则为“新值”。

 WITH tab AS ( SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual UNION ALL SELECT 1 as col1, 2 as col2, 'John' as created_by FROM dual UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay' as created_by FROM dual UNION ALL SELECT 1 as col1, 4 as col2, 'Ram' as created_by FROM dual UNION ALL SELECT 1 as col1, 5 as col2, 'Jack' as created_by FROM dual ) SELECT col1 , CASE WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN NULL ELSE col2 END as col2_with_nulls , created_by FROM tab; 

结果

  COL1 COL2_WITH_NULLS CREAT ---------- --------------- ----- 1 2 Smith 1 John 1 3 Ajay 1 4 Ram 1 5 Jack 

请注意,第二个2被replace为NULL。 现在你可以用listagg()来包装一个SELECT。

 WITH tab AS ( SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual UNION ALL SELECT 1 as col1, 2 as col2, 'John' as created_by FROM dual UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay' as created_by FROM dual UNION ALL SELECT 1 as col1, 4 as col2, 'Ram' as created_by FROM dual UNION ALL SELECT 1 as col1, 5 as col2, 'Jack' as created_by FROM dual ) SELECT listagg(col2_with_nulls, ',') WITHIN GROUP (ORDER BY col2_with_nulls) col2_list FROM ( SELECT col1 , CASE WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN NULL ELSE col2 END as col2_with_nulls , created_by FROM tab ); 

结果

 COL2_LIST --------- 2,3,4,5 

您也可以在多个列上执行此操作。

 WITH tab AS ( SELECT 1 as col1, 2 as col2, 'Smith' as created_by FROM dual UNION ALL SELECT 1 as col1, 2 as col2, 'John' as created_by FROM dual UNION ALL SELECT 1 as col1, 3 as col2, 'Ajay' as created_by FROM dual UNION ALL SELECT 1 as col1, 4 as col2, 'Ram' as created_by FROM dual UNION ALL SELECT 1 as col1, 5 as col2, 'Jack' as created_by FROM dual ) SELECT listagg(col1_with_nulls, ',') WITHIN GROUP (ORDER BY col1_with_nulls) col1_list , listagg(col2_with_nulls, ',') WITHIN GROUP (ORDER BY col2_with_nulls) col2_list , listagg(created_by, ',') WITHIN GROUP (ORDER BY created_by) created_by_list FROM ( SELECT CASE WHEN lag(col1) OVER (ORDER BY col1) = col1 THEN NULL ELSE col1 END as col1_with_nulls , CASE WHEN lag(col2) OVER (ORDER BY col2) = col2 THEN NULL ELSE col2 END as col2_with_nulls , created_by FROM tab ); 

结果

 COL1_LIST COL2_LIST CREATED_BY_LIST --------- --------- ------------------------- 1 2,3,4,5 Ajay,Jack,John,Ram,Smith 

您可以通过RegExreplace。 这里是一个例子:

 -- Citations Per Year - Cited Publications main query. Includes list of unique associated core project numbers, ordered by core project number. SELECT ptc.pmid AS pmid, ptc.pmc_id, ptc.pub_title AS pubtitle, ptc.author_list AS authorlist, ptc.pub_date AS pubdate, REGEXP_REPLACE( LISTAGG ( ppcc.admin_phs_org_code || TO_CHAR(ppcc.serial_num,'FM000000'), ',') WITHIN GROUP (ORDER BY ppcc.admin_phs_org_code || TO_CHAR(ppcc.serial_num,'FM000000')), '(^|,)(.+)(,\2)+', '\1\2') AS projectNum FROM publication_total_citations ptc JOIN proj_paper_citation_counts ppcc ON ptc.pmid = ppcc.pmid AND ppcc.citation_year = 2013 JOIN user_appls ua ON ppcc.admin_phs_org_code = ua.admin_phs_org_code AND ppcc.serial_num = ua.serial_num AND ua.login_id = 'EVANSF' GROUP BY ptc.pmid, ptc.pmc_id, ptc.pub_title, ptc.author_list, ptc.pub_date ORDER BY pmid; 

还发布在这里: Oracle – 独特的Listagg值

使用像这样创build的listagg_clob函数:

 create or replace package list_const_p is list_sep varchar2(10) := ','; end list_const_p; / sho err create type listagg_clob_t as object( v_liststring varchar2(32767), v_clob clob, v_templob number, static function ODCIAggregateInitialize( sctx IN OUT listagg_clob_t ) return number, member function ODCIAggregateIterate( self IN OUT listagg_clob_t, value IN varchar2 ) return number, member function ODCIAggregateTerminate( self IN OUT listagg_clob_t, returnValue OUT clob, flags IN number ) return number, member function ODCIAggregateMerge( self IN OUT listagg_clob_t, ctx2 IN OUT listagg_clob_t ) return number ); / sho err create or replace type body listagg_clob_t is static function ODCIAggregateInitialize(sctx IN OUT listagg_clob_t) return number is begin sctx := listagg_clob_t('', '', 0); return ODCIConst.Success; end; member function ODCIAggregateIterate( self IN OUT listagg_clob_t, value IN varchar2 ) return number is begin if nvl(lengthb(v_liststring),0) + nvl(lengthb(value),0) <= 4000 then self.v_liststring:=self.v_liststring || value || list_const_p.list_sep; else if self.v_templob = 0 then dbms_lob.createtemporary(self.v_clob, true, dbms_lob.call); self.v_templob := 1; end if; dbms_lob.writeappend(self.v_clob, length(self.v_liststring), v_liststring); self.v_liststring := value || list_const_p.list_sep; end if; return ODCIConst.Success; end; member function ODCIAggregateTerminate( self IN OUT listagg_clob_t, returnValue OUT clob, flags IN number ) return number is begin if self.v_templob != 0 then dbms_lob.writeappend(self.v_clob, length(self.v_liststring), self.v_liststring); dbms_lob.trim(self.v_clob, dbms_lob.getlength(self.v_clob) - 1); else self.v_clob := substr(self.v_liststring, 1, length(self.v_liststring) - 1); end if; returnValue := self.v_clob; return ODCIConst.Success; end; member function ODCIAggregateMerge(self IN OUT listagg_clob_t, ctx2 IN OUT listagg_clob_t) return number is begin if ctx2.v_templob != 0 then if self.v_templob != 0 then dbms_lob.append(self.v_clob, ctx2.v_clob); dbms_lob.freetemporary(ctx2.v_clob); ctx2.v_templob := 0; else self.v_clob := ctx2.v_clob; self.v_templob := 1; ctx2.v_clob := ''; ctx2.v_templob := 0; end if; end if; if nvl(lengthb(self.v_liststring),0) + nvl(lengthb(ctx2.v_liststring),0) <= 4000 then self.v_liststring := self.v_liststring || ctx2.v_liststring; ctx2.v_liststring := ''; else if self.v_templob = 0 then dbms_lob.createtemporary(self.v_clob, true, dbms_lob.call); self.v_templob := 1; end if; dbms_lob.writeappend(self.v_clob, length(self.v_liststring), self.v_liststring); dbms_lob.writeappend(self.v_clob, length(ctx2.v_liststring), ctx2.v_liststring); self.v_liststring := ''; ctx2.v_liststring := ''; end if; return ODCIConst.Success; end; end; / sho err CREATE or replace FUNCTION listagg_clob (input varchar2) RETURN clob PARALLEL_ENABLE AGGREGATE USING listagg_clob_t; / sho err 

我写了一个函数来处理这个使用正则expression式。 in参数是:1)listagg自己调用2)重复的分隔符

 create or replace function distinct_listagg (listagg_in varchar2, delimiter_in varchar2) return varchar2 as hold_result varchar2(4000); begin select rtrim( regexp_replace( (listagg_in) , '([^'||delimiter_in||']*)('|| delimiter_in||'\1)+($|'||delimiter_in||')', '\1\3'), ',') into hold_result from dual; return hold_result; end; 

现在你不必每次重复正则expression式就简单地说:

 select distinct_listagg( listagg(myfield,', ') within group (order by 1), ', ' ) from mytable; 

处理多个listagg的最简单的方法是对每个列使用1个WITH(子查询因子),该列包含来自select distinct的该列的listagg:

  WITH tab AS ( SELECT 1 as col1, 2 as col2, 3 as col3, 'Smith' as created_by FROM dual UNION ALL SELECT 1 as col1, 2 as col2, 3 as col3,'John' as created_by FROM dual UNION ALL SELECT 1 as col1, 3 as col2, 4 as col3,'Ajay' as created_by FROM dual UNION ALL SELECT 1 as col1, 4 as col2, 4 as col3,'Ram' as created_by FROM dual UNION ALL SELECT 1 as col1, 5 as col2, 6 as col3,'Jack' as created_by FROM dual ) , getCol2 AS ( SELECT DISTINCT col1, listagg(col2,',') within group (order by col2) over (partition by col1) AS col2List FROM ( SELECT DISTINCT col1,col2 FROM tab) ) , getCol3 AS ( SELECT DISTINCT col1, listagg(col3,',') within group (order by col3) over (partition by col1) AS col3List FROM ( SELECT DISTINCT col1,col3 FROM tab) ) select col1,col2List,col3List FROM getCol2 JOIN getCol3 using (col1) 

这使:

 col1 col2List col3List 1 2,3,4,5 3,4,6 

如果您不需要特定的连续值顺序,并且分隔符可以是逗号,则可以执行以下操作:

 select col1, stragg(distinct col2) from table group by col1 

我find了一个DISTINCT版本,并得到了这个工作。

 RTRIM(REGEXP_REPLACE( (value, ', ') WITHIN GROUP( ORDER BY value)), '([^ ]+)(, \1)+','\1'),', ')