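In SQL, how can you "group by" in ranges?
Suppose I have a list of numbers (let's call them "scores").
I want to generate a table of counts showing how many times scores appeared in each range.
For example: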
    Score range | Number of occurrences
    ------------+----------------------
    0-9         | 11
    10-19       | 14
    20-29       | 3
    ...         | ...
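In this example there were 11 rows with scores in the range 0 to 9, 14 rows with scores in the range 10 to 19, and 3 rows with scores in the range 20 to 29.
Is there an easy way to set this up? What do you recommend?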
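Neither of the highest-voted answers is correct on SQL Server 2000. Perhaps they were written for a different version.
Here are the corrected versions of both of them for SQL Server 2000.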
    select t.range as [score range], count(*) as [number of occurences]
    from (
        select case
                   when score between 0 and 9 then ' 0- 9'
                   when score between 10 and 19 then '10-19'
                   else '20-99'
               end as range
        from scores) t
    group by t.range
or
    select t.range as [score range], count(*) as [number of occurences]
    from (
        select user_id,
               case
                   when score >= 0 and score < 10 then '0-9'
                   when score >= 10 and score < 20 then '10-19'
                   else '20-99'
               end as range
        from scores) t
    group by t.range
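The CASE-based grouping above can be tried end to end without a SQL Server instance. What follows is only a minimal sketch: it swaps in SQLite through Python's `sqlite3` module, and the sample data and the `score_range` / `occurrences` aliases are assumptions made for illustration, not part of the original answers.

```python
import sqlite3

# Hypothetical sample data: eleven scores in 0-9, fourteen in 10-19, three in 20-29.
scores = [3] * 11 + [15] * 14 + [27] * 3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (score INTEGER)")
conn.executemany("INSERT INTO scores (score) VALUES (?)", [(s,) for s in scores])

# The inner query labels each row with its range; the outer query counts rows per label.
query = """
SELECT t.score_range, COUNT(*) AS occurrences
FROM (
    SELECT CASE
               WHEN score BETWEEN 0 AND 9 THEN '0-9'
               WHEN score BETWEEN 10 AND 19 THEN '10-19'
               ELSE '20-99'
           END AS score_range
    FROM scores
) AS t
GROUP BY t.score_range
ORDER BY t.score_range
"""

case_counts = conn.execute(query).fetchall()
for label, occurrences in case_counts:
    print(label, occurrences)
```

The square-bracket aliases from the SQL Server versions are replaced with plain identifiers here only to keep the sketch portable.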
I see answers here that won't work in SQL Server's syntax. I would use:
    select t.range as [score range], count(*) as [number of occurences]
    from (
        select case
                   when score between 0 and 9 then ' 0-9 '
                   when score between 10 and 19 then '10-19'
                   when score between 20 and 29 then '20-29'
                   ...
                   else '90-99'
               end as range
        from scores) t
    group by t.range
EDIT: see comments
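An alternative approach is to store the ranges in a table instead of embedding them in the query. You would end up with a table, call it Ranges, that looks like this: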
    LowerLimit   UpperLimit   Range
    0            9            '0-9'
    10           19           '10-19'
    20           29           '20-29'
    30           39           '30-39'
And a query that looks like this:
    Select Range as [Score Range], Count(*) as [Number of Occurences]
    from Ranges r
    inner join Scores s on s.Score between r.LowerLimit and r.UpperLimit
    group by Range
This does mean setting up a table, but it is easy to maintain when the desired ranges change. No code changes necessary!
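To make the "no code changes" point concrete, here is a rough sketch of the lookup-table approach, again run on SQLite through Python's `sqlite3` purely for convenience. The lowercase table and column names, the helper `count_by_range`, and the sample scores are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ranges (lower_limit INTEGER, upper_limit INTEGER, label TEXT)")
conn.execute("CREATE TABLE scores (score INTEGER)")
conn.executemany(
    "INSERT INTO ranges VALUES (?, ?, ?)",
    [(0, 9, "0-9"), (10, 19, "10-19"), (20, 29, "20-29"), (30, 39, "30-39")],
)
# Hypothetical sample scores.
conn.executemany("INSERT INTO scores VALUES (?)", [(s,) for s in (5, 7, 12, 18, 19, 25, 31)])


def count_by_range(connection):
    """Count scores per range by joining against the ranges table."""
    return connection.execute(
        """
        SELECT r.label, COUNT(*)
        FROM ranges AS r
        INNER JOIN scores AS s ON s.score BETWEEN r.lower_limit AND r.upper_limit
        GROUP BY r.lower_limit, r.label
        ORDER BY r.lower_limit
        """
    ).fetchall()


before = count_by_range(conn)

# Changing the bucketing only touches data in the ranges table, not the query.
conn.execute("DELETE FROM ranges")
conn.executemany("INSERT INTO ranges VALUES (?, ?, ?)", [(0, 19, "0-19"), (20, 39, "20-39")])
after = count_by_range(conn)

print(before)
print(after)
```

Because this is an inner join, a range that contains no scores does not appear in the output at all; a left join starting from the ranges table would be needed to report such ranges with a count of zero.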
In postgres (where || is the string concatenation operator):
    select (score/10)*10 || '-' || (score/10)*10+9 as scorerange, count(*)
    from scores
    group by score/10
    order by 1
Gives:
     scorerange | count
    ------------+-------
     0-9        |    11
     10-19      |    14
     20-29      |     3
     30-39      |     2
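The integer-division trick can be checked on a small data set. This is a sketch under assumptions: it uses SQLite via Python's `sqlite3` instead of postgres, with made-up non-negative scores. SQLite also uses `||` for concatenation, but there `||` binds more tightly than `*` and `+`, so each arithmetic operand is parenthesized explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (score INTEGER)")
# Hypothetical sample scores, all non-negative.
conn.executemany(
    "INSERT INTO scores VALUES (?)",
    [(s,) for s in (0, 4, 9, 10, 19, 23, 35, 39)],
)

# score / 10 is integer division for INTEGER values, so it yields the bucket index.
# Multiplying the index by 10 gives the lower bound of the range.
query = """
SELECT ((score / 10) * 10) || '-' || ((score / 10) * 10 + 9) AS scorerange,
       COUNT(*)
FROM scores
GROUP BY score / 10
ORDER BY score / 10
"""

division_counts = conn.execute(query).fetchall()
for scorerange, count in division_counts:
    print(scorerange, count)
```

Ordering by the numeric bucket index rather than by the label keeps the rows in numeric order even when the labels have different lengths.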
James Curran's answer was the most concise in my opinion, but its output wasn't correct. For SQL Server the simplest statement is as follows:
    SELECT [score range] = CAST((Score/10)*10 AS VARCHAR) + ' - ' + CAST((Score/10)*10+9 AS VARCHAR),
           [number of occurrences] = COUNT(*)
    FROM #Scores
    GROUP BY Score/10
    ORDER BY Score/10
This assumes the #Scores temporary table I used to test it, which I populated with 100 rows of random numbers between 0 and 99.
    create table scores (
        user_id int,
        score int
    )

    select t.range as [score range], count(*) as [number of occurences]
    from (
        select user_id,
               case when score >= 0 and score < 10 then '0-9'
                    when score >= 10 and score < 20 then '10-19'
                    ...
                    else '90-99'
               end as range
        from scores) t
    group by t.range
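    -- Note: score/10 is the bucket index, not the lower bound, so scores 10-19
    -- are labelled '1-10'; multiply the index by 10 to get the actual bounds.
    select cast(score/10 as varchar) + '-' + cast(score/10+9 as varchar), count(*)
    from scores
    group by score/10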
I would do this a little differently, so that it scales without having to define every case:
    select t.range as [score range], count(*) as [number of occurences]
    from (
        select FLOOR(score/10) as range
        from scores) t
    group by t.range
Not tested, but you get the idea...
This lets you avoid specifying the ranges, and should be SQL-server agnostic. Math FTW!
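    -- range_start is used instead of "range", which is a reserved word in some engines.
    SELECT CONCAT(range_start, '-', range_start + 9), COUNT(*)
    FROM (
        SELECT score - (score % 10) AS range_start
        FROM scores
    ) AS t
    GROUP BY range_start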
    declare @RangeWidth int
    set @RangeWidth = 10

    -- Multiply the bucket index by the width to get the actual bounds.
    select Floor(Score/@RangeWidth)*@RangeWidth as LowerBound,
           Floor(Score/@RangeWidth)*@RangeWidth + @RangeWidth - 1 as UpperBound,
           Count(*)
    from ScoreTable
    group by Floor(Score/@RangeWidth)
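The same idea with a configurable range width can be sketched as a small helper. This is an illustration only: it runs on SQLite through Python's `sqlite3`, the function name `bucket_counts` and the sample scores are hypothetical, and it assumes non-negative integer scores (SQLite's integer division truncates toward zero, so negative scores would land in the wrong bucket).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (score INTEGER)")
# Hypothetical sample scores, all non-negative.
conn.executemany(
    "INSERT INTO scores VALUES (?)",
    [(s,) for s in (0, 4, 9, 10, 19, 23, 35, 39)],
)


def bucket_counts(connection, width):
    """Return (lower_bound, upper_bound, count) rows for buckets of the given width."""
    return connection.execute(
        """
        SELECT (score / :w) * :w          AS lower_bound,
               (score / :w) * :w + :w - 1 AS upper_bound,
               COUNT(*)
        FROM scores
        GROUP BY score / :w
        ORDER BY lower_bound
        """,
        {"w": width},
    ).fetchall()


width_10 = bucket_counts(conn, 10)
width_20 = bucket_counts(conn, 20)
print(width_10)
print(width_20)
```

Passing the width as a bound parameter plays the role of the `@RangeWidth` variable: changing the bucketing is a matter of calling the helper with a different value.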
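Because the column being sorted on (Range) is a string, string/word sorting is used instead of numeric sorting.
As long as the strings are zero-padded to the same number of digits, the sort order should still be semantically correct: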
    SELECT t.range AS ScoreRange, COUNT(*) AS NumberOfOccurrences
    FROM (SELECT CASE
                     WHEN score BETWEEN 0 AND 9 THEN '00-09'
                     WHEN score BETWEEN 10 AND 19 THEN '10-19'
                     ELSE '20-99'
                 END AS Range
          FROM Scores) t
    GROUP BY t.Range
If the ranges mix two-digit and three-digit numbers, simply pad with an extra zero:
    SELECT t.range AS ScoreRange, COUNT(*) AS NumberOfOccurrences
    FROM (SELECT CASE
                     WHEN score BETWEEN 0 AND 9 THEN '000-009'
                     WHEN score BETWEEN 10 AND 19 THEN '010-019'
                     WHEN score BETWEEN 20 AND 99 THEN '020-099'
                     ELSE '100-999'
                 END AS Range
          FROM Scores) t
    GROUP BY t.Range
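The effect of zero padding on sort order is easy to see in isolation. The sketch below is an assumption-laden illustration, not part of the original answer: it stores hypothetical range labels in an SQLite table (via Python's `sqlite3`) in both unpadded and padded form and sorts each column as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE labels (plain TEXT, padded TEXT)")

# Hypothetical range starts, including a three-digit one to expose the problem.
for lo in (0, 10, 20, 100):
    hi = lo + 9
    conn.execute(
        "INSERT INTO labels (plain, padded) VALUES (?, ?)",
        (f"{lo}-{hi}", f"{lo:03d}-{hi:03d}"),
    )

# Text columns are compared character by character, not as numbers.
plain_order = [row[0] for row in conn.execute("SELECT plain FROM labels ORDER BY plain")]
padded_order = [row[0] for row in conn.execute("SELECT padded FROM labels ORDER BY padded")]

print(plain_order)   # '100-109' sorts before '20-29'
print(padded_order)  # padded labels sort in numeric order
```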
Try
    SELECT (str(range) + "-" + str(range + 9) ) AS [Score range],
           COUNT(score) AS [number of occurances]
    FROM (SELECT score, int(score / 10 ) * 10 AS range
          FROM scoredata )
    GROUP BY range;
    select t.blah as [score range], count(*) as [number of occurences]
    from (
        select case
                   when score between 0 and 9 then ' 0-9 '
                   when score between 10 and 19 then '10-19'
                   when score between 20 and 29 then '20-29'
                   ...
                   else '90-99'
               end as blah
        from scores) t
    group by t.blah
If you are using MySQL, make sure you use a word other than "range", or you will get an error when running the example above.
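Perhaps you're asking about keeping such things going...
Of course, these queries invoke a full table scan, and if the table containing the scores to be tallied (aggregated) is large, you may want a better-performing solution: you can create a secondary table and use rules, such as on insert. You might look into it.
Not all RDBMS engines have rules, though!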
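As a rough illustration of the secondary-table idea, the sketch below keeps per-range counts up to date as rows are inserted. It deliberately swaps in an SQLite trigger, since SQLite has no rule system, and runs it through Python's `sqlite3`; the table `score_buckets`, the trigger name, the fixed width of 10, and the sample scores are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scores (score INTEGER NOT NULL);

    -- Secondary table holding one pre-aggregated row per range.
    CREATE TABLE score_buckets (
        bucket_start INTEGER PRIMARY KEY,
        occurrences  INTEGER NOT NULL
    );

    -- On every insert, make sure the bucket row exists, then increment it.
    CREATE TRIGGER scores_after_insert AFTER INSERT ON scores
    BEGIN
        INSERT OR IGNORE INTO score_buckets (bucket_start, occurrences)
        VALUES ((NEW.score / 10) * 10, 0);
        UPDATE score_buckets
        SET occurrences = occurrences + 1
        WHERE bucket_start = (NEW.score / 10) * 10;
    END;
    """
)

# Hypothetical sample scores; the trigger fires once per inserted row.
conn.executemany("INSERT INTO scores VALUES (?)", [(s,) for s in (3, 7, 12, 25, 28)])

# Reading the counts no longer scans the scores table.
maintained_counts = conn.execute(
    "SELECT bucket_start, occurrences FROM score_buckets ORDER BY bucket_start"
).fetchall()
print(maintained_counts)
```

This sketch only handles inserts; updates and deletes on the scores table would need their own triggers to keep the counts accurate.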