MySQL SUM() for distinct rows
I'm looking for help using SUM() in my SQL query:
SELECT links.id,
       COUNT(DISTINCT stats.id) AS clicks,
       COUNT(DISTINCT conversions.id) AS conversions,
       SUM(conversions.value) AS conversion_value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY links.id
ORDER BY links.created DESC;
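I use DISTINCT because I'm doing a GROUP BY, and it ensures the same row is not counted more than once.
The problem is that SUM(conversions.value) counts the "value" of each row more than once (because of the duplicate rows the joins produce in each group).
I basically want to do SUM(conversions.value) for each DISTINCT conversions.id.
Is that possible?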
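The overcounting described in the question can be reproduced on a toy data set. The following is a minimal sketch, assuming a schema guessed from the query (the column types and sample rows are not from the original post) and using Python's built-in sqlite3 as a stand-in for MySQL:

```python
import sqlite3

# Hypothetical schema and rows, inferred from the column names in the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value INTEGER);
    INSERT INTO links VALUES (1, '2011-01-01');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1);        -- 3 clicks
    INSERT INTO conversions VALUES (1, 1, 5), (2, 1, 2);    -- 2 conversions, true total 7
""")

# The query from the question, unchanged apart from layout.
row = conn.execute("""
    SELECT links.id,
           COUNT(DISTINCT stats.id)       AS clicks,
           COUNT(DISTINCT conversions.id) AS conversions,
           SUM(conversions.value)         AS conversion_value
    FROM links
    LEFT OUTER JOIN stats       ON links.id = stats.parent_id
    LEFT OUTER JOIN conversions ON links.id = conversions.link_id
    GROUP BY links.id
    ORDER BY links.created DESC
""").fetchone()

print(row)  # (1, 3, 2, 21): both counts are right, but the sum is 3 x 7
```

Each conversion row is repeated once per click, so the two COUNT(DISTINCT ...) columns stay correct while the plain SUM is inflated.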
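I may be wrong, but from what I understand:
- conversions.id is the primary key of your conversions table
- stats.id is the primary key of your stats table
So each conversions.id affects at most one links.id.
Your query is a bit like taking the Cartesian product of two sets: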
[clicks] SELECT * FROM links LEFT OUTER JOIN stats ON links.id = stats.parent_id
[conversions] SELECT * FROM links LEFT OUTER JOIN conversions ON links.id = conversions.link_id
and for each link you get sizeof([clicks]) x sizeof([conversions]) rows.
As you noted, the number of unique conversions in your query can be obtained with
count(distinct conversions.id) = sizeof([conversions])
The DISTINCT removes all the duplicates that the [clicks] rows introduce in the Cartesian product.
But clearly
sum(conversions.value) = sum([conversions].value) * sizeof([clicks])
In your case, since
count(*) = sizeof([clicks]) x sizeof([conversions])
count(*) = sizeof([clicks]) x count(distinct conversions.id)
you have
sizeof([clicks]) = count(*)/count(distinct conversions.id)
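These identities can be checked numerically. The sketch below is only an illustration under assumed data (one link with 3 clicks and 2 conversions, a schema guessed from the query) and uses Python's sqlite3 in place of MySQL:

```python
import sqlite3

# Hypothetical toy data: one link, 3 clicks, 2 conversions worth 5 and 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value INTEGER);
    INSERT INTO links VALUES (1, '2011-01-01');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO conversions VALUES (1, 1, 5), (2, 1, 2);
""")

joined = """
    FROM links
    LEFT OUTER JOIN stats       ON links.id = stats.parent_id
    LEFT OUTER JOIN conversions ON links.id = conversions.link_id
    WHERE links.id = 1
"""
n_rows, n_conv, inflated_sum = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT conversions.id), SUM(conversions.value)" + joined
).fetchone()
true_sum = conn.execute(
    "SELECT SUM(value) FROM conversions WHERE link_id = 1"
).fetchone()[0]

# sizeof([clicks]) recovered as count(*) / count(distinct conversions.id)
n_clicks = n_rows // n_conv

print(n_rows, n_conv, n_clicks)  # 6 2 3
print(inflated_sum, true_sum)    # 21 7
```

Here the joined row count is 3 x 2 = 6 and the inflated sum is 7 x 3 = 21, matching the relations above for this data.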
So I would test your query as:
SELECT links.id,
       COUNT(DISTINCT stats.id) AS clicks,
       COUNT(DISTINCT conversions.id) AS conversions,
       SUM(conversions.value) * COUNT(DISTINCT conversions.id) / COUNT(*) AS conversion_value
FROM links
LEFT OUTER JOIN stats ON links.id = stats.parent_id
LEFT OUTER JOIN conversions ON links.id = conversions.link_id
GROUP BY links.id
ORDER BY links.created DESC;
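A quick way to try the rescaled sum is to run it against a small in-memory database. This is a sketch under assumptions: the schema and rows are invented for illustration, both id columns are primary keys as the answer supposes, and SQLite stands in for MySQL (the `1.0 *` factor is added only because SQLite would otherwise use integer division):

```python
import sqlite3

# Hypothetical data: link 1 has 3 clicks and 2 conversions (5 + 2),
# link 2 has no clicks and 1 conversion (4).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value INTEGER);
    INSERT INTO links VALUES (1, '2011-01-02'), (2, '2011-01-01');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO conversions VALUES (1, 1, 5), (2, 1, 2), (3, 2, 4);
""")

rows = conn.execute("""
    SELECT links.id,
           COUNT(DISTINCT stats.id)       AS clicks,
           COUNT(DISTINCT conversions.id) AS conversions,
           1.0 * SUM(conversions.value) * COUNT(DISTINCT conversions.id) / COUNT(*)
               AS conversion_value
    FROM links
    LEFT OUTER JOIN stats       ON links.id = stats.parent_id
    LEFT OUTER JOIN conversions ON links.id = conversions.link_id
    GROUP BY links.id
    ORDER BY links.created DESC
""").fetchall()

print(rows)  # [(1, 3, 2, 7.0), (2, 0, 1, 4.0)]
```

On this data the rescaling recovers 7 and 4, but only because every conversion is repeated the same number of times within its link.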
Keep me posted! Jerome
For an explanation of why you were seeing incorrect numbers, read this.
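I think Jerome has a handle on what is causing your error. Bryson's query would work, although a subquery in the SELECT may be inefficient.
Use the following query: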
SELECT links.id,
       (SELECT COUNT(*) FROM stats WHERE links.id = stats.parent_id) AS clicks,
       conversions.conversions,
       conversions.conversion_value
FROM links
LEFT JOIN (SELECT link_id,
                  COUNT(id) AS conversions,
                  SUM(conversions.value) AS conversion_value
           FROM conversions
           GROUP BY link_id) AS conversions
       ON links.id = conversions.link_id
ORDER BY links.created DESC
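The idea here is to aggregate each child table on its own before joining, so no row is ever duplicated. A minimal sketch of that pattern, assuming an invented schema and sample rows, with the derived table renamed to a hypothetical alias `c` and SQLite standing in for MySQL:

```python
import sqlite3

# Hypothetical data: link 1 has 3 clicks and 2 conversions (5 + 2),
# link 2 has 1 click and no conversions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value INTEGER);
    INSERT INTO links VALUES (1, '2011-01-02'), (2, '2011-01-01');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1), (4, 2);
    INSERT INTO conversions VALUES (1, 1, 5), (2, 1, 2);
""")

rows = conn.execute("""
    SELECT links.id,
           (SELECT COUNT(*) FROM stats WHERE stats.parent_id = links.id) AS clicks,
           c.conversions,
           c.conversion_value
    FROM links
    LEFT JOIN (SELECT link_id,
                      COUNT(id)  AS conversions,
                      SUM(value) AS conversion_value
               FROM conversions
               GROUP BY link_id) AS c
           ON links.id = c.link_id          -- at most one pre-aggregated row per link
    ORDER BY links.created DESC
""").fetchall()

print(rows)  # [(1, 3, 2, 7), (2, 1, None, None)]
```

Links without conversions come back with NULL (None) in both conversion columns; wrapping them in COALESCE(..., 0) would be one way to report zeros instead.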
Jerome's solution is actually wrong and can produce incorrect results!
sum(conversions.value)*count(DISTINCT conversions.id)/count(*) as conversion_value
Let's assume the following table:
conversions
id  value
1   5
1   5
1   5
2   2
3   1
The correct sum of the values over distinct IDs would be 8. Jerome's formula produces:
sum(conversions.value) = 18
count(distinct conversions.id) = 3
count(*) = 5
18*3/5 = 10.8 != 8
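The arithmetic of this counterexample can be re-evaluated directly. The sketch below hard-codes the five (id, value) rows from the table and applies the rescaling formula to them; the variable names are illustrative only:

```python
# The five (id, value) rows of the example table.
rows = [(1, 5), (1, 5), (1, 5), (2, 2), (3, 1)]

total = sum(value for _, value in rows)      # sum(conversions.value)
distinct_ids = len({i for i, _ in rows})     # count(distinct conversions.id)
row_count = len(rows)                        # count(*)

# The rescaling formula: sum * count(distinct id) / count(*)
formula_result = total * distinct_ids / row_count

# The intended result: each id's value counted exactly once.
value_by_id = {i: value for i, value in rows}
expected = sum(value_by_id.values())

print(total, distinct_ids, row_count)  # 18 3 5
print(formula_result, expected)        # 10.8 8
```

The formula is exact only when every id is repeated the same number of times; here id 1 appears three times and the others once, so the result drifts away from 8.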
I use a subquery to do this. It eliminates the problems with grouping. So the query would look something like this:
SELECT COUNT(DISTINCT conversions.id) ... (SELECT SUM(conversions.value) FROM ....) AS Vals
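The outline above leaves the details elided. One possible reading, offered only as a guess at what was meant, is a correlated scalar subquery that sums the conversions per link outside the joined row set. In this sketch the schema, the sample rows, and the alias `c` are assumptions, and SQLite stands in for MySQL:

```python
import sqlite3

# Hypothetical data: link 1 has 3 clicks and 2 conversions (5 + 2),
# link 2 has 1 click and no conversions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value INTEGER);
    INSERT INTO links VALUES (1, '2011-01-02'), (2, '2011-01-01');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1), (4, 2);
    INSERT INTO conversions VALUES (1, 1, 5), (2, 1, 2);
""")

rows = conn.execute("""
    SELECT links.id,
           COUNT(DISTINCT stats.id)       AS clicks,
           COUNT(DISTINCT conversions.id) AS conversions,
           -- Summed straight from the table, so join duplicates cannot inflate it.
           (SELECT SUM(c.value)
            FROM conversions c
            WHERE c.link_id = links.id)   AS Vals
    FROM links
    LEFT OUTER JOIN stats       ON links.id = stats.parent_id
    LEFT OUTER JOIN conversions ON links.id = conversions.link_id
    GROUP BY links.id
    ORDER BY links.created DESC
""").fetchall()

print(rows)  # [(1, 3, 2, 7), (2, 1, 0, None)]
```

The subquery runs once per link, which is simple to read but may be slower than a pre-aggregated join on large tables.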
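How about something like this:
SELECT t.id,
       MAX(t.clicks) AS clicks,
       COUNT(t.conversion_id) AS conversions,
       SUM(t.conversion_value) AS conversion_value
FROM (-- one row per (link, conversion), so each value is summed only once
      SELECT l.id AS id,
             l.created AS created,
             c.id AS conversion_id,
             COUNT(DISTINCT s.id) AS clicks,
             MAX(c.value) AS conversion_value
      FROM links l
      LEFT JOIN stats s ON l.id = s.parent_id
      LEFT JOIN conversions c ON l.id = c.link_id
      GROUP BY l.id, l.created, c.id) t
GROUP BY t.id, t.created
ORDER BY t.created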
This will do the trick: just divide each sum by the number of times its conversion ID is duplicated.
SELECT a.id,
       MAX(a.clicks) AS clicks,
       SUM(a.conversion_value / a.copies) AS conversion_value,
       COUNT(a.conversion_id) AS conversions
FROM (SELECT links.id,
             links.created,
             conversions.id AS conversion_id,
             COUNT(DISTINCT stats.id) AS clicks,
             COUNT(conversions.id) AS copies,  -- how many times this conversion row is repeated
             SUM(conversions.value) AS conversion_value
      FROM links
      LEFT OUTER JOIN stats ON links.id = stats.parent_id
      LEFT OUTER JOIN conversions ON links.id = conversions.link_id
      GROUP BY links.id, links.created, conversions.id) AS a
GROUP BY a.id, a.created
ORDER BY a.created DESC
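To see the divide-by-duplicates idea at work, the following sketch groups first by link and conversion, records how often each conversion row is repeated, and then divides before summing per link. The schema, sample rows, and the `copies` and `conversion_id` column names are assumptions made for illustration, and SQLite stands in for MySQL:

```python
import sqlite3

# Hypothetical data: link 1 has 3 clicks and 2 conversions (5 + 2),
# link 2 has 1 click and no conversions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (id INTEGER PRIMARY KEY, created TEXT);
    CREATE TABLE stats (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE conversions (id INTEGER PRIMARY KEY, link_id INTEGER, value INTEGER);
    INSERT INTO links VALUES (1, '2011-01-02'), (2, '2011-01-01');
    INSERT INTO stats VALUES (1, 1), (2, 1), (3, 1), (4, 2);
    INSERT INTO conversions VALUES (1, 1, 5), (2, 1, 2);
""")

rows = conn.execute("""
    SELECT a.id,
           MAX(a.clicks)                      AS clicks,
           COUNT(a.conversion_id)             AS conversions,
           SUM(a.conversion_value / a.copies) AS conversion_value
    FROM (SELECT links.id,
                 links.created,
                 conversions.id           AS conversion_id,
                 COUNT(DISTINCT stats.id) AS clicks,
                 COUNT(conversions.id)    AS copies,          -- repeats of this conversion
                 SUM(conversions.value)   AS conversion_value -- value x copies
          FROM links
          LEFT OUTER JOIN stats       ON links.id = stats.parent_id
          LEFT OUTER JOIN conversions ON links.id = conversions.link_id
          GROUP BY links.id, links.created, conversions.id) AS a
    GROUP BY a.id, a.created
    ORDER BY a.created DESC
""").fetchall()

print(rows)  # [(1, 3, 2, 7), (2, 1, 0, None)]
```

For a link without conversions the inner sum is NULL, so the division yields NULL rather than an error and the outer SUM returns NULL as well.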