替代SQL count子查询

前端之家收集整理的这篇文章主要介绍了替代SQL count子查询前端之家小编觉得挺不错的,现在分享给大家,也给大家做个参考。
我有以下查询: @H_404_2@SELECT DISTINCT e.id,folder,subject,in_reply_to,message_id,"references",e.updated_at,( select count(*) from emails where ( select "references"[1] from emails where message_id = e.message_id ) = ANY ("references") or message_id = ( select "references"[1] from emails where message_id = e.message_id ) ) FROM "emails" e INNER JOIN "email_participants" ON ("email_participants"."email_id" = e."id") WHERE (("user_id" = 220) AND ("folder" = 'INBox')) ORDER BY e."updated_at" DESC LIMIT 10 OFFSET 0;

Here是上述查询explain analyze输出.

查询执行正常,直到我添加下面的count子查询

@H_404_2@( select count(*) from emails where ( select "references"[1] from emails where message_id = e.message_id ) = ANY ("references") or message_id = ( select "references"[1] from emails where message_id = e.message_id ) )

事实上,我已经尝试过更简单的子查询,似乎是聚合函数本身需要时间.

那么我可以将count子查询附加到每个结果上吗?我应该在初始查询运行后更新结果吗?

这是一个pastebin,它将创建表并在最后运行性能不佳的查询显示输出应该是什么.

解决方法

扩展Paul Guyot的答案,您可以将子查询移动到派生表中,该表应该执行得更快,因为它在一次扫描(加一个连接)中获取消息计数,而不是每行一次扫描. @H_404_2@SELECT DISTINCT e.id,e.folder,e.subject,e.message_id,e."references",t1.message_count FROM "emails" e INNER JOIN "email_participants" ON ("email_participants"."email_id" = e."id") INNER JOIN ( SELECT COUNT(e2.id) message_count,e.message_id FROM emails e LEFT JOIN emails e2 ON (ARRAY[e."references"[1]] <@ e2."references" OR e2.message_id = e."references"[1]) GROUP BY e.message_id ) t1 ON t1.message_id = e.message_id WHERE (("user_id" = 220) AND ("folder" = 'INBox')) ORDER BY e."updated_at" DESC LIMIT 10 OFFSET 0;

使用pastebin数据的小提琴 – http://www.sqlfiddle.com/#!15/c6298/7

下面是postgres生成查询计划,用于通过加入派生表来获取相关子查询中的计数与获取计数.我使用了自己的一张桌子,但我认为结果应该是相似的.

相关子查询

@H_404_2@"Limit (cost=0.00..1123641.81 rows=1000 width=8) (actual time=11.237..5395.237 rows=1000 loops=1)" " -> Seq Scan on visit v (cost=0.00..44996236.24 rows=40045 width=8) (actual time=11.236..5395.014 rows=1000 loops=1)" " SubPlan 1" " -> Aggregate (cost=1123.61..1123.62 rows=1 width=0) (actual time=5.393..5.393 rows=1 loops=1000)" " -> Seq Scan on visit v2 (cost=0.00..1073.56 rows=20018 width=0) (actual time=0.002..4.280 rows=21393 loops=1000)" " Filter: (company_id = v.company_id)" " Rows Removed by Filter: 18653" "Total runtime: 5395.369 ms"

加入派生表

@H_404_2@"Limit (cost=1173.74..1211.81 rows=1000 width=12) (actual time=21.819..22.629 rows=1000 loops=1)" " -> Hash Join (cost=1173.74..2697.72 rows=40036 width=12) (actual time=21.817..22.465 rows=1000 loops=1)" " Hash Cond: (v.company_id = visit.company_id)" " -> Seq Scan on visit v (cost=0.00..973.45 rows=40045 width=8) (actual time=0.010..0.198 rows=1000 loops=1)" " -> Hash (cost=1173.71..1173.71 rows=2 width=12) (actual time=21.787..21.787 rows=2 loops=1)" " Buckets: 1024 Batches: 1 Memory Usage: 1kB" " -> HashAggregate (cost=1173.67..1173.69 rows=2 width=4) (actual time=21.783..21.784 rows=3 loops=1)" " -> Seq Scan on visit (cost=0.00..973.45 rows=40045 width=4) (actual time=0.003..6.695 rows=40046 loops=1)" "Total runtime: 22.806 ms"

猜你在找的MsSQL相关文章