slow query on multiple table join

From: tao tony <tonytao0505(at)outlook(dot)com>
To: "pgsql-zh-general(at)postgresql(dot)org" <pgsql-zh-general(at)postgresql(dot)org>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: slow query on multiple table join
Date: 2017-05-09 01:37:12
Message-ID: MWHPR13MB13896E7A2C6C62E75EC1F13FAAEF0@MWHPR13MB1389.namprd13.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-zh-general

hi guys,

I met a query performance issue in postgresql 9.6.2 with multiple tables
joined.

there were 2 slow queries,and the reasons were the same:the optimizer
generate a bad explain which using nest loop.

attached is the query and its explain.all tables are small and the
indexes were only created on primary keys .

in query 1 I noticed the explain forecast the nest loop anti join
return 1 row as below,it was the result of (f join p) join pt:

-> Nested Loop (cost=1.95..14838.66 rows=1 width=163)
Join Filter: ((f.shop)::text = (s.uuid)::text)
-> Nested Loop Anti Join (cost=1.95..14743.60 rows=1
width=111)
-> Hash Join (cost=1.53..12067.46 rows=4751
width=115)
Hash Cond: ((p.shop)::text = (f.shop)::text)
-> Seq Scan on shopsku p
(cost=0.00..11483.96 rows=106892 width=106)
Filter: ((state)::text = 'normal'::text)
-> Hash (cost=1.29..1.29 rows=19 width=9)
-> Seq Scan on shopfranchise f
(cost=0.00..1.29 rows=19 width=9)
Filter: (enabled = 1)
-> Index Only Scan using platformsku_pkey on
platformsku pt (cost=0.42..0.55 rows=1 width=36)
Index Cond: (uuid = (p.platformsku)::text)
-> Seq Scan on shop s (cost=0.00..75.58 rows=1558
width=61)
-> Hash (cost=2823.76..2823.76 rows=43376 width=46)
-> Seq Scan on merchantsku m (cost=0.00..2823.76
rows=43376 width=46)

while in analyze explain,it actually returns 57458 row.so higher level
nest loop would get 57458*1558 rows,this cause this query runs for more
than 40 seconds.

-> Nested Loop (cost=1.95..14838.66 rows=1 width=163)
(actual time=0.817..43150.583 rows=57458 loops=1)
Join Filter: ((f.shop)::text = (s.uuid)::text)
Rows Removed by Join Filter: 89462106
-> Nested Loop Anti Join (cost=1.95..14743.60 rows=1
width=111) (actual time=0.060..408.092 rows=57458 loops=1)
-> Hash Join (cost=1.53..12067.46 rows=4751
width=115) (actual time=0.046..174.523 rows=57485 loops=1)
Hash Cond: ((p.shop)::text = (f.shop)::text)
-> Seq Scan on shopsku p
(cost=0.00..11483.96 rows=106892 width=106) (actual time=0.008..107.416
rows=106580 loops=1)
Filter: ((state)::text = 'normal'::text)
Rows Removed by Filter: 429
-> Hash (cost=1.29..1.29 rows=19 width=9)
(actual time=0.026..0.026 rows=20 loops=1)
Buckets: 1024 Batches: 1 Memory
Usage: 9kB
-> Seq Scan on shopfranchise f
(cost=0.00..1.29 rows=19 width=9) (actual time=0.006..0.017 rows=20 loops=1)
Filter: (enabled = 1)
Rows Removed by Filter: 4
-> Index Only Scan using platformsku_pkey on
platformsku pt (cost=0.42..0.55 rows=1 width=36) (actual
time=0.003..0.003 rows=0 loops=57485)
Index Cond: (uuid = (p.platformsku)::text)
Heap Fetches: 0
-> Seq Scan on shop s (cost=0.00..75.58 rows=1558
width=61) (actual time=0.001..0.332 rows=1558 loops=57458)

If I disabled nest loop,ti only use 519 ms.

in query 2 ,the nest loop join also process more than 200m rows,it runs
almost 2 minutes.After disable nest loop,it use hash join,finished in
1.5 sec.

purchase join (shopfranchise f_4 join inventory k)
join gdname
-> Nested Loop Left Join (cost=3972.43..4192.40
rows=1 width=1074) (actual time=268.989..106066.932 rows=45615 loops=1)
Join Filter: (((k.shop)::text =
(purchase.shop)::text) AND ((k.shopsku)::text = (purchase.shopsku)::text))
Rows Removed by Join Filter: 208410367

(shopfranchise f_4 join inventory k) join
gdname
-> Hash Join (cost=3972.43..4165.52 rows=1
width=1112) (actual time=247.088..1754.448 rows=45615 loops=1)
Hash Cond: (((gdname.shop)::text =
(k.shop)::text) AND ((gdname.shopsku)::text = (k.shopsku)::text))
-> CTE Scan on gdname (cost=0.00..140.42
rows=7021 width=1246) (actual time=156.543..1563.121 rows=51153 loops=1)
-> Hash (cost=3925.81..3925.81 rows=3108
width=63) (actual time=90.520..90.520 rows=45622 loops=1)
Buckets: 65536 (originally 4096)
Batches: 1 (originally 1) Memory Usage: 4745kB
shopfranchise f_4 join inventory k
-> Hash Join (cost=1.53..3925.81
rows=3108 width=63) (actual time=0.046..70.173 rows=45622 loops=1)
Hash Cond: ((k.shop)::text =
(f_4.shop)::text)
-> Seq Scan on inventory k
(cost=0.00..3449.47 rows=88747 width=54) (actual time=0.009..22.978
rows=88747 loops=1)
-> Hash (cost=1.29..1.29
rows=19 width=9) (actual time=0.025..0.025 rows=19 loops=1)
Buckets: 1024 Batches:
1 Memory Usage: 9kB
-> Seq Scan on
shopfranchise f_4 (cost=0.00..1.29 rows=19 width=9) (actual
time=0.006..0.017 rows=19 loops=1)
Filter: (enabled = 1)
Rows Removed by
Filter: 4
-> CTE Scan on purchase (cost=0.00..15.36
rows=768 width=196) (actual time=0.001..1.013 rows=4569 loops=45615)

please kindly let me know there's any solution to solve the
problem,thanks a lot!

Attachment Content-Type Size
slow_query.txt text/plain 23.8 KB

Browse pgsql-general by date

  From Date Subject
Next Message Armand Pirvu (home) 2017-05-09 03:31:12 Re: data transformation and replication
Previous Message Adrian Klaver 2017-05-09 01:22:09 Re: Python versus Other Languages using PostgreSQL

Browse pgsql-zh-general by date

  From Date Subject
Next Message Quan Zongliang 2017-09-05 08:28:18 随机字符串函数的C语言实现
Previous Message zhang.wensheng 2017-03-24 13:04:54 Re: [pgsql-zh-general] Re: 答复: 答复: 请教集合A中快速排除集合B,两个大表JOIN的问题