Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins

From: "Sfiligoi, Igor" <Igor(dot)Sfiligoi(at)ga(dot)com>
To: Kevin Grittner <kgrittn(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins
Date: 2016-07-05 16:21:56
Message-ID: 222a8f22542d4331b05104b129e32bb8@ASGEXCPWP06.ga.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

The best that I can do right now is provide the explain of the three variants (see below).

The use of a left join did indeed remove the useless joins, but the selected plan is just terrible.

Thanks,
Igor

-----Original Message-----
From: Kevin Grittner [mailto:kgrittn(at)gmail(dot)com]
Sent: Saturday, July 02, 2016 6:28 AM
To: Sfiligoi, Igor <Igor(dot)Sfiligoi(at)ga(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: --EXTERNAL--Re: [GENERAL] PSQL does not remove obvious useless joins

On Fri, Jul 1, 2016 at 7:15 PM, Sfiligoi, Igor <Igor(dot)Sfiligoi(at)ga(dot)com> wrote:
> OK. Will change our query generation code to not use the view.
> (I have tried the LEFT JOIN approach, but it just does not seem to
> perform.)

> PS: Here are the numbers for the real production query (will not provide details):
> Original query: 300s
> Query on a manually optimized view: 1ms
> Using left joins: 200s

Please show a self-contained case (i.e., one that can be run against an empty database to demonstrate the problem).

---------------------------------------------
Original view, two unused joins
(data_info.true_arrow_id = true_dart.arrow_id && data_info.acl_arrow_id = acl_dart.arrow_id)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=121449.22..11705013.57 rows=890 width=63) (actual time=326791.858..365059.117 rows=1 loops=1)
Merge Cond: (locn_info.rock_person_id = bird_rsrc_physical.rock_person_id)
-> Sort (cost=1.03..1.03 rows=2 width=4) (actual time=0.018..0.020 rows=2 loops=1)
Sort Key: locn_info.rock_person_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on bird_locn_info locn_info (cost=0.00..1.02 rows=2 width=4) (actual time=0.003..0.003 rows=2 loops=1)
-> Materialize (cost=121448.19..14015467.54 rows=890 width=71) (actual time=326791.835..365059.092 rows=1 loops=1)
-> Merge Join (cost=121448.19..14015465.32 rows=890 width=71) (actual time=326791.833..365059.089 rows=1 loops=1)
Merge Cond: (bird_rsrc_physical.rock_person_id = bird_person_bike.person_id)
-> Nested Loop (cost=121444.20..16786761.58 rows=890 width=67) (actual time=326791.779..365059.033 rows=1 loops=1)
Join Filter: (bird_data_silk.rsrc_id = bird_rsrc_physical.phy_rsrc_id)
-> Nested Loop (cost=0.13..15.68 rows=1 width=8) (actual time=0.025..0.035 rows=1 loops=1)
Join Filter: (bird_rsrc_physical.lock_rock_person_id = lock_info.rock_person_id)
Rows Removed by Join Filter: 1
-> Nested Loop (cost=0.13..14.63 rows=1 width=12) (actual time=0.022..0.031 rows=1 loops=1)
Join Filter: (bird_rsrc_physical.rsrc_class_id = bird_rsrc_class.rsrc_class_id)
Rows Removed by Join Filter: 7
-> Nested Loop (cost=0.13..13.45 rows=1 width=16) (actual time=0.017..0.025 rows=1 loops=1)
Join Filter: ((bird_rsrc_physical.rsrc_type_id)::text = (bird_rsrc_type.rsrc_type_id)::text)
Rows Removed by Join Filter: 11
-> Index Scan using fidx_1_rsrc_physical on bird_rsrc_physical (cost=0.13..12.18 rows=1 width=24) (actual time=0.009..0.013 rows=1 loops=1)
Filter: ((rsrc_name)::text = 'data'::text)
Rows Removed by Filter: 2
-> Seq Scan on bird_rsrc_type (cost=0.00..1.12 rows=12 width=8) (actual time=0.002..0.004 rows=12 loops=1)
-> Seq Scan on bird_rsrc_class (cost=0.00..1.08 rows=8 width=4) (actual time=0.002..0.003 rows=8 loops=1)
-> Seq Scan on bird_locn_info lock_info (cost=0.00..1.02 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1)
-> Hash Join (cost=121444.07..16786712.53 rows=2670 width=67) (actual time=326791.750..365058.993 rows=1 loops=1)
Hash Cond: (data_info.true_data_id = bird_data_silk.data_id)
-> Hash Join (cost=118019.92..16144855.88 rows=102144928 width=43) (actual time=1945.934..346470.487 rows=102020209 loops=1)
Hash Cond: (data_info.sheet_id = bird_sheet.sheet_id)
-> Hash Join (cost=118018.36..14740361.56 rows=102144928 width=47) (actual time=1945.911..310945.806 rows=102020209 loops=1)
Hash Cond: ((data_info.data_type_id)::text = (bird_data_type.data_type_id)::text)
-> Hash Join (cost=118011.53..13335861.97 rows=102144928 width=61) (actual time=1945.805..269859.918 rows=102020209 loops=1)
Hash Cond: (data_info.patron_id = patron.person_id)
-> Hash Join (cost=118009.96..11931367.64 rows=102144928 width=65) (actual time=1945.778..239667.755 rows=102020209 loops=1)
Hash Cond: (data_info.data_dog_id = dog.person_id)
-> Hash Join (cost=118008.38..10526873.30 rows=102144928 width=69) (actual time=1945.755..207642.046 rows=102020209 loops=1)
Hash Cond: (data_info.acl_arrow_id = acl_dart.arrow_id)
-> Hash Join (cost=78672.26..8189276.30 rows=102144928 width=72) (actual time=1325.078..153055.303 rows=102020209 loops=1)
Hash Cond: (data_info.true_arrow_id = true_dart.arrow_id)
-> Hash Join (cost=39336.13..5851679.29 rows=102144928 width=79) (actual time=706.719..96335.462 rows=102020209 loops=1)
Hash Cond: (data_info.arrow_id = dart_info.arrow_id)
-> Seq Scan on bird_data_info data_info (cost=0.00..3514082.28 rows=102144928 width=53) (actual time=0.029..19491.161 rows=102020209 loops=1)
-> Hash (cost=26584.39..26584.39 rows=1020139 width=40) (actual time=706.111..706.111 rows=1020208 loops=1)
Buckets: 131072 Batches: 1 Memory Usage: 71880kB
-> Seq Scan on bird_dart_info dart_info (cost=0.00..26584.39 rows=1020139 width=40) (actual time=0.007..255.729 rows=1020208 loops=1)
-> Hash (cost=26584.39..26584.39 rows=1020139 width=7) (actual time=617.809..617.809 rows=1020208 loops=1)
Buckets: 131072 Batches: 1 Memory Usage: 38895kB
-> Seq Scan on bird_dart_info true_dart (cost=0.00..26584.39 rows=1020139 width=7) (actual time=0.007..252.377 rows=1020208 loops=1)
-> Hash (cost=26584.39..26584.39 rows=1020139 width=7) (actual time=620.401..620.401 rows=1020208 loops=1)
Buckets: 131072 Batches: 1 Memory Usage: 38895kB
-> Seq Scan on bird_dart_info acl_dart (cost=0.00..26584.39 rows=1020139 width=7) (actual time=0.011..251.752 rows=1020208 loops=1)
-> Hash (cost=1.29..1.29 rows=23 width=4) (actual time=0.013..0.013 rows=23 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on bird_person_bike dog (cost=0.00..1.29 rows=23 width=4) (actual time=0.002..0.007 rows=23 loops=1)
Filter: (is_primary = 1)
-> Hash (cost=1.29..1.29 rows=23 width=4) (actual time=0.016..0.016 rows=23 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on bird_person_bike patron (cost=0.00..1.29 rows=23 width=4) (actual time=0.004..0.012 rows=23 loops=1)
Filter: (is_primary = 1)
-> Hash (cost=4.70..4.70 rows=170 width=20) (actual time=0.098..0.098 rows=170 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on bird_data_type (cost=0.00..4.70 rows=170 width=20) (actual time=0.006..0.038 rows=170 loops=1)
-> Hash (cost=1.25..1.25 rows=25 width=4) (actual time=0.014..0.014 rows=25 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on bird_sheet (cost=0.00..1.25 rows=25 width=4) (actual time=0.003..0.007 rows=25 loops=1)
-> Hash (cost=3390.77..3390.77 rows=2670 width=38) (actual time=0.033..0.033 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Index Scan using idx_0_data_silk on bird_data_silk (cost=0.57..3390.77 rows=2670 width=38) (actual time=0.030..0.031 rows=1 loops=1)
Index Cond: ((data_path)::text = 'bdd1_vault1'::text)
-> Sort (cost=1.81..1.87 rows=23 width=4) (actual time=0.046..0.050 rows=23 loops=1)
Sort Key: bird_person_bike.person_id
Sort Method: quicksort Memory: 26kB
-> Seq Scan on bird_person_bike (cost=0.00..1.29 rows=23 width=4) (actual time=0.007..0.015 rows=23 loops=1)
Filter: (is_primary = 1)
Planning time: 13.844 ms
Execution time: 365059.322 ms
(77 rows)

Simplified view, useless joines removed by hand
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=13.36..10692.27 rows=890 width=63) (actual time=0.928..0.938 rows=1 loops=1)
Hash Cond: (data_info.sheet_id = bird_sheet.sheet_id)
-> Hash Join (cost=11.79..10678.47 rows=890 width=67) (actual time=0.904..0.914 rows=1 loops=1)
Hash Cond: ((data_info.data_type_id)::text = (bird_data_type.data_type_id)::text)
-> Hash Join (cost=4.97..10659.41 rows=890 width=81) (actual time=0.805..0.815 rows=1 loops=1)
Hash Cond: (data_info.patron_id = patron.person_id)
-> Hash Join (cost=3.39..10645.60 rows=890 width=85) (actual time=0.783..0.793 rows=1 loops=1)
Hash Cond: (data_info.data_dog_id = dog.person_id)
-> Nested Loop (cost=1.82..10631.79 rows=890 width=89) (actual time=0.765..0.775 rows=1 loops=1)
-> Nested Loop (cost=1.39..10226.17 rows=890 width=63) (actual time=0.753..0.761 rows=1 loops=1)
-> Nested Loop (cost=0.82..2586.39 rows=890 width=34) (actual time=0.736..0.743 rows=1 loops=1)
-> Nested Loop (cost=0.26..17.93 rows=1 width=4) (actual time=0.688..0.695 rows=1 loops=1)
Join Filter: (bird_rsrc_physical.lock_rock_person_id = lock_info.rock_person_id)
Rows Removed by Join Filter: 1
-> Nested Loop (cost=0.26..16.89 rows=1 width=8) (actual time=0.685..0.691 rows=1 loops=1)
Join Filter: (bird_rsrc_physical.rock_person_id = locn_info.rock_person_id)
-> Nested Loop (cost=0.13..16.21 rows=1 width=16) (actual time=0.028..0.034 rows=1 loops=1)
Join Filter: (bird_rsrc_physical.rock_person_id = bird_person_bike.person_id)
Rows Removed by Join Filter: 22
-> Nested Loop (cost=0.13..14.63 rows=1 width=12) (actual time=0.017..0.021 rows=1 loops=1)
Join Filter: (bird_rsrc_physical.rsrc_class_id = bird_rsrc_class.rsrc_class_id)
Rows Removed by Join Filter: 7
-> Nested Loop (cost=0.13..13.45 rows=1 width=16) (actual time=0.012..0.016 rows=1 loops=1)
Join Filter: ((bird_rsrc_physical.rsrc_type_id)::text = (bird_rsrc_type.rsrc_type_id)::text)
Rows Removed by Join Filter: 11
-> Index Scan using pkey_rsrc_physical on bird_rsrc_physical (cost=0.13..12.18 rows=1 width=24) (actual time=0.005..0.006 rows=1 loops=1)
Filter: ((rsrc_name)::text = 'data'::text)
Rows Removed by Filter: 2
-> Seq Scan on bird_rsrc_type (cost=0.00..1.12 rows=12 width=8) (actual time=0.002..0.003 rows=12 loops=1)
-> Seq Scan on bird_rsrc_class (cost=0.00..1.08 rows=8 width=4) (actual time=0.002..0.003 rows=8 loops=1)
-> Seq Scan on bird_person_bike (cost=0.00..1.29 rows=23 width=4) (actual time=0.002..0.008 rows=23 loops=1)
Filter: (is_primary = 1)
-> Index Only Scan using pkey_locn_info on bird_locn_info locn_info (cost=0.13..0.67 rows=1 width=4) (actual time=0.651..0.651 rows=1 loops=1)
Index Cond: (rock_person_id = bird_person_bike.person_id)
Heap Fetches: 1
-> Seq Scan on bird_locn_info lock_info (cost=0.00..1.02 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1)
-> Index Scan using ukey_2_data_silk on bird_data_silk (cost=0.57..2555.11 rows=1335 width=38) (actual time=0.044..0.044 rows=1 loops=1)
Index Cond: ((rsrc_id = bird_rsrc_physical.phy_rsrc_id) AND ((data_path)::text = 'bdd1_vault1'::text))
-> Index Scan using idx_0_data_info on bird_data_info data_info (cost=0.57..8.57 rows=1 width=43) (actual time=0.015..0.016 rows=1 loops=1)
Index Cond: (true_data_id = bird_data_silk.data_id)
-> Index Scan using pkey_dart_info on bird_dart_info dart_info (cost=0.42..0.45 rows=1 width=40) (actual time=0.010..0.010 rows=1 loops=1)
Index Cond: (arrow_id = data_info.arrow_id)
-> Hash (cost=1.29..1.29 rows=23 width=4) (actual time=0.014..0.014 rows=23 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on bird_person_bike dog (cost=0.00..1.29 rows=23 width=4) (actual time=0.002..0.004 rows=23 loops=1)
Filter: (is_primary = 1)
-> Hash (cost=1.29..1.29 rows=23 width=4) (actual time=0.017..0.017 rows=23 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on bird_person_bike patron (cost=0.00..1.29 rows=23 width=4) (actual time=0.005..0.012 rows=23 loops=1)
Filter: (is_primary = 1)
-> Hash (cost=4.70..4.70 rows=170 width=20) (actual time=0.093..0.093 rows=170 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on bird_data_type (cost=0.00..4.70 rows=170 width=20) (actual time=0.005..0.042 rows=170 loops=1)
-> Hash (cost=1.25..1.25 rows=25 width=4) (actual time=0.016..0.016 rows=25 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on bird_sheet (cost=0.00..1.25 rows=25 width=4) (actual time=0.003..0.006 rows=25 loops=1)
Planning time: 9.906 ms
Execution time: 1.129 ms
(58 rows)

Like th original view, but with the two unused joins changed to left join
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=42776.96..8431442.43 rows=890 width=63) (actual time=57764.609..209119.185 rows=1 loops=1)
Merge Cond: (locn_info.rock_person_id = bird_rsrc_physical.rock_person_id)
-> Sort (cost=1.03..1.03 rows=2 width=4) (actual time=0.013..0.014 rows=2 loops=1)
Sort Key: locn_info.rock_person_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on bird_locn_info locn_info (cost=0.00..1.02 rows=2 width=4) (actual time=0.003..0.003 rows=2 loops=1)
-> Materialize (cost=42775.93..10104638.94 rows=890 width=71) (actual time=57764.591..209119.164 rows=1 loops=1)
-> Merge Join (cost=42775.93..10104636.72 rows=890 width=71) (actual time=57764.587..209119.160 rows=1 loops=1)
Merge Cond: (bird_rsrc_physical.rock_person_id = bird_person_bike.person_id)
-> Nested Loop (cost=42771.94..12111567.57 rows=890 width=67) (actual time=57764.536..209119.107 rows=1 loops=1)
Join Filter: (bird_data_silk.rsrc_id = bird_rsrc_physical.phy_rsrc_id)
-> Nested Loop (cost=0.13..15.68 rows=1 width=8) (actual time=0.025..0.035 rows=1 loops=1)
Join Filter: (bird_rsrc_physical.lock_rock_person_id = lock_info.rock_person_id)
Rows Removed by Join Filter: 1
-> Nested Loop (cost=0.13..14.63 rows=1 width=12) (actual time=0.022..0.030 rows=1 loops=1)
Join Filter: (bird_rsrc_physical.rsrc_class_id = bird_rsrc_class.rsrc_class_id)
Rows Removed by Join Filter: 7
-> Nested Loop (cost=0.13..13.45 rows=1 width=16) (actual time=0.017..0.025 rows=1 loops=1)
Join Filter: ((bird_rsrc_physical.rsrc_type_id)::text = (bird_rsrc_type.rsrc_type_id)::text)
Rows Removed by Join Filter: 11
-> Index Scan using fidx_1_rsrc_physical on bird_rsrc_physical (cost=0.13..12.18 rows=1 width=24) (actual time=0.009..0.012 rows=1 loops=1)
Filter: ((rsrc_name)::text = 'data'::text)
Rows Removed by Filter: 2
-> Seq Scan on bird_rsrc_type (cost=0.00..1.12 rows=12 width=8) (actual time=0.002..0.002 rows=12 loops=1)
-> Seq Scan on bird_rsrc_class (cost=0.00..1.08 rows=8 width=4) (actual time=0.002..0.003 rows=8 loops=1)
-> Seq Scan on bird_locn_info lock_info (cost=0.00..1.02 rows=2 width=4) (actual time=0.001..0.001 rows=2 loops=1)
-> Hash Join (cost=42771.81..12111518.51 rows=2670 width=67) (actual time=57764.506..209119.066 rows=1 loops=1)
Hash Cond: (data_info.true_data_id = bird_data_silk.data_id)
-> Hash Join (cost=39347.67..11469661.87 rows=102144928 width=43) (actual time=719.142..193957.314 rows=102020209 loops=1)
Hash Cond: (data_info.arrow_id = dart_info.arrow_id)
-> Hash Join (cost=11.54..9132064.86 rows=102144928 width=17) (actual time=0.172..143616.386 rows=102020209 loops=1)
Hash Cond: (data_info.sheet_id = bird_sheet.sheet_id)
-> Hash Join (cost=9.98..7727570.54 rows=102144928 width=21) (actual time=0.154..115782.879 rows=102020209 loops=1)
Hash Cond: ((data_info.data_type_id)::text = (bird_data_type.data_type_id)::text)
-> Hash Join (cost=3.15..6323070.95 rows=102144928 width=35) (actual time=0.052..81218.795 rows=102020209 loops=1)
Hash Cond: (data_info.patron_id = patron.person_id)
-> Hash Join (cost=1.58..4918576.62 rows=102144928 width=39) (actual time=0.030..54639.659 rows=102020209 loops=1)
Hash Cond: (data_info.data_dog_id = dog.person_id)
-> Seq Scan on bird_data_info data_info (cost=0.00..3514082.28 rows=102144928 width=53) (actual time=0.010..18264.383 rows=102020209 loops=1)
-> Hash (cost=1.29..1.29 rows=23 width=4) (actual time=0.015..0.015 rows=23 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on bird_person_bike dog (cost=0.00..1.29 rows=23 width=4) (actual time=0.002..0.010 rows=23 loops=1)
Filter: (is_primary = 1)
-> Hash (cost=1.29..1.29 rows=23 width=4) (actual time=0.017..0.017 rows=23 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on bird_person_bike patron (cost=0.00..1.29 rows=23 width=4) (actual time=0.004..0.013 rows=23 loops=1)
Filter: (is_primary = 1)
-> Hash (cost=4.70..4.70 rows=170 width=20) (actual time=0.096..0.096 rows=170 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on bird_data_type (cost=0.00..4.70 rows=170 width=20) (actual time=0.005..0.041 rows=170 loops=1)
-> Hash (cost=1.25..1.25 rows=25 width=4) (actual time=0.013..0.013 rows=25 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Seq Scan on bird_sheet (cost=0.00..1.25 rows=25 width=4) (actual time=0.003..0.007 rows=25 loops=1)
-> Hash (cost=26584.39..26584.39 rows=1020139 width=40) (actual time=718.449..718.449 rows=1020208 loops=1)
Buckets: 131072 Batches: 1 Memory Usage: 71880kB
-> Seq Scan on bird_dart_info dart_info (cost=0.00..26584.39 rows=1020139 width=40) (actual time=0.006..265.742 rows=1020208 loops=1)
-> Hash (cost=3390.77..3390.77 rows=2670 width=38) (actual time=0.041..0.041 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 1kB
-> Index Scan using idx_0_data_silk on bird_data_silk (cost=0.57..3390.77 rows=2670 width=38) (actual time=0.038..0.038 rows=1 loops=1)
Index Cond: ((data_path)::text = 'bdd1_vault1'::text)
-> Sort (cost=1.81..1.87 rows=23 width=4) (actual time=0.045..0.046 rows=23 loops=1)
Sort Key: bird_person_bike.person_id
Sort Method: quicksort Memory: 26kB
-> Seq Scan on bird_person_bike (cost=0.00..1.29 rows=23 width=4) (actual time=0.005..0.007 rows=23 loops=1)
Filter: (is_primary = 1)
Planning time: 10.115 ms
Execution time: 209119.417 ms
(67 rows)

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2016-07-05 16:47:38 Re: Materialized view not created with import
Previous Message Tom Lane 2016-07-05 15:24:17 Re: Avoid deadlocks on alter table