Re: PSQL does not remove obvious useless joins

From: Kevin Grittner <kgrittn(at)gmail(dot)com>
To: "Sfiligoi, Igor" <Igor(dot)Sfiligoi(at)ga(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: PSQL does not remove obvious useless joins
Date: 2016-07-01 20:28:37
Message-ID: CACjxUsMcOqEw-JAV0jugtVNMxcKVj+q5xAqfXn6KF1pzR0oAmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Jul 1, 2016 at 12:17 PM, Sfiligoi, Igor <Igor(dot)Sfiligoi(at)ga(dot)com> wrote:

> We have a view that is very generic, and we noticed that PostgreSQL is not
> very good at removing useless joins, which makes our queries very slow.
>
> We could change our code to avoid the view and write ad-hoc queries to the
> underlying tables, but would prefer not to, if there is a way around it.

If it did not do the joins it could not provide the information you
are actually asking to see. Of course, there is a very good chance
that what you are asking to see is not what you *want* to see.

test=# create table a (id int primary key, name varchar(128));
CREATE TABLE
test=# create table b (id int primary key, name varchar(128));
CREATE TABLE
test=# create table c (id int primary key,
test(# a_id int references a(id),
test(# b1_id int references b(id),
test(# b2_id int references b(id),
test(# b3_id int references b(id));
CREATE TABLE
test=#
test=# create view v_broken as
test-# select
test-# c.id, c.a_id, c.b1_id, c.b2_id, c.b3_id, a.name a_name,
test-# b1.name b1_name, b2.name b2_name, b3.name b3_name
test-# from c, a, b b1, b b2, b b3
test-# where c.a_id=a.id and c.b1_id=b1.id and c.b2_id=b2.id and
c.b3_id=b3.id;
CREATE VIEW
test=#
test=# create view v as
test-# select
test-# c.id, c.a_id, c.b1_id, c.b2_id, c.b3_id, a.name a_name,
test-# b1.name b1_name, b2.name b2_name, b3.name b3_name
test-# from c
test-# left join a on a.id = c.a_id
test-# left join b b1 on b1.id = c.b1_id
test-# left join b b2 on b2.id = c.b2_id
test-# left join b b3 on b3.id = c.b3_id;
CREATE VIEW
test=#
test=# insert into a values (1, 'a1');
INSERT 0 1
test=# insert into b values (1, 'b1'), (2, 'b2'), (3, 'b3');
INSERT 0 3
test=# insert into c values (1, 1, 1, 2, 3), (2, 1, 1, 2, null);
INSERT 0 2
test=#
test=# select id, b1_name from v_broken;
id | b1_name
----+---------
1 | b1
(1 row)

test=# explain analyze select id, b1_name from v_broken;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=63.40..183.90 rows=1700 width=278) (actual
time=0.049..0.052 rows=1 loops=1)
Hash Cond: (c.b3_id = b3.id)
-> Hash Join (cost=47.55..144.68 rows=1700 width=282) (actual
time=0.030..0.033 rows=2 loops=1)
Hash Cond: (c.b2_id = b2.id)
-> Hash Join (cost=31.70..105.45 rows=1700 width=286)
(actual time=0.018..0.020 rows=2 loops=1)
Hash Cond: (c.b1_id = b1.id)
-> Hash Join (cost=15.85..66.22 rows=1700 width=16)
(actual time=0.010..0.012 rows=2 loops=1)
Hash Cond: (c.a_id = a.id)
-> Seq Scan on c (cost=0.00..27.00 rows=1700
width=20) (actual time=0.001..0.001 rows=2 loops=1)
-> Hash (cost=12.60..12.60 rows=260 width=4)
(actual time=0.003..0.003 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on a (cost=0.00..12.60
rows=260 width=4) (actual time=0.002..0.003 rows=1 loops=1)
-> Hash (cost=12.60..12.60 rows=260 width=278)
(actual time=0.005..0.005 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on b b1 (cost=0.00..12.60 rows=260
width=278) (actual time=0.002..0.003 rows=3 loops=1)
-> Hash (cost=12.60..12.60 rows=260 width=4) (actual
time=0.006..0.006 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on b b2 (cost=0.00..12.60 rows=260
width=4) (actual time=0.004..0.004 rows=3 loops=1)
-> Hash (cost=12.60..12.60 rows=260 width=4) (actual
time=0.013..0.013 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on b b3 (cost=0.00..12.60 rows=260 width=4)
(actual time=0.008..0.009 rows=3 loops=1)
Planning time: 0.729 ms
Execution time: 0.153 ms
(23 rows)

test=# select id, b1_name from v;
id | b1_name
----+---------
1 | b1
2 | b1
(2 rows)

test=# explain analyze select id, b1_name from v;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=15.85..66.22 rows=1700 width=278) (actual
time=0.017..0.018 rows=2 loops=1)
Hash Cond: (c.b1_id = b1.id)
-> Seq Scan on c (cost=0.00..27.00 rows=1700 width=20) (actual
time=0.005..0.005 rows=2 loops=1)
-> Hash (cost=12.60..12.60 rows=260 width=278) (actual
time=0.006..0.006 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on b b1 (cost=0.00..12.60 rows=260 width=278)
(actual time=0.002..0.003 rows=3 loops=1)
Planning time: 0.177 ms
Execution time: 0.044 ms
(8 rows)

Note the difference in results using inner joins versus left outer joins.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Sfiligoi, Igor 2016-07-01 20:28:48 Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins
Previous Message Thomas Kellerer 2016-07-01 20:12:46 Re: 9.6 beta2 win-x64 download links still point to beta1