From: | "Sfiligoi, Igor" <Igor(dot)Sfiligoi(at)ga(dot)com> |
---|---|
To: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins |
Date: | 2016-07-01 20:28:48 |
Message-ID: | c66796bcbeaa4eef87948ad9e9d5101a@ASGEXCPWP06.ga.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Sorry... the example was incomplete.
All the fields are defined as not-null.
So it is guaranteed to always match the join.
And PostgreSQL release notes claim that PGSQL can do at least partial join removal:
https://wiki.postgresql.org/wiki/What's_new_in_PostgreSQL_9.0#Join_Removal
I was hoping this use case would fit in.
Any suggestions?
Igor
-----Original Message-----
From: Merlin Moncure [mailto:mmoncure(at)gmail(dot)com]
Sent: Friday, July 01, 2016 12:42 PM
To: Sfiligoi, Igor <Igor(dot)Sfiligoi(at)ga(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: --EXTERNAL--Re: [GENERAL] PSQL does not remove obvious useless joins
On Fri, Jul 1, 2016 at 12:17 PM, Sfiligoi, Igor <Igor(dot)Sfiligoi(at)ga(dot)com> wrote:
> Hello.
>
> We have a view that is very generic, and we noticed that PostgreSQL is
> not very good at removing useless joins, which makes our queries very slow.
>
> We could change our code to avoid the view and write ad-hoc queries to
> the underlying tables, but would prefer not to, if there is a way around it.
>
> (BTW: We are currently using psql 9.4)
>
> Here is a simplified implementation:
>
> # create table a (id int primary key, name varchar(128));
>
> # create table b (id int primary key, name varchar(128));
>
> # create table c (id int primary key, a_id int references a(id), b1_id
> int references b(id), b2_id int references b(id), b3_id int references
> b(id));
>
> # create view v as select c.id, c.a_id, c.b1_id, c.b2_id , c.b3_id,
> a.name a_name, b1.name b1_name, b2.name b2_name, b3.name b3_name from
> c, a, b b1, b b2, b b3 where c.a_id=a.id and c.b1_id=b1.id and
> c.b2_id=b2.id and c.b3_id=b3.id;
>
> When I try to get just info from tables c and b1:
>
> # select id, b1_name from v
>
> it still does all the joins (see below).
>
> I would expect just one join (due to the request of columns from the
> two tables),
>
> since all joins are on foreign constrains referencing primary keys,
>
> there are no filters on the other tables, so it is guaranteed that the
> useless joins will always return exactly one answer.
I think what you're asking for is a lot more complex than it sounds, and incorrect. The precise state of the data influences how many records come back (in this case, either 1 or 0), for example if b3_id is null you get zero rows. More to the point, you *instructed* the server to make the join. There are strategies to make joins 'optional' at run time with respect to a query, but they are more complicated than simply withdrawing columns from the select list.
Stepping back a bit, the query needs to be planned before peeking at the data in the tables. The planner is able to make assumptions against a statistical picture of the data but shouldn't be expected to actually inspect precise result data in order to generate a better plan.
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Sfiligoi, Igor | 2016-07-01 20:33:28 | Re: --EXTERNAL--Re: PSQL does not remove obvious useless joins |
Previous Message | Kevin Grittner | 2016-07-01 20:28:37 | Re: PSQL does not remove obvious useless joins |