From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Removing unneeded self joins |
Date: | 2018-05-16 16:26:48 |
Message-ID: | CA+TgmoYuH_Tkdv7_fxOinJLf7gw6NyYq1Lm=gevq8aoj507g4A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, May 16, 2018 at 12:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru> writes:
>> There is a join optimization we don't do -- removing inner join of a
>> table with itself on a unique column. Such joins are generated by
>> various ORMs, so from time to time our customers ask us to look into
>> this. Most recently, it was discussed on the list in relation to an
>> article comparing the optimizations that some DBMS make [1].
>
> This is the sort of thing that I always wonder why the customers don't
> ask the ORM to stop generating such damfool queries. Its *expensive*
> for us to clean up after their stupidity; almost certainly, it would
> take far fewer cycles, net, for them to be a bit smarter in the first
> place.
The trouble, of course, is that the customer didn't write the ORM,
likely has no idea how it works, and doesn't want to run a modified
version of it even if they do. If the queries run faster on other
systems than they do on PostgreSQL, we get dinged -- not unjustly.
Also, I'm not sure that I believe that it's always easy to avoid
generating such queries. I mean, this case is trivial so it's easy to
say, well, just rewrite the query. But suppose that I have a fact
table over which I've created two views, each of which performs
various joins between the fact table and various lookup tables. My
queries are such that I normally need the joins in just one of these
two views and not the other to fetch the information I care about.
But every once in a while I need to run a report that involves pulling
every column possible. The obvious solution is to join the views on
the underlying table's primary key, but then you get this problem. Of
course there's a workaround: define a third view that does both sets
of joins-to-lookup-tables. But that starts to feel like you're
handholding the database; surely it's the database's job to optimize
queries, not the user's.
It's been about 10 years since I worked as a web developer, but I do
remember hitting this kind of problem from time to time and I'd really
like to see us do something about it. I wish we could optimize away
inner joins, too, for similar reasons.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Magnus Hagander | 2018-05-16 16:28:43 | Re: Postgres 11 release notes |
Previous Message | Alvaro Herrera | 2018-05-16 16:24:57 | Re: Postgres 11 release notes |