Re: Performance improvement for joins where outer side is unique

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance improvement for joins where outer side is unique
Date: 2017-01-27 18:44:28
Message-ID: CA+TgmoZ7+Xqaemra=6AG4XhnFTDm+X-BV7bAhCyJjotWfpnRJA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 27, 2017 at 1:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Jan 27, 2017 at 11:44 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I'm afraid though that we may have to do something about the
>>> irrelevant-joinquals issue in order for this to be of much real-world
>>> use for inner joins.
>
>> Maybe, but it's certainly not the case that all inner joins are highly
>> selective. There are plenty of inner joins in real-world applications
>> where the join product is 10% or 20% or 50% of the size of the larger
>> input.
>
> Um ... what's that got to do with the point at hand?

I thought it was directly relevant, but maybe I'm confused. Further
up in that email, you wrote: "If there's a unique index on t2.x, we'll
be able to mark the join inner-unique. However, short-circuiting
would only occur after finding a row that passes both joinquals. If
the y condition is true for only a few rows, this would pretty nearly
disable the optimization. Ideally we would short-circuit after
testing the x condition only, but there's no provision for that."

So I assumed from that that the issue was that you'd have to wait for
the first time the irrelevant-joinqual got satisfied before the
optimization kicked in. But, if the join is emitting lots of rows,
that'll happen pretty quickly. I mean, if the join emits even as many
20 rows, the time after the first one is, all things being equal, 95%
of the runtime of the join. There could certainly be bad cases where
it takes a long time to produce the first row, but I wouldn't say
that's a particularly common thing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-01-27 18:47:47 Re: pg_hba_file_settings view patch
Previous Message Tom Lane 2017-01-27 18:39:17 Re: Performance improvement for joins where outer side is unique