
Re: Perfomance of IN-clause with many elements and possible solutions

From:	"David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Dmitry Lazurkin <dilaz03(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Perfomance of IN-clause with many elements and possible solutions
Date:	2017-07-25 03:03:24
Message-ID:	CAKFQuwYCPBfqrPt6d97HXC8mGVr04ySAXsJc9BhVceVb+RWXGw@mail.gmail.com
Lists:	pgsql-general

On Mon, Jul 24, 2017 at 7:58 PM, David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> On Mon, Jul 24, 2017 at 3:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>>
>> The cost to form the inner hash is basically negligible whether it's
>> de-duped or not, but if it's not (known) de-duped then the cost
>> estimate for the semijoin is going to rise some, and that discourages
>> selecting it.
>>
>
> Why does the "hash semi join" care about duplication of values on the
> inner relation? Doesn't it only care whether a given bucket exists
> irrespective of its contents?
>

Rather, it cares about the contents insofar as it must confirm that at least
one of the tuples in the bucket indeed has the same joining value as the
outer relation tuple (I lost track of the fact that two values can share the
same hash). But once it finds one it can move on to the next outer relation
tuple, while an inner join would have to spend more time looking for
additional matches.
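
To make the difference concrete, here is a toy sketch in Python. It is not how PostgreSQL's executor is implemented; the function names, the fixed bucket count, and the comparison counter are all assumptions made for illustration. Both joins probe the same bucket, but the semi join stops at the first tuple whose value really equals the outer value, while the inner join has to walk the whole bucket.

```python
from collections import defaultdict


def build_buckets(inner, n_buckets=4):
    """Group inner values by hash bucket (hypothetical toy hash table)."""
    buckets = defaultdict(list)
    for value in inner:
        buckets[hash(value) % n_buckets].append(value)
    return buckets


def hash_semi_join(outer, inner, n_buckets=4):
    """Return outer values with at least one match, plus comparisons made."""
    buckets = build_buckets(inner, n_buckets)
    matched, comparisons = [], 0
    for o in outer:
        for i in buckets.get(hash(o) % n_buckets, []):
            comparisons += 1
            # Same bucket does not imply same value, so compare explicitly.
            if i == o:
                matched.append(o)
                break  # one match is enough; go to the next outer tuple
    return matched, comparisons


def hash_inner_join(outer, inner, n_buckets=4):
    """Return every (outer, inner) matching pair, plus comparisons made."""
    buckets = build_buckets(inner, n_buckets)
    pairs, comparisons = [], 0
    for o in outer:
        for i in buckets.get(hash(o) % n_buckets, []):
            comparisons += 1
            if i == o:
                pairs.append((o, i))  # keep scanning for further matches
    return pairs, comparisons


outer_rel = [1, 2, 5]
inner_rel = [1, 1, 1, 5, 9]  # duplicates, and 1, 5, 9 collide in one bucket

semi_rows, semi_cmp = hash_semi_join(outer_rel, inner_rel)
inner_rows, inner_cmp = hash_inner_join(outer_rel, inner_rel)
print(semi_rows, semi_cmp)
print(inner_rows, inner_cmp)
```

With duplicates on the inner side, the semi join still does less work per outer tuple than the inner join, but where in the bucket the first true match sits affects how many comparisons it needs.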

David J.

In response to

Re: Perfomance of IN-clause with many elements and possible solutions at 2017-07-25 02:58:18 from David G. Johnston

Responses

Re: Perfomance of IN-clause with many elements and possible solutions at 2017-07-31 18:03:32 from Jeff Janes
