Quick Links

Re: Perfomance of IN-clause with many elements and possible solutions

From:	"David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Dmitry Lazurkin <dilaz03(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Perfomance of IN-clause with many elements and possible solutions
Date:	2017-07-25 02:58:18
Message-ID:	CAKFQuwZXN8wOfYfKK=8pt4VUnXur0bGJH9uh2216eLQ6W9gFzw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Mon, Jul 24, 2017 at 3:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
> The cost to form the inner hash is basically negligible whether it's
> de-duped or not, but if it's not (known) de-duped then the cost
> estimate for the semijoin is going to rise some, and that discourages
> selecting it.
>

Why does the "hash semi join" care about duplication of values on the
inner relation? Doesn't it only care whether a given bucket exists
irrespective of its contents?

Looking at those explains it would seem the "hash semi join" is simply an
inherently more expensive to execute compared to a "hash join" and that the
act of de-duping the inner relation would have to be quite expensive to
overcome the gap. I cannot reconcile this with the previous paragraph
though...

Pointing me to the readme or code file (comments) that explains this in
more detail would be welcome. Not sure what to grep for - "Hash Semi Join"
only turns up a couple of expected output results...

Thx.

David J.

In response to

Re: Perfomance of IN-clause with many elements and possible solutions at 2017-07-24 22:46:44 from Tom Lane

Responses

Re: Perfomance of IN-clause with many elements and possible solutions at 2017-07-25 03:03:24 from David G. Johnston
Re: Perfomance of IN-clause with many elements and possible solutions at 2017-07-25 03:11:09 from Tom Lane

Browse pgsql-general by date

	From	Date	Subject
Next Message	David G. Johnston	2017-07-25 03:03:24	Re: Perfomance of IN-clause with many elements and possible solutions
Previous Message	Jeff Janes	2017-07-25 02:50:07	Re: Perfomance of IN-clause with many elements and possible solutions