Quick Links

Re: bad estimation together with large work_mem generates terrible slow hash joins

From:	Tomas Vondra <tv(at)fuzzy(dot)cz>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: bad estimation together with large work_mem generates terrible slow hash joins
Date:	2014-09-12 22:07:39
Message-ID:	54136EAB.10805@fuzzy.cz
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 12.9.2014 23:22, Robert Haas wrote:
> On Fri, Sep 12, 2014 at 4:55 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>>> I'm actually quite surprised that you find batching to be a
>>> better strategy than skimping on buckets, because I would have
>>> expect the opposite, almost categorically. Batching means having
>>> to write out the tuples we can't process right away and read them
>>> back in. If that involves disk I/O, I think the cost of that I/O
>>> is going to be FAR more than the overhead we would have incurred
>>> by searching slightly longer bucket chains. If it didn't, then
>>> you could've set work_mem higher and avoided batching in the
>>> first place.
>>
>> No, I don't find batching to be a better strategy. I just think
>> this really needs more discussion than a simple "let's use
>> NTUP_PER_BUCKET=4 to avoid batching" follow-up patch.
>>
>> For example, let's say we switch to NTUP_PER_BUCKET=4 to avoid
>> batching, and then discover we need to start batching anyway.
>> Should we keep the settings, or should we revert NTUP_PER_BUCKET=1?
>> Or maybe not doing that for nbatch=2, but for nbatch=16?
>
> My first thought is to revert to NTUP_PER_BUCKET=1, but it's
> certainly arguable. Either method, though, figures to be better than
> doing nothing, so let's do something.

OK, but can we commit the remaining part first? Because it'll certainly
interact with this proposed part, and it's easier to tweak when the code
is already there. Instead of rebasing over and over.

Tomas

In response to

Re: bad estimation together with large work_mem generates terrible slow hash joins at 2014-09-12 21:22:58 from Robert Haas

Responses

Re: bad estimation together with large work_mem generates terrible slow hash joins at 2014-10-02 00:20:44 from Kevin Grittner

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bruce Momjian	2014-09-13 00:44:47	formatting.c
Previous Message	Heikki Linnakangas	2014-09-12 21:38:34	Re: gist vacuum seaq access