Re: DBT-3 with SF=20 got failed

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: DBT-3 with SF=20 got failed
Date: 2015-09-08 18:44:06
Message-ID: CA+TgmoaWQyvKrm9jyYvZY=3mQVANJSBdmbwAHcpWjRwDHuG-Aw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 8, 2015 at 8:28 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
>> Hello KaiGai-san,
>>
>> I've discovered a bug in the proposed patch - when resetting the hash
>> table after the first batch, it does this:
>>
>> memset(hashtable->buckets, 0, sizeof(nbuckets * sizeof(HashJoinTuple)));
>>
>> The first 'sizeof' is bogus, so this only zeroes the first 8 bytes of
>> the array (usually resulting in crashes due to invalid pointers).
>>
>> I fixed it to
>>
>> memset(hashtable->buckets, 0, nbuckets * sizeof(HashJoinTuple));
>>
>> Fixed patch attached (marked as v2).
>>
> Thanks, it was my bug, but oversight.
>
> I want committer to push this fix.

I'm not in agreement with this fix, and would prefer to instead
proceed by limiting the initial allocation to 1GB. Otherwise, as
KaiGai has mentioned, we might end up trying to allocate completely
unreasonable amounts of memory if the planner gives a bad estimate.
Of course, it's true (as Tomas points out) that this issue already
exists today to some degree, and it's also true (as he also points
out) that 1GB is an arbitrary limit. But it's also true that we use
that same arbitrary 1GB limit in a lot of places, so it's hardly
without precedent.

More importantly, removing the cap on the allocation size makes the
problem a lot worse. You might be sad if a bad planner estimate
causes the planner to allocate 1GB when 64MB would have been enough,
but on modern systems it is not likely to be an enormous problem. If
a similar mis-estimation causes the planner to allocate 16GB rather
than 1GB, the opportunity for you to be sad is magnified pretty
considerably. Therefore, I don't really see the over-estimation bug
fix as being separate from this one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2015-09-08 18:53:46 Re: Hooking at standard_join_search (Was: Re: Foreign join pushdown vs EvalPlanQual)
Previous Message Petr Jelinek 2015-09-08 18:40:42 Re: Horizontal scalability/sharding