Re: Reference to - BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andrei Lepikhov <lepihov(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Craig Milhiser <craig(at)milhiser(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Reference to - BUG #18349: ERROR: invalid DSA memory alloc request size 1811939328, CONTEXT: parallel worker
Date: 2024-10-16 09:19:14
Message-ID: CA+hUKGKOoxiGBqv=RA-e8=LXNhdmRCC_rOYbpaqbL16=4wvJ3A@mail.gmail.com
Lists: pgsql-bugs

On Mon, Oct 14, 2024 at 10:16 PM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> On 10/14/24 13:26, Tom Lane wrote:
> > Interesting point. If memory serves (I'm too tired to actually look)
> > the planner considers the statistical most-common-value when
> > estimating whether an unsplittable hash bucket is likely to be too
> > big. It does *not* think about null values ... but it ought to.

Right, there might be something to think about there. There might
also be an opportunity to treat NULL-key tuples specially during
execution since they can't possibly match.
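As a rough illustration of that idea (not taken from the attached patch
or from PostgreSQL's source), NULL-key inner tuples could be routed away
from the hash table before they ever land in a batch. Everything named
below is hypothetical, and the sketch assumes the join operator is
strict, so a NULL key can never satisfy the join condition:

```c
#include <stdbool.h>

/* Hypothetical routing decision for one inner-side tuple. */
typedef enum SketchRoute
{
    SKETCH_ROUTE_HASH,      /* insert into the hash table as usual */
    SKETCH_ROUTE_DISCARD,   /* can never match and is never emitted */
    SKETCH_ROUTE_UNMATCHED  /* can never match, but must still be emitted
                             * null-extended (e.g. right/full joins) */
} SketchRoute;

/*
 * Assumption: the join operator is strict, so a NULL key never matches.
 * emit_unmatched_inner says whether the join type has to return inner
 * tuples that found no partner.
 */
SketchRoute
sketch_route_inner_tuple(bool key_is_null, bool emit_unmatched_inner)
{
    if (!key_is_null)
        return SKETCH_ROUTE_HASH;
    return emit_unmatched_inner ? SKETCH_ROUTE_UNMATCHED
                                : SKETCH_ROUTE_DISCARD;
}
```

Under those assumptions the NULL-key tuples never pile up in batch 0, so
they could not trigger the runaway repartitioning in the first place.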

> As I see it, it is just an oversight in the resizing logic: batch 0
> doesn't change the estimated_size value at all - I think because it
> doesn't matter for this batch - it can't be treated as exhausted by
> definition. Because of that, parallel HashJoin doesn't detect extreme
> skew, caused by duplicates in this batch. NULLS is just our luck - they
> correspond to hash value 0 and fall into this batch.
> See the attachment for a sketch of the solution.

Thanks Andrei, I mostly agree with your analysis, but I came up with a
slightly different patch. I think we should check for extreme skew only
if the parent partition ran out of memory, that is, if
old_batch->space_exhausted is set. Your sketch always does it for batch 0,
which works for these examples, but I don't think it's strictly correct:
if batch 0 didn't run out of memory, it might falsely report extreme skew
just because it had (say) 0 or 1 tuples.
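
To make the difference concrete, here is a minimal standalone model of
the check. It is a sketch, not the attached patch: SketchBatch and
sketch_extreme_skew are hypothetical names, and it assumes that doubling
the batch count sends every tuple of parent batch p to either batch p or
batch p + old_nbatch. Skew is reported only when a parent that exhausted
its memory handed all of its tuples to a single child:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a batch's shared state. */
typedef struct SketchBatch
{
    size_t ntuples;         /* tuples held after repartitioning */
    size_t old_ntuples;     /* tuples held before repartitioning
                             * (meaningful for parent slots only) */
    bool   space_exhausted; /* ran out of memory before the split? */
} SketchBatch;

/*
 * batches[] has 2 * old_nbatch entries.  Entry i descends from parent
 * i % old_nbatch, and a parent keeps its own slot, so batches[p] carries
 * both the parent's old count and the first child's new count.
 */
bool
sketch_extreme_skew(const SketchBatch *batches, int old_nbatch)
{
    assert(old_nbatch > 0);

    for (int i = 0; i < old_nbatch * 2; i++)
    {
        const SketchBatch *parent = &batches[i % old_nbatch];

        /* Repartitioning did not help only if the parent was actually
         * too big and one child still holds every one of its tuples. */
        if (parent->space_exhausted &&
            batches[i].ntuples == parent->old_ntuples)
            return true;
    }
    return false;
}
```

With one old batch holding 1000 duplicate keys that all stay in batch 0,
this reports skew. With a batch 0 that holds a single tuple and never ran
out of memory, it does not, even though the tuple count is unchanged by
the split.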

Attachment Content-Type Size
0001-Fix-extreme-skew-detection-in-Parallel-Hash-Join.patch text/x-patch 3.5 KB
