Re: BUG #18514: Encountering an error invalid DSA memory alloc request size 1811939328 when executing script

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: iwebcas(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: Re: BUG #18514: Encountering an error invalid DSA memory alloc request size 1811939328 when executing script
Date: 2024-06-18 02:03:31
Message-ID: CA+hUKGJRp4jezAiLvb0vROjPNWpoaP9zqS8_bcqixgE-FFRKsA@mail.gmail.com
Lists: pgsql-bugs

On Tue, Jun 18, 2024 at 8:22 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Probably as a result of switching from custom to generic plan.
> There's not anything exciting about that, but we would like to
> get to the bottom of the DSA allocation failure.

Parallel Hash Right Join is new in PostgreSQL 16 (commit 11c2d6fd, whose
message unfortunately focuses on JOIN_FULL but also covers JOIN_RIGHT). I
suspect that the allocation failure might be coming from trying to
allocate a huge array of batches in
ExecParallelHashJoinSetUpBatches(), so large that it exceeds the 1GB
allocation cap. I don't think it's the bucket array, because we avoid
exceeding the cap there. The only other thing it could be is a
massive tuple, which seems unlikely. If that's right, then the
question is: what is it about JOIN_RIGHT that is producing very large
nbatch?
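
To put rough numbers on that suspicion, here is a back-of-the-envelope sketch in Python (not PostgreSQL code). It assumes the cap is MaxAllocSize (0x3fffffff) and that nbatch is a power of two; the helper names are made up for illustration, and the per-batch sizes it prints are only what the arithmetic would imply if the failed request really were the batch array. The real per-batch size depends on the number of participants and is not known from this report.

```python
# Back-of-the-envelope check only; helper names are hypothetical.
MAX_ALLOC_SIZE = 0x3FFFFFFF   # assumed cap: PostgreSQL's MaxAllocSize (1 GB - 1)
REQUEST_SIZE = 1811939328     # taken from the error message in this report


def implied_batch_sizes(request_size, min_log2=10, max_log2=26):
    """Map each power-of-two nbatch that divides request_size exactly
    to the per-batch size in bytes that it would imply."""
    sizes = {}
    for k in range(min_log2, max_log2 + 1):
        nbatch = 1 << k
        if request_size % nbatch == 0:
            sizes[nbatch] = request_size // nbatch
    return sizes


def max_batches_under_cap(bytes_per_batch, cap=MAX_ALLOC_SIZE):
    """Largest power-of-two nbatch whose array still fits under the cap."""
    nbatch = 1
    while nbatch * 2 * bytes_per_batch <= cap:
        nbatch *= 2
    return nbatch


if __name__ == "__main__":
    print("request exceeds cap:", REQUEST_SIZE > MAX_ALLOC_SIZE)
    for nbatch, size in implied_batch_sizes(REQUEST_SIZE).items():
        print(f"nbatch={nbatch:>9}  implied bytes per batch={size}")
```

Whatever the true per-batch size is, the request is about 1.69 GB, so an array of this shape would need millions of batches before it trips the cap, which is consistent with nbatch having grown very large.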

There was a similar report[1] on -hackers. One of the links to
depesz's EXPLAIN viewer shows Parallel Hash Right Join, and the report
includes very high numbers of batches. Hmm.

[1] https://www.postgresql.org/message-id/flat/CAG4TxrizOVnkYx1v1a7rv6G3t4fMoZP6vbZn3yPLgjHrg5ETbw%40mail.gmail.com
