Re: BUG #15041: dsa alloc_object null pointer

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: daniel(at)fdr(dot)io, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15041: dsa alloc_object null pointer
Date: 2018-01-31 20:04:57
Message-ID: CAEepm=2SicfYv7_2+qxMfXPfkKNJFhc9J_xuCqc5Rgb6tQpmdw@mail.gmail.com
Lists: pgsql-bugs

On Thu, Feb 1, 2018 at 8:48 AM, PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
> The following bug has been logged on the website:
>
> Bug reference: 15041
> Logged by: Daniel Farina
> Email address: daniel(at)fdr(dot)io
> PostgreSQL version: 10.1
> Operating system: Linux
> Description:
>
> A database that was operating normally for quite a while suddenly generated
> three similar looking core-dumps near one another. The stack traces look
> like this.
>
> It is possible there was unusual memory pressure at the time this occurred.
> This is the first occurrence.
>
> #0 alloc_object (size_class=<optimized out>, area=0x0) at dsa.c:1433
> #1 dsa_allocate_extended (area=0x0, size=size@entry=72, flags=flags@entry=4) at dsa.c:785
> #2 0x000000000062d277 in tbm_prepare_shared_iterate (tbm=tbm@entry=0x1e54160) at tidbitmap.c:807
> #3 0x00000000005f69a0 in BitmapHeapNext (node=node@entry=0x1d22a48) at nodeBitmapHeapscan.c:155

Hi Daniel,

Thanks for the report. This looks like the bug described here, where
"area" is a NULL pointer because we failed to launch a parallel query
(i.e. we're running a parallel query plan, but there are no workers and
no shared memory):

https://www.postgresql.org/message-id/CAEepm=0kADK5inNf_KuemjX=HQ=PuTP0DykM--fO5jS5ePVFEA@mail.gmail.com

It was fixed in commit c6755e233be1cccadd0884d952a2bb455fa0db1f and
back-patched to REL_10_STABLE, so the fix will be in 10.2 (target 8th
Feb). The cause is running out of DSM slots and not handling that
case correctly. I think this implies that you're running queries with
a lot of Gather [Merge] nodes in them? The number of DSM slots is 64 +
2 * max_connections, so one workaround is to crank up max_connections,
and another is just to disable parallelism for that query.
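
To make the slot arithmetic concrete, here is a rough sketch in Python
of the formula quoted above (64 + 2 * max_connections). The function
and constant names are made up for illustration and are not
PostgreSQL identifiers; the real server may count more than
max_connections when sizing the table, so treat the numbers as
estimates only.

```python
# Sketch of the DSM slot formula quoted in this email:
#   slots = 64 + 2 * max_connections
# All names below are hypothetical, chosen for this example only.

FIXED_SLOTS = 64          # constant term from the formula above
SLOTS_PER_CONNECTION = 2  # per-connection term from the formula above


def dsm_slots(max_connections: int) -> int:
    """Estimated number of DSM slots for a given max_connections."""
    if max_connections < 0:
        raise ValueError("max_connections must be non-negative")
    return FIXED_SLOTS + SLOTS_PER_CONNECTION * max_connections


def min_max_connections(slots_needed: int) -> int:
    """Smallest max_connections whose estimate covers slots_needed."""
    extra = max(0, slots_needed - FIXED_SLOTS)
    # Ceiling division without floating point.
    return -(-extra // SLOTS_PER_CONNECTION)


if __name__ == "__main__":
    for mc in (100, 200, 500):
        print(f"max_connections={mc}: about {dsm_slots(mc)} DSM slots")
```

With max_connections = 100 the estimate is 264 slots, so a workload
that needs more concurrent segments than that would have to raise
max_connections (or avoid parallel plans) until 10.2 is installed.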

--
Thomas Munro
http://www.enterprisedb.com
