From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Chris Travers <chris(dot)travers(at)adjust(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Funny hang on PostgreSQL 10 during parallel index scan on slave |
Date: | 2018-09-05 16:55:11 |
Message-ID: | 20180905165511.kn76b6evpcvjpygt@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2018-09-05 18:48:44 +0200, Chris Travers wrote:
> Will submit a patch here shortly. Thanks! Should we do for master and
> 10? Or 9.6 too?
Please don't top-post on this list. This needs to be done in all
branches where the posix_fallocate call is present.
> > Yep, maybe we should check for signals there.
> >
> > On Wed, Sep 5, 2018 at 5:27 PM Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
> > wrote:
> >
> >> On Wed, Sep 5, 2018 at 8:23 AM Chris Travers <chris(dot)travers(at)adjust(dot)com>
> >> wrote:
> >> > 1. The query is in a parallel index scan or similar
> >> > 2. A process is executing a parallel plan and allocating a significant
> >> chunk of memory (2MB for example) in dynamic shared memory.
> >> > 3. The startup process goes into a loop where it sends a sigusr1,
> >> sleeps 5ms, and sends another sigusr1 etc.
> >> > 4. The sigusr1 aborts the system call, which is then retried.
> >> > 5. Because the system call takes more than 5ms, we end up in an
> >> endless loop
What you're presumably encountering here is a recovery conflict.
> On Wed, Sep 5, 2018 at 6:40 PM Chris Travers <chris(dot)travers(at)adjust(dot)com>
> wrote:
> >> Do you mean this loop in dsm_impl_posix_resize() is getting
> >> interrupted constantly and never completing?
> >>
> >> /* We may get interrupted, if so just retry. */
> >> do
> >> {
> >>     rc = posix_fallocate(fd, 0, size);
> >> } while (rc == EINTR);
> >>
Probably worthwhile to check that the dsm code is properly robust if
errors are thrown from within here.
Greetings,
Andres Freund