Re: crash with synchronized_standby_slots

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Gabriele Bartolini <gabriele(dot)bartolini(at)enterprisedb(dot)com>
Subject: Re: crash with synchronized_standby_slots
Date: 2024-11-29 06:01:48
Message-ID: CAA4eK1JmXd8VsKkH2dpnw5un-GUHd3PjaGwZZW73TDVKrHvN=Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 28, 2024 at 5:54 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> Gabriele just reported a crash when changing synchronized_standby_slots
> under SIGHUP and logging collector working. The problem apparently is
> that validate_sync_standby_slots is run for the GUC check routine, and
> it requires acquiring an LWLock; but syslogger doesn't have a PGPROC so
> that doesn't work.
>
> Apparently we already have a hack in place against the same problem in
> postmaster (and elsewhere?), by testing that ReplicationSlotCtl is not
> null. But that hack seems to be incomplete, as the crash here attests.
>

I tried it on my Windows machine and noticed that ReplicationSlotCtl
is NULL for syslogger, so the problem doesn't occur. The reason is
that we don't attach to shared memory in syslogger, so ideally
ReplicationSlotCtl should be NULL. Because we inherit everything
through the fork for Linux systems and then later for processes that
don't want to attach to shared memory, we call PGSharedMemoryDetach()
from postmaster_child_launch(). The PGSharedMemoryDetach won't
reinitialize the memory pointed to by ReplicationSlotCtl, so, it would
be an invalid memory.

> To reproduce, simply start with no synchronized_standby_slots setting
> and logging_collector=on, then change the value in postgresql.conf and
> reload.
>
> One option to fix this might be as attached. The point here is that
> processes without a PGPROC don't need or care to have a correct setting,
> so we just skip verifying it on those. AFAICS this is more general than
> the test for ReplicationSlotCtl, so I just removed that one.
>

Yeah, I can't think of a better fix for this.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2024-11-29 06:18:55 Re: Отв.: Re: UUID v7
Previous Message Alena Rybakina 2024-11-29 05:51:30 Re: POC, WIP: OR-clause support for indexes