Re: crash with synchronized_standby_slots

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Gabriele Bartolini <gabriele(dot)bartolini(at)enterprisedb(dot)com>
Subject: Re: crash with synchronized_standby_slots
Date: 2024-12-04 05:32:18
Message-ID: CAA4eK1+yxew1rh5p4kExPa+utwcBc=KC9Y7NW8w_qJY9bzdfug@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 3, 2024 at 10:34 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2024-Nov-29, Amit Kapila wrote:
>
> BTW it occurs to me that there might well be some sort of thundering
> herd problem if every process needs to run the check_hook when a SIGHUP
> is broadcast, and they'll all be waiting on that particular lwlock and
> run the same validation locally again and again. I bet if you have a
> few thousand backends (hi Jakub! [1]) it's problematic.
>

The lock is taken in shared mode, so ideally it shouldn't cause a
problem; if it ever does, we can simply skip that check during
validation. The validation will happen later anyway, during
replication, in StandbySlotsHaveCaughtup(). The check in the hook is
mostly there to detect an error in the GUC early.
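
To illustrate the split between an early check and the later
authoritative one, here is a minimal standalone sketch. It is not the
PostgreSQL code: existing_slots, slot_exists(), early_check_name() and
late_check_name() are hypothetical names made up for this example, and
the character rule for names is an assumption of the sketch. The early
check looks only at the string, so it needs no lock; the late check also
consults the (here simulated) slot list, which is where a lock would be
needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the slots that currently exist. */
static const char *existing_slots[] = {"standby_1", "standby_2"};
static const size_t n_existing_slots =
    sizeof(existing_slots) / sizeof(existing_slots[0]);

/* Lookup in shared state; in a real server this part would need the lock. */
bool slot_exists(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < n_existing_slots; i++)
        if (strcmp(existing_slots[i], name) == 0)
            return true;
    return false;
}

/*
 * Early check: syntax only, no lock and no shared-state lookup.
 * Assumed rule for this sketch: non-empty, and only a-z, 0-9, '_'.
 */
bool early_check_name(const char *name)
{
    if (name == NULL || *name == '\0')
        return false;
    for (const char *p = name; *p != '\0'; p++)
    {
        bool ok = (*p >= 'a' && *p <= 'z') ||
                  (*p >= '0' && *p <= '9') ||
                  *p == '_';
        if (!ok)
            return false;
    }
    return true;
}

/* Late check: authoritative, run at the point where the slot is used. */
bool late_check_name(const char *name)
{
    return early_check_name(name) && slot_exists(name);
}
```

With this split, a well-formed name of a slot that does not exist passes
the early check and is rejected only by the late one, which is the
trade-off of skipping the lookup during validation.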

> Maybe we need a
> different way to validate the GUC, but I don't know what that might be;
> but doing the validation once and storing the result in shmem might be
> better.
>

What if that particular GUC changes again? We may need some additional
invalidation mechanism.
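
One possible shape for such an invalidation mechanism is a generation
counter stored next to the cached result. The following is only a
sketch under stated assumptions, not a proposed patch: ValidationCache,
cache_set_value(), cache_validate() and expensive_validate() are
hypothetical names, the struct merely stands in for a shared-memory
area, and the locking or atomics that real shared memory would need are
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_GUC_LEN 256

/* Hypothetical stand-in for a shared-memory area (no locking shown). */
typedef struct ValidationCache
{
    uint64_t generation;           /* bumped on every change of the value */
    uint64_t validated_generation; /* generation the cached result is for */
    bool     result;               /* cached validation result */
    char     value[MAX_GUC_LEN];   /* current value of the setting */
} ValidationCache;

/* Counts how often the costly validation actually runs. */
int expensive_validation_calls = 0;

/* Placeholder for the costly check; here any non-empty value is valid. */
bool expensive_validate(const char *value)
{
    expensive_validation_calls++;
    return value[0] != '\0';
}

/* Changing the value invalidates the cached result by bumping generation. */
void cache_set_value(ValidationCache *c, const char *value)
{
    strncpy(c->value, value, MAX_GUC_LEN - 1);
    c->value[MAX_GUC_LEN - 1] = '\0';
    c->generation++;
}

/*
 * Re-validate only if the value changed since the last validation.
 * A zero-initialized cache reports false for its empty value, which
 * matches what expensive_validate() would return for it.
 */
bool cache_validate(ValidationCache *c)
{
    if (c->validated_generation != c->generation)
    {
        c->result = expensive_validate(c->value);
        c->validated_generation = c->generation;
    }
    return c->result;
}
```

In this sketch, repeated calls to cache_validate() after a single
cache_set_value() run the costly check once; the next change of the
value forces exactly one more run.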

--
With Regards,
Amit Kapila.
