Re: doc: Mention clock synchronization recommendation for hot_standby_feedback

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: doc: Mention clock synchronization recommendation for hot_standby_feedback
Date: 2025-03-05 06:15:58
Message-ID: CAA4eK1KwLM84n7UnGsCrG6rROO4jM-QrhACqXwhQYRx6yyTGsg@mail.gmail.com

On Tue, Mar 4, 2025 at 4:44 PM Jakub Wartak
<jakub(dot)wartak(at)enterprisedb(dot)com> wrote:
>
> On Tue, Mar 4, 2025 at 4:59 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> > > >
> > > > Sure thing. I've just added '(..) In the extreme cases this can..' as
> > > > it is pretty rare to hit it. Patch attached.
> > >
> > > When the clock moves forward or backward, couldn't it affect
> > > not only the standby but also the primary? I’m wondering
> > > because TimestampDifferenceExceeds() seems to be used
> > > in several places in addition to hot standby feedback.
> > >
> >
> > Right, it could impact other places as well, like background WAL flush
> > being delayed. So, what should we do about this? Shall we leave this
> > as is, make a general statement, find all cases and make a note about
> > them in docs, do it for the important ones where the impact is more,
> > or something else?
>
> Given the occurrence of such conditions is almost close to 0, we could
> just open a new separate doc thread/cfentry if somebody is concerned
> and add some general statement that OS time should not jump too much
> (in some installation section), that it should be slewed (gradually
> adjusted) instead. If someone has time jumping on his box back and
> forth and something stops working, I still think he has bigger issues
> (e.g. now() reflecting wrong data). I would stay vague as much as
> possible, because every installation seems to use something different
> (hypervisor, kernel modules, ntpd vs ntpd -x and so on).
>
> The problem here was that standby was deteriorating primary (so you
> couldn't see easily on primary what could be causing this), so IMHO
> patch is fine as it stands, it just adds another not so known reason
> to the pool of knowledge why backend_xmin might stop propagating.
>

I can go with the last patch, since you observed this in a real-world
case, and we can look at other cases (if any) on a case-by-case basis.
Fujii-San, others, do you have any opinion on this?
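
For anyone following along, here is a rough sketch of why a backward
clock step can stall a periodic action that is gated on a wall-clock
interval check. It is only an illustration: difference_exceeds() is a
simplified stand-in for TimestampDifferenceExceeds(), and
seconds_until_next_send() is a hypothetical helper that simulates one
second of real time per loop iteration. Neither is PostgreSQL code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Microseconds since an arbitrary epoch, similar in spirit to TimestampTz. */
typedef int64_t timestamp_us;

/*
 * Simplified stand-in for TimestampDifferenceExceeds(): true once at
 * least msec milliseconds separate start and stop, as measured by the
 * wall clock.
 */
static bool
difference_exceeds(timestamp_us start, timestamp_us stop, int msec)
{
    return (stop - start) >= (int64_t) msec * 1000;
}

/*
 * Hypothetical helper: how many seconds of real time pass before the
 * next periodic send, if the wall clock is stepped back by step_back_s
 * seconds right after the previous send.
 */
static int
seconds_until_next_send(int interval_s, int step_back_s)
{
    timestamp_us last_send = (int64_t) 1000 * 1000000;  /* arbitrary start */
    timestamp_us wall = last_send - (int64_t) step_back_s * 1000000;
    int          real_elapsed = 0;

    while (!difference_exceeds(last_send, wall, interval_s * 1000))
    {
        wall += 1000000;        /* one second of real time passes */
        real_elapsed++;
    }
    return real_elapsed;
}
```

In this model, seconds_until_next_send(10, 0) returns 10, while a
one-hour backward step, seconds_until_next_send(10, 3600), returns
3610: nothing is sent until the wall clock catches up with the last
send time. A slewed clock never produces the large negative difference,
which is why gradual adjustment avoids the stall.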

--
With Regards,
Amit Kapila.
