Re: standby apply lag on inactive servers

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: alvherre(at)2ndquadrant(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: standby apply lag on inactive servers
Date: 2020-01-30 02:19:17
Message-ID: 20200130.111917.2199841860956912043.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Wed, 29 Jan 2020 19:11:31 -0300, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote in
> On 2020-Jan-28, Kyotaro Horiguchi wrote:
>
> > The aim of the patch seem reasonable. XLOG_END_OF_RECOVERY and
> > XLOG_XACT_PREPARE also have a timestamp but it doesn't help much. (But
> > could be included for consistency.)
>
> Hmm I think I should definitely include those.

I agree to that, given the following change of log messages.

> > The timestamp of a checkpoint record is the start time of a checkpoint
> > (and doesn't have subseconds part, but it's not a problem.). That
> > means that the timestamp is usually far behind the time at the record
> > has been inserted. As the result the stored timestamp can go back by a
> > significant internval. I don't think that causes an actual problem but
> > the movement looks wierd as the result of
> > pg_last_xact_replay_timestamp().
>
> A problem I see with this is that setting the timestamp is done with
> XLogCtl->lock held; since now we need to make the update conditional,
> we'd have to read the current value, compare with the checkpoint time,
> then set, all with the spinlock held. That seems way too expensive.
>
> A compromise might be to do the compare only when it's done for
> checkpoint. These occur seldom enough that it shouldn't be a problem
> (as opposed to commit records, which can be very frequent).

I think we don't need to that, given the following change.

> > Asides from the backward movement, a timestamp from other than
> > commit/abort records in recvoeryLastXTime affects the following code.
> >
> > xlog.c:7329: (similar code exists at line 9332)
> > > ereport(LOG,
> > > (errmsg("redo done at %X/%X",
> > > (uint32) (ReadRecPtr >> 32), (uint32) ReadRecPtr)));
> > > xtime = GetLatestXTime();
> > > if (xtime)
> > > ereport(LOG,
> > > (errmsg("last completed transaction was at log time %s",
> > > timestamptz_to_str(xtime))));
> >
> > This code assumes (and the name GetLatestXTime() suggests, I first
> > noticed that here..) that the timestamp comes from commit/abort logs,
> > so otherwise it shows a wrong timestamp. We shouldn't update the
> > variable by other than that kind of records.
>
> Thinking about this some more, I think we should do keep the message the
> same backbranches (avoid breaking anything that might be reading the log
> -- a remote but not inexistent possibility), and adjust the message in
> master to be something like "last timestamped WAL activity at time %s",
> and document that it means commit, abort, restore label, checkpoint.

Agreed about backbranches. I'd like to preserve the word "transaction"
as it is more familiar to users. How about something like the follows?

"transactions are completed up to log time %s"

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-01-30 02:19:29 Re: Physical replication slot advance is not persistent
Previous Message Kyotaro Horiguchi 2020-01-30 01:54:37 Re: Tid scan increments value of pg_stat_all_tables.seq_scan. (but not seq_tup_read)