From: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
---|---|
To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Is Recovery actually paused? |
Date: | 2020-10-21 16:07:37 |
Message-ID: | CANP8+jKkoJEKDz3vk5LnDcW0rJp_F5_j-ve5aALAUs=JufCgzQ@mail.gmail.com |
Lists: | pgsql-hackers |

On Wed, 21 Oct 2020 at 12:16, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Oct 20, 2020 at 5:59 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Tue, Oct 20, 2020 at 3:00 PM Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > >
> > > On Tue, 20 Oct 2020 at 09:50, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > > > > Why would we want this? What problem are you trying to solve?
> > > > >
> > > > > The requirement is to know the last replayed WAL on the standby so
> > > > > unless we can guarantee that the recovery is actually paused we can
> > > > > never get the safe last_replay_lsn value.
> > > > >
> > > > > > If we do care, why not fix pg_is_wal_replay_paused() so it responds as you wish?
> > > > >
> > > > > Maybe we can also do that but pg_is_wal_replay_paused is an existing
> > > > > API and the behavior is to know whether the recovery paused is
> > > > > requested or not, So I am not sure is it a good idea to change the
> > > > > behavior of the existing API?
> > > > >
> > > >
> > > > Attached is the POC patch to show what I have in mind.
> > >
> > > If you don't like it, I doubt anyone else cares for the exact current
> > > behavior either. Thanks for pointing those issues out.
> > >
> > > It would make sense to alter pg_wal_replay_pause() so that it blocks
> > > until paused.
> > >
> > > I suggest you add the 3-value state as you suggest, but make
> > > pg_is_wal_replay_paused() respond:
> > > if paused, true
> > > if requested, wait until paused, then return true
> > > else false
> > >
> > > That then solves your issues with a smoother interface.
> > >
> >
> > Make sense to me, I will change as per the suggestion.
>
> I have noticed one more issue, the problem is that if the recovery
> process is currently not processing any WAL and just waiting for the
> WAL to become available then the pg_is_wal_replay_paused will be stuck
> forever. Having said that there is the same problem even if we design
> the new interface which checks whether the recovery is actually paused
> or not because until the recovery process gets the next wal it will
> not check whether the recovery pause is requested or not so the actual
> recovery paused flag will never be set.
>
> One idea could be, if the recovery process is waiting for WAL and a
> recovery pause is requested then we can assume that the recovery is
> paused because before processing the next wal it will always check
> whether the recovery pause is requested or not.

If ReadRecord() is waiting for WAL (at the bottom of the recovery loop), then
when it does return it will immediately move to pause (at the top of the next
loop), which makes it easy to cover these cases.

It would be easy enough to create another variable that shows "waiting
for WAL", since that is in itself a useful and interesting thing to be
able to report.

pg_is_wal_replay_paused() and pg_wal_replay_pause() would then return
whenever recovery is either fully paused or waiting for WAL with a pause
requested, i.e. (fully paused || (waiting for WAL && pause_requested)).

We can then create a new function called pg_wal_replay_status() that
returns one of several values: RECOVERING | WAITING_FOR_WAL | PAUSED
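
As a rough sketch of the condition and the status values above, something
like the following might work. All names here (ReplayStatus,
pause_wait_is_done, replay_status_name) are hypothetical and are not taken
from PostgreSQL or from any posted patch. The real state would live in
shared memory and be read under the appropriate lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status values, mirroring RECOVERING | WAITING_FOR_WAL | PAUSED. */
typedef enum ReplayStatus
{
    REPLAY_RECOVERING,          /* actively applying WAL records */
    REPLAY_WAITING_FOR_WAL,     /* idle in ReadRecord(), no WAL available yet */
    REPLAY_PAUSED               /* replay has actually stopped */
} ReplayStatus;

/*
 * Returns true when a caller waiting for the pause may stop waiting:
 * (fully paused || (waiting for WAL && pause_requested)).
 *
 * The second case relies on the assumption stated above: once ReadRecord()
 * returns, the loop checks for a pause request before applying anything.
 */
static bool
pause_wait_is_done(ReplayStatus status, bool pause_requested)
{
    return status == REPLAY_PAUSED ||
        (status == REPLAY_WAITING_FOR_WAL && pause_requested);
}

/* Text a hypothetical pg_wal_replay_status() could report for each state. */
static const char *
replay_status_name(ReplayStatus status)
{
    switch (status)
    {
        case REPLAY_RECOVERING:
            return "RECOVERING";
        case REPLAY_WAITING_FOR_WAL:
            return "WAITING_FOR_WAL";
        case REPLAY_PAUSED:
            return "PAUSED";
    }
    assert(0 && "unknown ReplayStatus");
    return "UNKNOWN";
}
```

With this shape, a pause request that arrives while replay is idle would not
block the caller forever, and the status function still reports
WAITING_FOR_WAL rather than PAUSED, so the two states stay distinguishable.
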
--
Simon Riggs http://www.EnterpriseDB.com/