From: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
---|---|
To: | higuchi(dot)daisuke(at)fujitsu(dot)com |
Cc: | masao(dot)fujii(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works. |
Date: | 2020-06-30 00:14:23 |
Message-ID: | 20200630.091423.1542668690374745736.horikyota.ntt@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Opps! I misunderstood that.
At Mon, 29 Jun 2020 13:00:25 +0000, "higuchi(dot)daisuke(at)fujitsu(dot)com" <higuchi(dot)daisuke(at)fujitsu(dot)com> wrote in
> Fujii-san, thank you for comments.
>
> >The cause of this problem is that the checkpointer's sleep time is calculated
> >from both checkpoint_timeout and archive_timeout during normal running,
> >but calculated only from checkpoint_timeout during recovery. So Daisuke-san's
> >patch tries to change that so that it's calculated from both of them even
> >during recovery. No?
>
> Yes, it's exactly so.
>
> >last_xlog_switch_time is not updated during recovery. So "elapsed_secs" can be
> >large and cur_timeout can be negative. Isn't this problematic?
>
> Yes... My patch was missing this.
The patch also makes WaitLatch called with zero timeout, which causes
assertion failure.
> How about using the original archive_timeout value for calculating cur_timeout during recovery?
>
> if (XLogArchiveTimeout > 0 && !RecoveryInProgress())
> {
> elapsed_secs = now - last_xlog_switch_time;
> if (elapsed_secs >= XLogArchiveTimeout)
> continue; /* no sleep for us ... */
> cur_timeout = Min(cur_timeout, XLogArchiveTimeout - elapsed_secs);
> }
> + else if (XLogArchiveTimeout > 0)
> + cur_timeout = Min(cur_timeout, XLogArchiveTimeout);
>
> During recovery, accurate cur_timeout is not calculated because elapsed_secs is not used.
> However, after recovery is complete, WAL archiving will start by the next archive_timeout is reached.
> I felt it is enough to solve this problem.
That causes unwanted change of cur_timeout during recovery.
> >As another approach, what about waking the checkpointer up at the end of
> >recovery like we already do for walsenders?
We don't want change checkpoint interval during recovery, that means
we cannot cnosider archive_timeout at the fist checkpointer after
recovery ends. So I think that the suggestion from Fujii-san is the
direction.
> If the above solution is not good, I will consider this approach.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Fujii Masao | 2020-06-30 01:01:02 | Re: [Bug fix]There is the case archive_timeout parameter is ignored after recovery works. |
Previous Message | David Rowley | 2020-06-29 23:57:03 | Re: Hybrid Hash/Nested Loop joins and caching results from subplans |