From: | Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org, michael(at)paquier(dot)xyz |
Subject: | Re: [BUG] non archived WAL removed during production crash recovery |
Date: | 2020-04-03 16:26:25 |
Message-ID: | 20200403182625.0fccc6fd@firost |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs pgsql-hackers |
On Thu, 2 Apr 2020 23:55:46 +0900
Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
> On 2020/04/02 22:49, Jehan-Guillaume de Rorthais wrote:
>> On Thu, 2 Apr 2020 19:38:59 +0900
>> Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
[...]
>>> That is, WAL files with .ready files are removed when either
>>> archive_mode!=always in standby mode or archive_mode=off.
>>
>> sounds fine to me.
To some extends, it appears to me this sentence was relatively close to my
previous patch, as far as you exclude IN_CRASH_RECOVERY from the shortcut:
XLogArchiveCheckDone(const char *xlog)
{
[...]
if ( (inRecoveryState != IN_CRASH_RECOVERY) && (
(inRecoveryState == NOT_IN_RECOVERY && !XLogArchivingActive()) ||
(inRecoveryState == IN_ARCHIVE_RECOVERY && !XLogArchivingAlways())))
return true;
Which means that only .done cleanup would occurs during CRASH_RECOVERY
and .ready files might be created if no .done exists. No matter the futur status
of the cluster: primary or standby. Normal shortcut will apply during first
checkpoint after the crash recovery step.
This should handle the case where a backup without backup_label (taken from a
snapshot or after a shutdown with immediate) is restored to build a standby.
Please, find in attachment a third version of my patch
"0001-v3-Fix-WAL-retention-during-production-crash-recovery.patch".
"0002-v1-Add-test-on-non-archived-WAL-during-crash-recovery.patch" is left
untouched. But I'm considering adding some more tests relative to this
discussion.
> BTW, now I'm thinking that the flag in shmem should be updated when
> the startup process sets StandbyModeRequested to true at the beginning
> of the recovery. That is,
>
> - Add something like SharedStandbyModeRequested into XLogCtl. This field
> should be initialized with false;
> - Set XLogCtl->SharedStandbyModeRequested to true when the startup
> process detects the standby.signal file and sets the local variable
> StandbyModeRequested to true.
> - Make XLogArchiveCheckDone() use XLogCtl->SharedStandbyModeRequested
> to know whether the server is in standby mode or not.
>
> Thought?
I try to avoid a new flag in memory with the proposal in attachment of this
email. It seems to me various combinaisons of booleans with subtle differences
around the same subject makes it a bit trappy and complicated to understand.
If my proposal is rejected, I'll be happy to volunteer to add
XLogCtl->SharedStandbyModeRequested though. It seems like a simple enough fix
as well.
Regards,
Attachment | Content-Type | Size |
---|---|---|
0001-v3-Fix-WAL-retention-during-production-crash-recovery.patch | text/x-patch | 5.3 KB |
0002-v1-Add-test-on-non-archived-WAL-during-crash-recovery.patch | text/x-patch | 1.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Fujii Masao | 2020-04-03 17:49:50 | Re: [BUG] non archived WAL removed during production crash recovery |
Previous Message | Dmitry Dolgov | 2020-04-03 15:29:10 | Re: BUG #16325: Assert failure on partitioning by int for a text value with a collation |
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2020-04-03 16:32:01 | Re: zombie connections |
Previous Message | Ibrar Ahmed | 2020-04-03 16:04:34 | Re: VACUUM memory management |