Re: archive status ".ready" files may be created too early

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org>, "x4mmm(at)yandex-team(dot)ru" <x4mmm(at)yandex-team(dot)ru>, "a(dot)lubennikova(at)postgrespro(dot)ru" <a(dot)lubennikova(at)postgrespro(dot)ru>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi>, "matsumura(dot)ryo(at)fujitsu(dot)com" <matsumura(dot)ryo(at)fujitsu(dot)com>, "masao(dot)fujii(at)gmail(dot)com" <masao(dot)fujii(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: archive status ".ready" files may be created too early
Date: 2021-08-03 21:32:18
Message-ID: A7D02168-A45F-40EF-B948-BDFF7FC9A321@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8/2/21, 7:37 PM, "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> I'm afraid that using hash to store boundary info is too much. Isn't a
> ring buffer enough for this use? In that case it is enough to
> remember only the end LSN of the segment spanning records. It is easy
> to expand the buffer if needed.

I agree that the hash table requires a bit more memory than what is
probably necessary, but I'm not sure I agree that maintaining a custom
data structure to save a few kilobytes of memory is worth the effort.

> @@ -2602,7 +2855,7 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
> LogwrtResult.Flush = LogwrtResult.Write; /* end of page */
>
> if (XLogArchivingActive())
> - XLogArchiveNotifySeg(openLogSegNo);
> + SetLastNotifiedSegmentIfInvalid(openLogSegNo - 1);
>
> Is it safe? If server didn't notified of WAL files for recent 3
> finished segments in the previous server life, they need to be
> archived this life time. But this omits maybe all of the tree.
> (I didn't confirm that behavior..)

I tested this scenario out earlier [0]. It looks like the call to
XLogArchiveCheckDone() in RemoveOldXlogFiles() will take care of
creating any .ready files we missed.

>> I believe my worry was that we'd miss notifying a segment as soon as
>> possible if the record was somehow flushed prior to registering the
>> record boundary in the map. If that's actually impossible, then I
>> would agree that the extra call to NotifySegmentsReadyForArchive() is
>> unnecessary.
>
> I don't think that XLogWrite(up to LSN=X) can happen before
> XLogInsert(endpos = X) ends.

Is there anything preventing that from happening? At the location
where we are registering the record boundary, we've already called
CopyXLogRecordToWAL(), and neither the WAL insertion lock nor the
WALWriteLock are held. Even if we register the boundary before
updating the shared LogwrtRqst.Write, there's a chance that someone
else has already moved it ahead and called XLogWrite(). I think the
worst case scenario is that we hold off creating .ready files longer
than necessary, but IMO that's still a worthwhile thing to do.

Nathan

[0] https://postgr.es/m/DA71434B-7340-4984-9B91-F085BC47A778%40amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bossart, Nathan 2021-08-03 23:13:26 Re: make MaxBackends available in _PG_init
Previous Message Simon Riggs 2021-08-03 20:59:24 Re: Reduce lock level for ALTER TABLE ... ADD CHECK .. NOT VALID