From: | "Bossart, Nathan" <bossartn(at)amazon(dot)com> |
---|---|
To: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> |
Cc: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi>, "matsumura(dot)ryo(at)fujitsu(dot)com" <matsumura(dot)ryo(at)fujitsu(dot)com>, "masao(dot)fujii(at)gmail(dot)com" <masao(dot)fujii(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: archive status ".ready" files may be created too early |
Date: | 2021-01-26 19:13:57 |
Message-ID: | 68120830-3A34-4C4F-942F-6739DAA664CF@amazon.com |
Lists: | pgsql-hackers |
On 12/17/20, 9:15 PM, "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> At Thu, 17 Dec 2020 22:20:35 +0000, "Bossart, Nathan" <bossartn(at)amazon(dot)com> wrote in
>> On 12/15/20, 2:33 AM, "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com> wrote:
>> > You're right in that regard. There's a window where partial record is
>> > written when write location passes F0 after insertion location passes
>> > F1. However, remembering all spanning records seems overkilling to me.
>>
>> I'm curious why you feel that recording all cross-segment records is
>> overkill. IMO it seems far simpler to just do that rather than try to
>
> Sorry, my words are not enough. Remembering all spanning records in
> *shared memory* seems to be overkilling. Much more if it is stored in
> shared hash table. Even though it rarely the case, it can fail hard
> way when reaching the limit. If we could do well by remembering just
> two locations, we wouldn't need to worry about such a limitation.
I don't think it will fail if we reach max_size for the hash table.
The comment above ShmemInitHash() has this note:

 * max_size is the estimated maximum number of hashtable entries. This is
 * not a hard limit, but the access efficiency will degrade if it is
 * exceeded substantially (since it's used to compute directory size and
 * the hash table buckets will get overfull).
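To illustrate the general idea in that comment, here is a minimal standalone sketch of a chained hash table whose bucket count is fixed from a size estimate. It is a toy, not PostgreSQL's dynahash or ShmemInitHash() code: every name in it (ToyTable, toy_create, toy_insert, toy_longest_chain, toy_destroy) is hypothetical, it allocates from the heap rather than shared memory, and it does not check for duplicate keys. It only shows that inserting past the estimate can keep succeeding while the chains get longer, which is the "overfull buckets" effect the comment describes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy chained hash table. NOT PostgreSQL's dynahash; all names are hypothetical. */
typedef struct ToyEntry
{
    uint64_t         key;
    struct ToyEntry *next;
} ToyEntry;

typedef struct ToyTable
{
    size_t     nbuckets;   /* fixed at creation from the size estimate */
    size_t     nentries;   /* may grow beyond the estimate */
    ToyEntry **buckets;
} ToyTable;

/* Create a table sized for roughly max_size_estimate entries. */
static ToyTable *
toy_create(size_t max_size_estimate)
{
    ToyTable *t = malloc(sizeof(ToyTable));

    if (t == NULL)
        return NULL;
    t->nbuckets = (max_size_estimate > 0) ? max_size_estimate : 1;
    t->nentries = 0;
    t->buckets = calloc(t->nbuckets, sizeof(ToyEntry *));
    if (t->buckets == NULL)
    {
        free(t);
        return NULL;
    }
    return t;
}

/*
 * Insert a key (no duplicate check). Returns 1 on success and 0 only if
 * memory is exhausted; the size estimate itself is never enforced.
 */
static int
toy_insert(ToyTable *t, uint64_t key)
{
    ToyEntry *e = malloc(sizeof(ToyEntry));
    size_t    b = (size_t) (key % t->nbuckets);

    if (e == NULL)
        return 0;
    e->key = key;
    e->next = t->buckets[b];    /* push onto the bucket's chain */
    t->buckets[b] = e;
    t->nentries++;
    return 1;
}

/* Length of the longest chain: a rough proxy for worst-case lookup cost. */
static size_t
toy_longest_chain(const ToyTable *t)
{
    size_t longest = 0;

    for (size_t i = 0; i < t->nbuckets; i++)
    {
        size_t len = 0;

        for (const ToyEntry *e = t->buckets[i]; e != NULL; e = e->next)
            len++;
        if (len > longest)
            longest = len;
    }
    return longest;
}

/* Free all entries and the table itself. */
static void
toy_destroy(ToyTable *t)
{
    for (size_t i = 0; i < t->nbuckets; i++)
    {
        ToyEntry *e = t->buckets[i];

        while (e != NULL)
        {
            ToyEntry *next = e->next;

            free(e);
            e = next;
        }
    }
    free(t->buckets);
    free(t);
}
```

In this toy, creating a table with an estimate of 4 and inserting 16 keys still works; the only visible cost is that each lookup has a longer chain to walk. Whether a real shared hash table can run out of shared memory is a separate question that this sketch does not model.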
> Another concern about the concrete patch:
>
> NotifySegmentsReadyForArchive() searches the shared hash acquiring a
> LWLock every time XLogWrite is called while segment archive is being
> held off. I don't think it is acceptable and I think it could be a
> problem when many backends are competing on WAL.
This is a fair point. I did some benchmarking with a few hundred
connections all doing writes, and I was not able to discern any
noticeable performance impact. My guess is that contention on this
new lock is unlikely because callers of XLogWrite() must already hold
WALWriteLock. Otherwise, I believe we acquire ArchNotifyLock no
more than once per segment to record new record boundaries.
Nathan