From: | "Bossart, Nathan" <bossartn(at)amazon(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com>, "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "x4mmm(at)yandex-team(dot)ru" <x4mmm(at)yandex-team(dot)ru>, "a(dot)lubennikova(at)postgrespro(dot)ru" <a(dot)lubennikova(at)postgrespro(dot)ru>, "hlinnaka(at)iki(dot)fi" <hlinnaka(at)iki(dot)fi>, "matsumura(dot)ryo(at)fujitsu(dot)com" <matsumura(dot)ryo(at)fujitsu(dot)com>, "masao(dot)fujii(at)gmail(dot)com" <masao(dot)fujii(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: archive status ".ready" files may be created too early |
Date: | 2021-08-20 19:41:28 |
Message-ID: | 029A2429-21B1-426F-A8FE-109ADF6A8E90@amazon.com |
Lists: | pgsql-hackers |
On 8/20/21, 11:20 AM, "Robert Haas" <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Aug 20, 2021 at 1:29 PM Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:
>> Thinking about this stuff further, I was wondering if one way to
>> handle the bounded shared hash table problem would be to replace the
>> latest boundary in the map whenever it was full. But at that point,
>> do we even need a hash table? This led me to revisit the two-element
>> approach that was discussed upthread. What if we only stored the
>> earliest and latest segment boundaries at any given time? Once the
>> earliest boundary is added, it never changes until the segment is
>> flushed and it is removed. The latest boundary, however, will be
>> updated any time we register another segment. Once the earliest
>> boundary is removed, we replace it with the latest boundary. This
>> strategy could cause us to miss intermediate boundaries, but AFAICT
>> the worst case scenario is that we hold off creating .ready files a
>> bit longer than necessary.
>
> I think this is a promising approach. We could also have a small
> fixed-size array, so that we only have to risk losing track of
> anything when we overflow the array. But I guess I'm still unconvinced
> that there's a real possibility of genuinely needing multiple
> elements. Suppose we are thinking of adding a second element to the
> array (or the hash table). I feel like it's got to be safe to just
> remove the first one. If not, then apparently the WAL record that
> caused us to make the first entry isn't totally flushed yet - which I
> still think is impossible.
I've attached a patch to demonstrate what I'm thinking.
Nathan
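For readers without the attachment, the two-element idea quoted above might be sketched roughly as follows. This is not the attached patch: the type and function names (`BoundaryTracker`, `tracker_register`, `tracker_flush`) and the plain integer stand-ins for segment numbers and LSNs are all hypothetical, and locking is left out entirely. The sketch only illustrates the bookkeeping: the earliest boundary stays fixed until it is flushed, the latest boundary is overwritten on each new registration, and the latest is promoted to earliest once the earliest is removed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for PostgreSQL's segment-number and LSN types. */
typedef uint64_t SegNo;
typedef uint64_t Lsn;

#define INVALID_SEG UINT64_MAX

/*
 * Each boundary pairs a segment with the end LSN of the record that
 * crosses out of it; the segment may be marked .ready only once the
 * flush position reaches that end LSN.
 */
typedef struct BoundaryTracker
{
    SegNo earliest_seg;
    Lsn   earliest_end;
    SegNo latest_seg;
    Lsn   latest_end;
} BoundaryTracker;

void
tracker_init(BoundaryTracker *t)
{
    t->earliest_seg = INVALID_SEG;
    t->earliest_end = 0;
    t->latest_seg = INVALID_SEG;
    t->latest_end = 0;
}

/* Register a boundary; only the first and the most recent are remembered. */
void
tracker_register(BoundaryTracker *t, SegNo seg, Lsn end_lsn)
{
    if (t->earliest_seg == INVALID_SEG)
    {
        t->earliest_seg = seg;
        t->earliest_end = end_lsn;
    }
    else
    {
        /* Assumes boundaries are registered in WAL order. */
        assert(end_lsn >= t->earliest_end);
        t->latest_seg = seg;        /* overwrites any intermediate boundary */
        t->latest_end = end_lsn;
    }
}

/*
 * Report the highest segment that may now be marked .ready, or
 * INVALID_SEG if no tracked boundary has been fully flushed yet.
 */
SegNo
tracker_flush(BoundaryTracker *t, Lsn flushed)
{
    SegNo ready = INVALID_SEG;

    while (t->earliest_seg != INVALID_SEG && flushed >= t->earliest_end)
    {
        ready = t->earliest_seg;

        /* Promote the latest boundary to earliest and clear the latest. */
        t->earliest_seg = t->latest_seg;
        t->earliest_end = t->latest_end;
        t->latest_seg = INVALID_SEG;
        t->latest_end = 0;
    }
    return ready;
}
```

With boundaries registered for segments 1, 2, and 3, the entry for segment 2 is overwritten by the one for segment 3. A flush that passes segment 2's boundary but not segment 3's therefore reports only segment 1, and segment 2 becomes eligible together with segment 3 later. That delay is the "hold off creating .ready files a bit longer than necessary" case described in the quoted text.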
Attachment | Content-Type | Size |
---|---|---|
v13-0001-Avoid-creating-archive-status-.ready-files-too-e.patch | application/octet-stream | 18.1 KB |