Re: prevent immature WAL streaming

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, 蔡梦娟(玊于) <mengjuan(dot)cmj(at)alibaba-inc(dot)com>, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>
Subject: Re: prevent immature WAL streaming
Date: 2021-08-25 18:18:59
Message-ID: DE60B9AA-9670-47DA-9678-6C79BCD884E3@amazon.com
Lists: pgsql-hackers

On 8/25/21, 5:33 AM, "alvherre(at)alvh(dot)no-ip(dot)org" <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> On 2021-Aug-24, Bossart, Nathan wrote:
>
>> If moving RegisterSegmentBoundary() is sufficient to prevent the flush
>> pointer from advancing before we register the boundary, I bet we could
>> also remove the WAL writer nudge.
>
> Can you elaborate on this? I'm not sure I see the connection.

We are moving RegisterSegmentBoundary() to before
WALInsertLockRelease() because we believe that will prevent boundary
registration from taking place after the flush pointer has already
advanced past the boundary in question. We had added the WAL writer
nudge to make sure we called NotifySegmentsReadyForArchive() whenever
that happened.

If moving boundary registration to before we release the lock(s) is
enough to prevent the race condition with the flush pointer, then ISTM
we no longer have to worry about nudging the WAL writer.
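
The ordering argument above can be sketched as a toy model. This is
only an illustration, not PostgreSQL code: the event names and the
helper function are hypothetical, and the model assumes (as the
paragraph above hopes) that the flush pointer cannot pass a record
until its insert lock has been released.

```python
from itertools import permutations


def flush_can_precede_registration(inserter_steps):
    """Return True if some interleaving lets the flush happen before
    the boundary is registered.

    inserter_steps: the inserting backend's actions in program order,
    e.g. ("register", "release"). Hypothetical event names.
    """
    events = list(inserter_steps) + ["flush"]
    for order in permutations(events):
        # The inserter's own steps must stay in program order.
        if [e for e in order if e != "flush"] != list(inserter_steps):
            continue
        # Assumption: the flusher cannot pass the record while the
        # insert lock is still held.
        if order.index("flush") < order.index("release"):
            continue
        if order.index("flush") < order.index("register"):
            return True
    return False


# Registering before the lock release leaves no window for the race.
print(flush_can_precede_registration(("register", "release")))
# Registering after the lock release does leave a window.
print(flush_can_precede_registration(("release", "register")))
```

Under that assumption, only the second ordering needs something like
the WAL writer nudge to catch a boundary registered too late.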

>> Another interesting thing I see is that the boundary stored in
>> earliestSegBoundary is not necessarily the earliest one. It's just
>> the first one that has been registered. I did this for simplicity for
>> the .ready file fix, but I can see it causing problems here.
>
> Hmm, is there really a problem here? Surely the flush point cannot go
> past whatever has been written. If somebody is writing an earlier
> section of WAL, then we cannot move the flush pointer to a later
> position. So it doesn't matter if the earliest point we have registered
> is the true earliest -- we only care for it to be the earliest that is
> past the flush point.

Let's say we have the following situation (F = flush, E = earliest
registered boundary, and L = latest registered boundary), and let's
assume that each segment has a cross-segment record that ends in the
next segment.

F E L
|-----|-----|-----|-----|-----|-----|-----|-----|
1 2 3 4 5 6 7 8

Then, we write out WAL to disk and create .ready files as needed. If
the flush pointer passes the earliest registered boundary but not the
latest one, the latest registered boundary now becomes the earliest
boundary.

F E
|-----|-----|-----|-----|-----|-----|-----|-----|
1 2 3 4 5 6 7 8

At this point, the earliest segment boundary past the flush point is
before the "earliest" boundary we are tracking.
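
A minimal sketch of the scenario above, assuming a tracker that keeps
only two slots. The class, its methods, and the segment numbers chosen
(boundaries registered at 2 through 5, flush moving to 3) are
hypothetical and only loosely modeled on earliestSegBoundary; this is
not the actual patch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BoundaryTracker:
    """Toy model that remembers only two boundaries (E and L)."""
    earliest: Optional[int] = None
    latest: Optional[int] = None
    # Ground truth of every registration, kept only for comparison.
    all_registered: List[int] = field(default_factory=list)

    def register(self, boundary: int) -> None:
        self.all_registered.append(boundary)
        if self.earliest is None:
            # The first boundary registered, not necessarily the smallest.
            self.earliest = boundary
        elif self.latest is None or boundary > self.latest:
            # Overwrites whatever was stored between E and L.
            self.latest = boundary

    def flush_to(self, flush: int) -> None:
        if self.latest is not None and flush >= self.latest:
            self.earliest = self.latest = None
        elif self.earliest is not None and flush >= self.earliest:
            # The latest boundary becomes the "earliest" one.
            self.earliest, self.latest = self.latest, None

    def true_earliest_past(self, flush: int) -> Optional[int]:
        pending = [b for b in self.all_registered if b > flush]
        return min(pending) if pending else None


def run_scenario():
    tracker = BoundaryTracker()
    for boundary in (2, 3, 4, 5):  # one cross-segment record per segment
        tracker.register(boundary)
    tracker.flush_to(3)            # past E (2) but not past L (5)
    return tracker.earliest, tracker.true_earliest_past(3)


print(run_scenario())  # (5, 4): tracked "earliest" is 5, true earliest is 4
```

In this model the boundary at 4 is no longer tracked once the flush
pointer has moved, which is the gap described above.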

Nathan
