Re: prevent immature WAL streaming

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bossartn(at)amazon(dot)com
Cc: robertmhaas(at)gmail(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, pgsql-hackers(at)lists(dot)postgresql(dot)org, mengjuan(dot)cmj(at)alibaba-inc(dot)com, Jakub(dot)Wartak(at)tomtom(dot)com
Subject: Re: prevent immature WAL streaming
Date: 2021-08-26 08:48:34
Message-ID: 20210826.174834.1618367955913954817.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Thu, 26 Aug 2021 03:24:48 +0000, "Bossart, Nathan" <bossartn(at)amazon(dot)com> wrote in
> On 8/25/21, 5:40 PM, "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > At Wed, 25 Aug 2021 18:18:59 +0000, "Bossart, Nathan" <bossartn(at)amazon(dot)com> wrote in
> >> Let's say we have the following situation (F = flush, E = earliest
> >> registered boundary, and L = latest registered boundary), and let's
> >> assume that each segment has a cross-segment record that ends in the
> >> next segment.
> >>
> >> F E L
> >> |-----|-----|-----|-----|-----|-----|-----|-----|
> >> 1 2 3 4 5 6 7 8
> >>
> >> Then, we write out WAL to disk and create .ready files as needed. If
> >> we didn't flush beyond the latest registered boundary, the latest
> >> registered boundary now becomes the earliest boundary.
> >>
> >> F E
> >> |-----|-----|-----|-----|-----|-----|-----|-----|
> >> 1 2 3 4 5 6 7 8
> >>
> >> At this point, the earliest segment boundary past the flush point is
> >> before the "earliest" boundary we are tracking.
> >
> > We know we have some cross-segment records in the region [E L], so we
> > cannot add a .ready file if flush is in the region. I haven't looked at
> > the latest patch (or I may misunderstand the discussion here) but I
> > think we shouldn't move E before F exceeds previous (or in the first
> > picture above) L. Things are done that way in my ancient proposal in
> > [1].
>
> The strategy in place ensures that we track a boundary that doesn't
> change until the flush position passes it as well as the latest
> registered boundary. I think it is important that any segment
> boundary tracking mechanism does at least those two things. I don't
> see how we could do that if we didn't update E until F passed both E
> and L.

(Sorry, but I didn't quite follow you, so the discussion below might
be beside the point.)

The ancient patch did:

If a flush didn't reach E, we can archive finished segments.

If a flush ends between E and L, we shouldn't archive finished segments
at all. While in this state, L can move further forward, but E stays at
the same location.

Once a flush passes L, we can archive all finished segments and can
erase both E and L.
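
The three rules above can be sketched roughly as follows. This is only
an illustration in Python, not code from the patch: the class and
method names are hypothetical, positions are plain integers standing in
for WAL locations, and whether each comparison is strict or not is my
assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BoundaryTracker:
    """Hypothetical tracker for the region [E, L] described above."""

    earliest: Optional[int] = None  # E: earliest registered boundary
    latest: Optional[int] = None    # L: latest registered boundary

    def register(self, boundary: int) -> None:
        """Register a boundary that a cross-segment record spans."""
        if self.earliest is None:
            self.earliest = boundary
            self.latest = boundary
        else:
            # E stays where it is; only L may move forward.
            self.latest = max(self.latest, boundary)

    def may_archive(self, flush: int) -> bool:
        """Decide whether finished segments may be archived after a flush."""
        if self.earliest is None or flush < self.earliest:
            # Rule 1: the flush did not reach E (or nothing is tracked).
            return True
        if flush < self.latest:
            # Rule 2: the flush ended inside [E, L), so hold off entirely.
            return False
        # Rule 3: the flush passed L, so erase both E and L and archive.
        self.earliest = None
        self.latest = None
        return True
```

For example, after register(10) and register(20), may_archive(15)
returns False, and E remains at 10 even if register(30) is called
before the next flush.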

A drawback of this strategy is that the region [E L] can contain gaps
(that is, segment boundaries that are not spanned by a continuation
record), so archiving can be delayed excessively. Perhaps if flush
falls behind the write head by more than two segments, the probability
of creating such gaps would be higher.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
