Re: inefficient loop in StandbyReleaseLockList()

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: bossartn(at)amazon(dot)com, michael(at)paquier(dot)xyz, andres(at)anarazel(dot)de, sulamul(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: inefficient loop in StandbyReleaseLockList()
Date: 2021-11-02 02:43:13
Message-ID: 20211102.114313.713245704579079227.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Mon, 01 Nov 2021 18:01:18 -0400, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in
> "Bossart, Nathan" <bossartn(at)amazon(dot)com> writes:
> > On 10/31/21, 1:55 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> 2. I think we almost certainly have a problem in SyncPostCheckpoint.
>
> > This one doesn't look as straightforward. It looks like we might need
> > a list_delete_first_n() to delete the first N entries all at once to
> > improve this one.
>
> Yeah. We don't absolutely need a new list primitive: we could use
> list_copy_tail() and then free the old list. But the extra palloc
> traffic involved makes this sound like a bad idea. It does have
> the advantage that we could shorten the List's storage even when
> it doesn't go to empty, but I'm not sure that's worth anything.
> If the List isn't going to empty, that implies that we're getting
> a steady stream of unlink requests, meaning we'd probably just
> fill it up again.

I agree to that. In general I think it's better not to resize storage
on truncation (or shortning) of a list except for a few special cases.
(for example, for a list that temporarily grows prominently but won't
go empty)

> The minimum-change patch would have us truncating the list before
> each AbsorbSyncRequests call, so that the list state meets that
> function's expectations. However, as long as UNLINKS_PER_ABSORB
> is only 10, I don't think that gets us out of the O(N^2) woods.

Agreed.

> So what I did in the attached is add a "canceled" flag to
> PendingUnlinkEntry, which lets us deal with canceled or finished
> entries without having to delete them from the list right away.
> Then we only need to physically clean up the list once per
> SyncPostCheckpoint call.

We don't loop over so many canceled elements usually so I think it
works well. However, shouldn't we consider canceled before checking
cycle_ctr? A canceled elements should be invisible from later
accesses at all. I vaguely feel the name "cancel" might be better be
"consumed" or such but I don't object to "cancel".

I feel that we might need to wipe_mem for the memmove case as well
(together with list_delete_nth_cell) but that is another thing even if
that's correct.

Otherwise it looks good to me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-11-02 03:33:32 Re: removing global variable ThisTimeLineID
Previous Message Michael Paquier 2021-11-02 01:23:54 Re: enabling FOO=bar arguments to vcregress.pl