Re: inefficient loop in StandbyReleaseLockList()

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: sulamul(at)gmail(dot)com
Cc: bossartn(at)amazon(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: inefficient loop in StandbyReleaseLockList()
Date: 2021-10-28 06:57:51
Message-ID: 20211028.155751.1898631306946946681.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 28 Oct 2021 09:54:42 +0530, Amul Sul <sulamul(at)gmail(dot)com> wrote in
> On Thu, Oct 28, 2021 at 9:07 AM Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:
> >
> > Hi hackers,
> >
> > I've seen a few cases now for v13 where the startup process on a
> > standby appears to be stuck on StandbyReleaseLockList(). It looks
> > like most of the time is spent on list_delete_first(). I believe this
> > is related to the recent list rewrite (1cff1b9), which has the
> > following note in the commit message:
> >
> > * Inserting or deleting a list element now takes time proportional to
> > the distance to the end of the list, due to moving the array elements.
> > (However, it typically *doesn't* require palloc or pfree, so except in
> > long lists it's probably still faster than before.) Notably, lcons()
> > used to be about the same cost as lappend(), but that's no longer true
> > if the list is long. Code that uses lcons() and list_delete_first()
> > to maintain a stack might usefully be rewritten to push and pop at the
> > end of the list rather than the beginning.
> >
> > The current form of StandbyReleaseLockList() is something like this:
> >
> > while (mylist != NIL)
> > {
> > int i = linitial_int(mylist);
> > ...
> > mylist = list_delete_first(mylist);
> > }
> >
> > For a long enough list, this is wildly inefficient. The following
> > form is much faster for longer lists:
> >
> > foreach(lc, mylist)
> > {
> > int i = lfirst_int(lc);
> > ...
> > }
> > list_free(mylist);
> >
> > I wrote up a simple test function for each form. For a list of
> > 500,000 integers, the first form took about a minute, while the second
> > form took about 6 milliseconds.

Nice finding!

> > I've attached a patch that converts StandbyReleaseLockList() to the
> > second loop form.
> >
>
> +1, deleting everything at once is much better. Deleting one by one
> using list_delete_first is a bit heavy; does the element shifting that
> involves memory allocation and copy operation which is unnecessary
> here.

+1.

I found several other instances of the pattern
"while(list){list_delete_first(); /*no-break*/}" in
llvm_release_context, gistProcessEmptyingQueue, AtEOXact_Namespace and
maybe transformGraph and processState in trgm_regexp.c. We might want
to apply this technique to the three first, and maybe to the last two.

However, I'm fine with fixing only StandbyRelaseLockList(), which
actually suffers from list_delete_first().

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2021-10-28 07:11:16 Re: Isn't it better with "autovacuum worker...." instead of "worker took too long to start; canceled" specific to "auto
Previous Message Takashi Menjo 2021-10-28 06:09:29 Re: Map WAL segment files on PMEM as WAL buffers