Re: Parallel heap vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel heap vacuum
Date: 2024-10-22 23:54:34
Message-ID: CAD21AoC6Udm=Eo2BT1c473_QFxmj7psgrvDrKFR7qeAJGHOETw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sorry for the very late reply.

On Tue, Jul 30, 2024 at 8:54 PM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Sawada-san,
>
> > Thank you for testing!
>
> I tried to profile the vacuuming with the larger case (40 workers for the 20G table)
> and attached FlameGraph showed the result. IIUC, I cannot find bottlenecks.
>
> > > 2.
> > > I compared parallel heap scan and found that it does not have compute_worker
> > API.
> > > Can you clarify the reason why there is an inconsistency?
> > > (I feel it is intentional because the calculation logic seems to depend on the
> > heap structure,
> > > so should we add the API for table scan as well?)
> >
> > There is room to consider a better API design, but yes, the reason is
> > that the calculation logic depends on table AM implementation. For
> > example, I thought it might make sense to consider taking the number
> > of all-visible pages into account for the calculation of the number of
> > parallel workers as we don't want to launch many workers on the table
> > where most pages are all-visible. Which might not work for other table
> > AMs.
> >
>
> Okay, thanks for confirming. I wanted to ask others as well.
>
>
> > I'm updating the patch to implement parallel heap vacuum and will
> > share the updated patch. It might take time as it requires to
> > implement shared iteration support in radx tree.
>
> Here are other preliminary comments for v2 patch. This does not contain
> cosmetic ones.
>
> 1.
> Shared data structure PHVShared does not contain the mutex lock. Is it intentional
> because they are accessed by leader only after parallel workers exit?
>

Yes, the fields in PHVShared are read-only for workers. Since no
concurrent reads/writes happen on these fields we don't need to
protect them.

> 2.
> Per my understanding, the vacuuming goes like below steps.
>
> a. paralell workers are launched for scanning pages
> b. leader waits until scans are done
> c. leader does vacuum alone (you may extend here...)
> d. parallel workers are launched again to cleanup indeces
>
> If so, can we reuse parallel workers for the cleanup? Or, this is painful
> engineering than the benefit?

I've not thought of this idea but I think it's possible from a
technical perspective. It saves overheads of relaunching workers but
I'm not sure how much it would help for a better performance and I'm
concerned it would make the code complex. For example, different
numbers of workers might be required for table vacuuming and index
vacuuming. So we would end up increasing or decreasing workers.

> 3.
> According to LaunchParallelWorkers(), the bgw_name and bgw_type are hardcoded as
> "parallel worker ..." Can we extend this to improve the trackability in the
> pg_stat_activity?

It would be a good improvement for better trackability. But I think we
should do that in a separate patch as it's not just a problem for
parallel heap vacuum.

>
> 4.
> I'm not the expert TidStore, but as you said TidStoreLockExclusive() might be a
> bottleneck when tid is added to the shared TidStore. My another primitive idea
> is that to prepare per-worker TidStores (in the PHVScanWorkerState or LVRelCounters?)
> and gather after the heap scanning. If you extend like parallel workers do vacuuming,
> the gathering may not be needed: they can access own TidStore and clean up.
> One downside is that the memory consumption may be quite large.
>

Interesting idea. Suppose we support parallel heap vacuum as well, we
wouldn't need locks and to support shared-iteration on TidStore. I
think each worker should use a fraction of maintenance_work_mem.
However, one downside would be that we need to check as many TidStore
as workers during index vacuuming.

FYI I've implemented the parallel heap vacuum part and am doing some
benchmark tests. I'll share the updated patches along with test
results this week.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shinoda, Noriyoshi (SXD Japan FSIP) 2024-10-22 23:58:31 RE: Statistics Import and Export
Previous Message David G. Johnston 2024-10-22 23:49:28 Re: Unexpected table size usage for small composite arrays