Re: Parallel heap vacuum

From:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To:	John Naylor <johncnaylorls(at)gmail(dot)com>
Cc:	Tomas Vondra <tomas(at)vondra(dot)me>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Parallel heap vacuum
Date:	2025-02-14 08:38:24
Message-ID:	CAD21AoD6XZG9gK6rdb9_Swot2L29xCxYM6psWgy+nXswK5h5mw@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Feb 13, 2025 at 8:16 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>
> On Thu, Feb 13, 2025 at 5:37 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > During an eager vacuum scan, we reset the eager_scan_remaining_fails
> > counter when we start to scan a new region. So if we want to make
> > parallel heap vacuum behave exactly the same way as the
> > single-process vacuum in terms of the eager vacuum scan, we would
> > need to have eager_scan_remaining_fails counters for each region
> > so that the workers can decrement the one corresponding to the region
> > of the block that the worker is scanning. But I'm concerned that it
> > makes the logic very complex. I'd like to avoid making newly introduced
> > code more complex by adding yet more new code on top of that.
>
> Would it be simpler to make only phase III parallel?

Yes, I think so.

> In other words,
> how much of the infrastructure and complexity needed for parallel
> phase I is also needed for phase III?

Both phases need some common changes to the parallel vacuum
infrastructure so that we can launch parallel workers using
ParallelVacuumContext for phases I and III and so that parallel vacuum
workers can do their tasks on the heap table. Specifically, we need to
change vacuumparallel.c to also work in the parallel heap vacuum case,
and add some table AM callbacks. Other than that, the two phases differ
in complexity. For phase III, the changes to lazy vacuum would not be
very complex; it requires changing both the radix tree and TidStore to
support shared iteration. Supporting parallelism in phase I is more
complex since it integrates the parallel table scan with the lazy heap
scan.
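
As a rough illustration of what "shared iteration" could mean for
phase III, here is a minimal sketch in which several workers pull
entries from one shared cursor. It is not the radix tree or TidStore
API: SharedIterState, shared_iter_init() and shared_iter_next() are
hypothetical names, and the store is assumed to be a flat array of
entries indexed from 0.

```c
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical shared iteration state. In a real implementation this
 * would live in shared memory next to the store being iterated.
 */
typedef struct SharedIterState
{
    atomic_size_t next;      /* index of the next unclaimed entry */
    size_t        nentries;  /* total number of entries in the store */
} SharedIterState;

/* Called once by the leader before any worker starts iterating. */
static void
shared_iter_init(SharedIterState *st, size_t nentries)
{
    atomic_init(&st->next, 0);
    st->nentries = nentries;
}

/*
 * Called by each worker in a loop. Atomically claims the next entry.
 * Returns 1 and stores the claimed index in *idx, or returns 0 once
 * every entry has been handed out.
 */
static int
shared_iter_next(SharedIterState *st, size_t *idx)
{
    size_t i = atomic_fetch_add(&st->next, 1);

    if (i >= st->nentries)
        return 0;
    *idx = i;
    return 1;
}
```

Each worker would loop on shared_iter_next() and vacuum the heap block
behind the index it receives, so no entry is processed twice and no
lock is held across the work itself. The actual change would also need
the iterator to walk the tree structure rather than a flat array,
which this sketch leaves out.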

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Re: Parallel heap vacuum at 2025-02-14 04:16:06 from John Naylor
