From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Parallel heap vacuum
Date: 2025-04-28 18:07:20
Message-ID: CAD21AoC+enckFLHGeDYzHZ33GP_D2sGa=c45cf8kHGtoFj5vuA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Apr 5, 2025 at 1:17 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Fri, Apr 4, 2025 at 5:35 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > > I haven't looked closely at this version but I did notice that you do
> > > not document that parallel vacuum disables eager scanning. Imagine you
> > > are a user who has set the eager freeze related table storage option
> > > (vacuum_max_eager_freeze_failure_rate) and you schedule a regular
> > > parallel vacuum. Now that table storage option does nothing.
> >
> > Good point. That restriction should be mentioned in the documentation.
> > I'll update the patch.
>
> Yea, I mean, to be honest, when I initially replied to the thread
> saying I thought temporarily disabling eager scanning for parallel
> heap vacuuming was viable, I hadn't looked at the patch yet and
> thought that there was a separate way to enable the new parallel heap
> vacuum (separate from the parallel option for the existing parallel
> index vacuuming). I don't like that this disables functionality that
> worked when I pushed the eager scanning feature.

Thank you for sharing your opinion. I think this is one of the main
points we need to address before this patch can be completed.

After considering various approaches to integrating parallel heap
vacuum with eager freeze scanning, one viable solution would be to
implement a dedicated parallel scan mechanism for the parallel lazy
scan, rather than relying on the table_block_parallelscan_xxx()
facility. This approach would divide the table into chunks of 4,096
blocks, the same size as the regions used by eager freeze scanning,
and each parallel worker would perform eager freeze scanning while
maintaining its own local failure count and a shared success count.
This straightforward approach offers an additional advantage: since
the chunk size remains constant, we can apply the SKIP_PAGES_THRESHOLD
optimization consistently throughout the table, including its final
sections.
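
To make that concrete, below is a minimal standalone sketch in C. It
uses ordinary threads and atomics rather than PostgreSQL's shared
memory and parallel-worker machinery, and ScanShared, scan_block(),
and FAILURE_CAP are invented stand-ins for the real eager-scan
bookkeeping, so treat it as an illustration of the scheme, not patch
code. Each worker claims a fixed 4,096-block chunk with a single
atomic fetch-add, keeps a local failure count that resets per chunk,
and bumps a shared success counter:

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define CHUNK_BLOCKS 4096       /* fixed chunk size, matching the eager-scan region size */
#define FAILURE_CAP  128        /* hypothetical per-chunk eager-freeze failure cap */
#define NWORKERS     4

typedef struct
{
    atomic_uint next_chunk;      /* next unclaimed chunk number */
    atomic_uint eager_successes; /* shared count of pages frozen eagerly */
    unsigned    nblocks;         /* total blocks in the table */
} ScanShared;

/* Stand-in for the per-page work; returns true if an eager freeze succeeded. */
static bool
scan_block(unsigned blkno, bool eager)
{
    return eager && blkno % 7 == 0;
}

static void *
worker_main(void *arg)
{
    ScanShared *shared = arg;

    for (;;)
    {
        unsigned chunk = atomic_fetch_add(&shared->next_chunk, 1);
        unsigned start = chunk * CHUNK_BLOCKS;
        unsigned end;
        unsigned failures = 0;  /* worker-local, reset for each chunk */

        if (start >= shared->nblocks)
            break;              /* table exhausted */

        end = start + CHUNK_BLOCKS;
        if (end > shared->nblocks)
            end = shared->nblocks;

        for (unsigned blk = start; blk < end; blk++)
        {
            /* Eager freezing stays enabled until this chunk's failure cap is hit. */
            bool eager = failures < FAILURE_CAP;

            if (scan_block(blk, eager))
                atomic_fetch_add(&shared->eager_successes, 1);
            else if (eager)
                failures++;     /* simplification: count every eager non-success */
        }
    }
    return NULL;
}

int
main(void)
{
    ScanShared shared = {.nblocks = 100000};
    pthread_t workers[NWORKERS];

    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&workers[i], NULL, worker_main, &shared);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(workers[i], NULL);

    printf("eagerly frozen pages: %u\n",
           atomic_load(&shared.eager_successes));
    return 0;
}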

However, this approach presents certain challenges. First, we would
need to maintain a separate implementation of lazy vacuum's parallel
scan alongside the table_block_parallelscan_xxx() facility, increasing
maintenance overhead. Additionally, a fixed chunk size across the
entire table might prove less efficient near the table's end than the
dynamic approach used by table_block_parallelscan_nextpage(), which
ramps the chunk size down toward the end of the scan so that workers
finish at roughly the same time.
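
For contrast, here is a toy sketch of that ramp-down idea; the
constants and next_chunk_size() are invented for illustration and do
not reproduce the actual table_block_parallelscan_nextpage() logic:

#include <stdio.h>

#define MAX_CHUNK 8192          /* hypothetical starting chunk size */
#define MIN_CHUNK 1

/* Shrink chunks once few blocks remain, so no worker is stuck with a
 * disproportionately large final chunk while the others sit idle. */
static unsigned
next_chunk_size(unsigned remaining)
{
    unsigned size = MAX_CHUNK;

    /* Once fewer than ~8 full chunks remain, halve the chunk size so
     * the tail of the table is divided more finely among workers. */
    while (size > MIN_CHUNK && size * 8 > remaining)
        size /= 2;

    return size;
}

int
main(void)
{
    unsigned remaining = 200000;        /* pretend table size in blocks */

    while (remaining > 0)
    {
        unsigned size = next_chunk_size(remaining);

        if (size > remaining)
            size = remaining;
        printf("claim chunk of %u blocks (%u left)\n", size, remaining - size);
        remaining -= size;
    }
    return 0;
}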

Feedback is very welcome.

Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com