From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Parallel heap vacuum
Date: 2025-04-28 18:07:20
Message-ID: CAD21AoC+enckFLHGeDYzHZ33GP_D2sGa=c45cf8kHGtoFj5vuA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Apr 5, 2025 at 1:17 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Fri, Apr 4, 2025 at 5:35 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > > I haven't looked closely at this version but I did notice that you do
> > > not document that parallel vacuum disables eager scanning. Imagine you
> > > are a user who has set the eager freeze related table storage option
> > > (vacuum_max_eager_freeze_failure_rate) and you schedule a regular
> > > parallel vacuum. Now that table storage option does nothing.
> >
> > Good point. That restriction should be mentioned in the documentation.
> > I'll update the patch.
>
> Yea, I mean, to be honest, when I initially replied to the thread
> saying I thought temporarily disabling eager scanning for parallel
> heap vacuuming was viable, I hadn't looked at the patch yet and
> thought that there was a separate way to enable the new parallel heap
> vacuum (separate from the parallel option for the existing parallel
> index vacuuming). I don't like that this disables functionality that
> worked when I pushed the eager scanning feature.

Thank you for sharing your opinion. I think this is one of the main
points we need to address before this patch can be completed.

After considering various approaches to integrating parallel heap
vacuum with eager freeze scanning, one viable solution would be to
implement a dedicated parallel scan mechanism for the parallel lazy
scan, rather than relying on the table_block_parallelscan_xxx()
facility. This approach would divide the table into chunks of 4,096
blocks, the same size as the regions used by eager freeze scanning,
and each parallel worker would perform eager freeze scanning while
maintaining its own local failure count and a shared success count.
This straightforward approach offers an additional advantage: since
the chunk size remains constant, we can apply the SKIP_PAGES_THRESHOLD
optimization consistently throughout the table, including its final
sections.
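
To make that concrete, below is a minimal standalone sketch in C. It
uses ordinary threads and atomics rather than PostgreSQL's shared
memory and parallel-worker machinery, and ScanShared, scan_block(),
and FAILURE_CAP are invented stand-ins for the real eager-scan
bookkeeping, so treat it as an illustration of the scheme, not patch
code. Each worker claims a fixed 4,096-block chunk with a single
atomic fetch-add, keeps a local failure count that resets per chunk,
and bumps a shared success counter:

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define CHUNK_BLOCKS 4096       /* fixed chunk size, matching the eager-scan region size */
#define FAILURE_CAP  128        /* hypothetical per-chunk eager-freeze failure cap */
#define NWORKERS     4

typedef struct
{
    atomic_uint next_chunk;      /* next unclaimed chunk number */
    atomic_uint eager_successes; /* shared count of pages frozen eagerly */
    unsigned    nblocks;         /* total blocks in the table */
} ScanShared;

/* Stand-in for the per-page work; returns true if an eager freeze succeeded. */
static bool
scan_block(unsigned blkno, bool eager)
{
    return eager && blkno % 7 == 0;
}

static void *
worker_main(void *arg)
{
    ScanShared *shared = arg;

    for (;;)
    {
        unsigned chunk = atomic_fetch_add(&shared->next_chunk, 1);
        unsigned start = chunk * CHUNK_BLOCKS;
        unsigned end;
        unsigned failures = 0;  /* worker-local, reset for each chunk */

        if (start >= shared->nblocks)
            break;              /* table exhausted */

        end = start + CHUNK_BLOCKS;
        if (end > shared->nblocks)
            end = shared->nblocks;

        for (unsigned blk = start; blk < end; blk++)
        {
            /* Eager freezing stays enabled until this chunk's failure cap is hit. */
            bool eager = failures < FAILURE_CAP;

            if (scan_block(blk, eager))
                atomic_fetch_add(&shared->eager_successes, 1);
            else if (eager)
                failures++;     /* simplification: count every eager non-success */
        }
    }
    return NULL;
}

int
main(void)
{
    ScanShared shared = {.nblocks = 100000};
    pthread_t workers[NWORKERS];

    for (int i = 0; i < NWORKERS; i++)
        pthread_create(&workers[i], NULL, worker_main, &shared);
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(workers[i], NULL);

    printf("eagerly frozen pages: %u\n",
           atomic_load(&shared.eager_successes));
    return 0;
}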

However, this approach presents certain challenges. First, we would
need to maintain a separate implementation of lazy vacuum's parallel
scan alongside the table_block_parallelscan_xxx() facility, increasing
maintenance overhead. Additionally, a fixed chunk size across the
entire table might prove less efficient near the table's end than the
dynamic approach used by table_block_parallelscan_nextpage(), which
ramps the chunk size down toward the end of the scan so that workers
finish at roughly the same time.
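
For contrast, here is a toy sketch of that ramp-down idea; the
constants and next_chunk_size() are invented for illustration and do
not reproduce the actual table_block_parallelscan_nextpage() logic:

#include <stdio.h>

#define MAX_CHUNK 8192          /* hypothetical starting chunk size */
#define MIN_CHUNK 1

/* Shrink chunks once few blocks remain, so no worker is stuck with a
 * disproportionately large final chunk while the others sit idle. */
static unsigned
next_chunk_size(unsigned remaining)
{
    unsigned size = MAX_CHUNK;

    /* Once fewer than ~8 full chunks remain, halve the chunk size so
     * the tail of the table is divided more finely among workers. */
    while (size > MIN_CHUNK && size * 8 > remaining)
        size /= 2;

    return size;
}

int
main(void)
{
    unsigned remaining = 200000;        /* pretend table size in blocks */

    while (remaining > 0)
    {
        unsigned size = next_chunk_size(remaining);

        if (size > remaining)
            size = remaining;
        printf("claim chunk of %u blocks (%u left)\n", size, remaining - size);
        remaining -= size;
    }
    return 0;
}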

Feedback is very welcome.

Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com