Re: Parallel heap vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel heap vacuum
Date: 2025-04-06 05:01:34
Message-ID: CAD21AoA9eJ0Qx=3h77__K5ssj8R3KoVY3Uw5P7vux8HmJMRKBg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Apr 5, 2025 at 1:32 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2025-04-04 14:34:53 -0700, Masahiko Sawada wrote:
> > On Fri, Apr 4, 2025 at 11:05 AM Melanie Plageman
> > <melanieplageman(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Apr 1, 2025 at 5:30 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > >
> > > > I've attached the new version patch. There are no major changes; I
> > > > fixed some typos, improved the comment, and removed duplicated codes.
> > > > Also, I've updated the commit messages.
> > >
> > > I haven't looked closely at this version but I did notice that you do
> > > not document that parallel vacuum disables eager scanning. Imagine you
> > > are a user who has set the eager freeze related table storage option
> > > (vacuum_max_eager_freeze_failure_rate) and you schedule a regular
> > > parallel vacuum. Now that table storage option does nothing.
> >
> > Good point. That restriction should be mentioned in the documentation.
> > I'll update the patch.
>
> I don't think we commonly accept that a new feature B regresses a pre-existing
> feature A, particularly not if feature B is enabled by default. Why would that
> be OK here?

The eager freeze scan is the pre-existing feature, but it's pretty new
code that was pushed just a couple of months ago. I didn't want to
further complicate the newly introduced code within one major release,
especially since it's in the vacuum area. While I agree that disabling
eager freeze scans during parallel heap vacuum is not very
user-friendly, there are still many cases where parallel heap vacuum
helps even without the eager freeze scan. FYI, the parallel heap scan
can be disabled by setting min_parallel_table_scan_size. So I thought
we could incrementally improve this part.

>
>
> The justification in the code:
> + * One might think that it would make sense to use the eager scanning even
> + * during parallel lazy vacuum, but parallel vacuum is available only in
> + * VACUUM command and would not be something that happens frequently,
> + * which seems not fit to the purpose of the eager scanning. Also, it
> + * would require making the code complex. So it would make sense to
> + * disable it for now.
>
> feels not at all convincing to me. There e.g. are lots of places that run
> nightly vacuums. I don't think it's ok to just disable eager scanning in such
> a case, as it would mean that the "freeze cliff" would end up being *higher*
> because of the nightly vacuums than if just plain autovacuum would have been
> used.

That's a fair argument.

> I think it was already a mistake to allow the existing vacuum parallelism to
> be introduced without integrating it with autovacuum. I don't think we should
> go further down that road.

Okay, I think we can consider how to proceed with this patch, including
the above point, during v19 development.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
