Re: Eagerly scan all-visible pages to amortize aggressive vacuum

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Treat <rob(at)xzilla(dot)net>
Subject: Re: Eagerly scan all-visible pages to amortize aggressive vacuum
Date: 2025-01-13 21:46:47
Message-ID: CAAKRu_bEZHP6NhFZE4Ke4S036he+mvsK9ed5VeR2WYaZYqB+QQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 9, 2025 at 1:24 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> On 2025-01-07 15:46:26 -0500, Melanie Plageman wrote:
> > For table storage options, those related to vacuum but not autovacuum
> > are in the main StdRdOptions struct. Of those, some are overridden by
> > VACUUM command parameters which are parsed out into the VacuumParams
> > struct. Though the members of VacuumParams are initialized in
> > ExecVacuum(), the storage parameter overrides are determined in
> > vacuum_rel() and the final value goes in the VacuumParams struct which
> > is passed all the way through to heap_vacuum_rel().
> >
> > Because VacuumParams is what ultimately gets passed down to the
> > table-AM specific vacuum implementation, autovacuum also initializes
> > its own instance of VacuumParams in the autovac_table struct in
> > table_recheck_autovac() (even though no VACUUM command parameters can
> > affect autovacuum). These are overridden in vacuum_rel() as well.
> >
> > Ultimately vacuum_eager_scan_max_fails is a bit different from the
> > existing members of VacuumParams and StdRdOptions. It is a GUC and a
> > table storage option but not a SQL command parameter -- and both the
> > GUC and the table storage parameter affect both vacuum and autovacuum.
> > And it doesn't need to be initialized in different ways for autovacuum
> > and vacuum. In the end, I decided to follow the existing conventions
> > as closely as I could.
>
> I think that's fine. The abstractions in this area aren't exactly perfect, and
> I don't think this makes it worse in any meaningful way. It's not really
> different from having other heap-specific params like freeze_min_age in
> VacuumParams.

Got it. I've left it as is, then.
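
For anyone skimming the thread, the override order described in the quoted text can be sketched roughly as follows. This is not code from the patch. The struct and function names (SketchRelOptions, SketchVacuumParams, sketch_init_params, sketch_apply_reloptions), the int type, the default of 128, and the use of -1 for "unset" are all assumptions made for illustration. Only the general flow (GUC as the starting value, table storage option overriding it in vacuum_rel(), final value carried in the params struct) comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the GUC; the real type and default may differ. */
static int guc_eager_scan_max_fails = 128;

/* Hypothetical, trimmed-down stand-in for the table storage options. */
typedef struct SketchRelOptions
{
    int eager_scan_max_fails;   /* assumed convention: -1 means "not set" */
} SketchRelOptions;

/* Hypothetical, trimmed-down stand-in for VacuumParams. */
typedef struct SketchVacuumParams
{
    int eager_scan_max_fails;   /* final value handed to the table AM */
} SketchVacuumParams;

/*
 * Caller side (VACUUM command or autovacuum): there is no SQL command
 * parameter for this setting, so both callers start from the GUC.
 */
static void
sketch_init_params(SketchVacuumParams *params)
{
    assert(params != NULL);
    params->eager_scan_max_fails = guc_eager_scan_max_fails;
}

/*
 * Per-relation side: a table storage option, when set, overrides the
 * GUC-derived value. relopts may be NULL if the table has no options.
 */
static void
sketch_apply_reloptions(SketchVacuumParams *params,
                        const SketchRelOptions *relopts)
{
    assert(params != NULL);
    if (relopts != NULL && relopts->eager_scan_max_fails >= 0)
        params->eager_scan_max_fails = relopts->eager_scan_max_fails;
}
```

In this sketch, the same two calls serve both vacuum and autovacuum, which reflects the point that the setting does not need to be initialized differently for the two.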

Attached v6 is rebased over recent changes in the vacuum-related docs.
I've also updated the "Routine Vacuuming" section of the docs to
mention eager scanning.

Barring any objections, I'm planning to commit 0001 (which updates the
code comment at the top of vacuumlazy.c to explain heap vacuuming).

I've been running a few multi-day benchmarks to ensure that the patch
behaves the same in a "normal" timeframe as it did in a compressed
one.

So far, it looks good. For a multi-day transactional benchmark with a
Gaussian data access pattern, it looks about the same as a shorter
version (that is, aggressive vacuums are much shorter and there is no
difference when compared to master WRT total WAL volume, TPS, etc.).

The final long benchmark I'm waiting on is a hot tail workload with
a job that deletes old data.

- Melanie

Attachment                                                        Content-Type  Size
v6-0002-Eagerly-scan-all-visible-pages-to-amortize-aggres.patch   text/x-patch  39.3 KB
v6-0001-Add-more-general-summary-to-vacuumlazy.c.patch            text/x-patch  3.6 KB
