Re: Summarizing indexes allowing single-phase VACUUM?

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Summarizing indexes allowing single-phase VACUUM?
Date: 2025-04-25 17:07:32
Message-ID: CAD21AoBta=4GHkc5a3QNyHUWU-47SvC6SpV7c3u0Dxaajo6rHg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 9, 2025 at 2:47 PM Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
>
> Hi,
>
> With PG16, we got Index AM -level indications for summarizing indexes,
> improving HOT applicability.
>
> Given that these indexes explicitly never store TIDs and thus don't
> really need the cleanup part of vacuum's index cleanup, we could well
> decide to not count those indexes to the indexes we need to clean up
> in the index scan phase of vacuum, thus allowing tables with only
> summarizing indexes to use the single-scan option of vacuum.
>
> As a summarizing index may still want to do some housekeeping during
> vacuum, it'd still need to get its cleanup functions called, but it'd
> at least save the io and WAL of the cleanup scan.
>
> Any comments/suggestions? POC patch attached.

I like this idea.

Here are some comments on the PoC patch:

- if (vacrel->nindexes == 0
+ if (vacrel->indallsummarizing

I'm not a fan of this change as it isn't obvious from the name
indallsummarizing that the table doesn't have any indexes. I think we
can keep nindexes==0 or rename indallsummarizing to something like
use_onepass_strategy if we want to use merge them.

---
@@ -2536,8 +2544,12 @@ lazy_vacuum(LVRelState *vacrel)
/*
* We successfully completed a round of index vacuuming. Do related
* heap vacuuming now.
+ *
+ * If all valid indexes are summarizing, then the TIDs have already
+ * been reclaimed, requiring us to skip that last phase.
*/
- lazy_vacuum_heap_rel(vacrel);
+ if (!vacrel->indallsummarizing)
+ lazy_vacuum_heap_rel(vacrel);

IIUC if indallsummarizing is true we should never reach lazy_vacuum().

---
@@ -241,8 +242,9 @@ static void parallel_vacuum_error_callback(void *arg);
*/
ParallelVacuumState *
parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
- int nrequested_workers, int vac_work_mem,
- int elevel, BufferAccessStrategy bstrategy)
+ bool indallsummarizing, int nrequested_workers,
+ int vac_work_mem, int elevel,
+ BufferAccessStrategy bstrategy)
{
ParallelVacuumState *pvs;
ParallelContext *pcxt;
@@ -282,6 +284,7 @@ parallel_vacuum_init(Relation rel, Relation
*indrels, int nindexes,
pvs = (ParallelVacuumState *) palloc0(sizeof(ParallelVacuumState));
pvs->indrels = indrels;
pvs->nindexes = nindexes;
+ pvs->indallsummarizing = nindexes;
pvs->will_parallel_vacuum = will_parallel_vacuum;
pvs->bstrategy = bstrategy;
pvs->heaprel = rel;

We should set pvs->indallsummarizing to indallsummarizing, not
nindexes, but ISTM we don't use pvs->indallsummazing anywhere in
vacuumparallel.c. Do we need to pass indallsummarizing to
parallel_vacuum_init() in the first place?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-04-25 17:16:24 Re: gcc 15.1 warnings - jsonb_util.c
Previous Message Andres Freund 2025-04-25 16:44:14 Re: AIO v2.5