From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | vignesh C <vignesh21(at)gmail(dot)com> |
Cc: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel heap vacuum |
Date: | 2024-11-19 00:17:25 |
Message-ID: | CAD21AoC8T13bvR46hPXj2hjAnUam2OXR2qweXaHUnrDw=VCEuw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Nov 12, 2024 at 3:21 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Wed, 30 Oct 2024 at 22:48, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> >
> > I've attached new version patches that fixes failures reported by
> > cfbot. I hope these changes make cfbot happy.
> >
>
> I just started reviewing the patch and found the following comments
> while going through the patch:
> 1) I felt we should add some documentation for this at [1].
>
> 2) Can we add some tests in vacuum_parallel with tables having no
> indexes and having dead tuples.
>
> 3) This should be included in typedefs.list:
> 3.a)
> +/*
> + * Relation statistics collected during heap scanning and need to be
> shared among
> + * parallel vacuum workers.
> + */
> +typedef struct LVRelScanStats
> +{
> + BlockNumber scanned_pages; /* # pages examined (not
> skipped via VM) */
> + BlockNumber removed_pages; /* # pages removed by relation
> truncation */
> + BlockNumber frozen_pages; /* # pages with newly frozen tuples */
>
> 3.b) Similarly this too:
> +/*
> + * Struct for information that need to be shared among parallel vacuum workers
> + */
> +typedef struct PHVShared
> +{
> + bool aggressive;
> + bool skipwithvm;
> +
>
> 3.c) Similarly this too:
> +/* Per-worker scan state for parallel heap vacuum scan */
> +typedef struct PHVScanWorkerState
> +{
> + bool initialized;
>
> 3.d) Similarly this too:
> +/* Struct for parallel heap vacuum */
> +typedef struct PHVState
> +{
> + /* Parallel scan description shared among parallel workers */
>
> 4) Since we are initializing almost all the members of structure,
> should we use palloc0 in this case:
> + scan_stats = palloc(sizeof(LVRelScanStats));
> + scan_stats->scanned_pages = 0;
> + scan_stats->removed_pages = 0;
> + scan_stats->frozen_pages = 0;
> + scan_stats->lpdead_item_pages = 0;
> + scan_stats->missed_dead_pages = 0;
> + scan_stats->nonempty_pages = 0;
> +
> + /* Initialize remaining counters (be tidy) */
> + scan_stats->tuples_deleted = 0;
> + scan_stats->tuples_frozen = 0;
> + scan_stats->lpdead_items = 0;
> + scan_stats->live_tuples = 0;
> + scan_stats->recently_dead_tuples = 0;
> + scan_stats->missed_dead_tuples = 0;
>
> 5) typo paralle should be parallel
> +/*
> + * Return the number of parallel workers for a parallel vacuum scan of this
> + * relation.
> + */
> +static inline int
> +table_paralle_vacuum_compute_workers(Relation rel, int requested)
> +{
> + return rel->rd_tableam->parallel_vacuum_compute_workers(rel, requested);
> +}
>
Thank you for the comments! I'll address these comments in the next
version patch.
BTW while updating the patch, I found that we might want to launch
different numbers of workers for scanning heap and vacuuming heap. The
number of parallel workers is determined based on the number of blocks
in the table. However, even if this number is high, it could happen
that we want to launch fewer workers to vacuum heap pages when there
are not many pages having garbage. And the number of workers for
vacuuming heap could vary on each vacuum pass. I'm considering
implementing it.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2024-11-19 00:21:17 | Re: memory leak in pgoutput |
Previous Message | Masahiko Sawada | 2024-11-19 00:02:15 | Re: Fix an error while building test_radixtree.c with TEST_SHARED_RT |