Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-10-07 11:21:03
Message-ID: CAD21AoC6GeT2=hG0QveB3e3bA0dF=jGrFZ3Vu=bcjsXBsdA3tA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Oct 4, 2019 at 7:05 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Fri, Oct 4, 2019 at 11:01 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Fri, Oct 4, 2019 at 10:28 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >>
> Some more comments..

Thank you!

> 1.
> + for (idx = 0; idx < nindexes; idx++)
> + {
> + if (!for_cleanup)
> + lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
> + vacrelstats->old_live_tuples);
> + else
> + {
> + /* Cleanup one index and update index statistics */
> + lazy_cleanup_index(Irel[idx], &stats[idx], vacrelstats->new_rel_tuples,
> + vacrelstats->tupcount_pages < vacrelstats->rel_pages);
> +
> + lazy_update_index_statistics(Irel[idx], stats[idx]);
> +
> + if (stats[idx])
> + pfree(stats[idx]);
> + }
>
> I think instead of checking the for_cleanup variable for every index in
> the loop, we had better move the loop inside the branches, as shown below?

Fixed.

>
> if (!for_cleanup)
> for (idx = 0; idx < nindexes; idx++)
> lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
> else
> for (idx = 0; idx < nindexes; idx++)
> {
> lazy_cleanup_index
> lazy_update_index_statistics
> ...
> }
>
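
For illustration, the restructuring suggested in comment 1 can be sketched
as below. This is only a minimal sketch, not code from the patch:
DemoCounts and demo_process_indexes are hypothetical stand-ins, and the
counters replace the real per-index vacuum and cleanup calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters standing in for the real per-index work. */
typedef struct DemoCounts
{
    int vacuumed;   /* indexes handled by the bulk-delete pass */
    int cleaned;    /* indexes handled by the cleanup pass */
} DemoCounts;

/*
 * for_cleanup does not change during the loop, so it is tested once and
 * each branch gets its own loop instead of testing it per index.
 */
static void
demo_process_indexes(DemoCounts *counts, int nindexes, bool for_cleanup)
{
    int idx;

    assert(counts != NULL);

    if (!for_cleanup)
    {
        for (idx = 0; idx < nindexes; idx++)
            counts->vacuumed++;     /* stand-in for vacuuming one index */
    }
    else
    {
        for (idx = 0; idx < nindexes; idx++)
            counts->cleaned++;      /* stand-in for cleanup plus stats update */
    }
}
```
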
> 2.
> +static void
> +lazy_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
> + int nindexes, IndexBulkDeleteResult **stats,
> + LVParallelState *lps, bool for_cleanup)
> +{
> + int idx;
> +
> + Assert(!IsParallelWorker());
> +
> + /* no job if the table has no index */
> + if (nindexes <= 0)
> + return;
>
> Wouldn't it be a good idea to call this function only if nindexes > 0?
>

I realized that the callers of this function should always pass
nindexes > 0, because they call it only to do index vacuuming or index
cleanup. So the check should be an assertion rather than an early
return. Thoughts?
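
The difference between the two options can be sketched as follows. This
is a minimal, self-contained illustration rather than code from the
patch: the demo_* names are hypothetical, and the standard C assert()
stands in for PostgreSQL's Assert() macro, which is checked only in
assert-enabled builds.

```c
#include <assert.h>

/* Hypothetical counter so the effect of each call can be observed. */
static int demo_vacuum_calls = 0;

/* Stand-in for the work done on one index. */
static void
demo_vacuum_one_index(int idx)
{
    (void) idx;
    demo_vacuum_calls++;
}

/*
 * The callee documents its precondition with an assertion instead of
 * silently returning when there is nothing to do.
 */
static void
demo_vacuum_indexes(int nindexes)
{
    int idx;

    assert(nindexes > 0);

    for (idx = 0; idx < nindexes; idx++)
        demo_vacuum_one_index(idx);
}

/* The caller is responsible for skipping the call on index-less tables. */
static void
demo_vacuum_table(int nindexes)
{
    if (nindexes > 0)
        demo_vacuum_indexes(nindexes);
}
```

With this shape, a caller that forgets the nindexes check fails loudly
in an assert-enabled build instead of being masked by a silent return.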

> 3.
> +/*
> + * Vacuum or cleanup indexes with parallel workers. This function must be used
> + * by the parallel vacuum leader process.
> + */
> +static void
> +lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel,
> + int nindexes, IndexBulkDeleteResult **stats,
> + LVParallelState *lps, bool for_cleanup)
>
> If you look at this function, there is not much common code between the
> for_cleanup and non-for_cleanup paths except these 3-4 statements:
> LaunchParallelWorkers(lps->pcxt);
> /* Create the log message to report */
> initStringInfo(&buf);
> ...
> /* Wait for all vacuum workers to finish */
> WaitForParallelWorkersToFinish(lps->pcxt);
>
> Other than that, you have a lot of checks like this:
> + if (!for_cleanup)
> + {
> + }
> + else
> + {
> }
>
> I think the code would be much more readable if we had 2 functions: one
> for vacuum (lazy_parallel_vacuum_indexes) and another for
> cleanup (lazy_parallel_cleanup_indexes).

Seems like a good idea. Fixed.
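
A rough sketch of what the split in comment 3 looks like: two entry
points that share only the launch and wait steps. Everything here is a
hypothetical stand-in (DemoState and the demo_* functions are not from
the patch); the real functions would call LaunchParallelWorkers() and
WaitForParallelWorkersToFinish() on lps->pcxt instead of bumping
counters.

```c
#include <assert.h>

/* Hypothetical state recording what each phase did. */
typedef struct DemoState
{
    int launched;   /* number of times workers were launched */
    int waited;     /* number of times the leader waited for workers */
    int vacuumed;   /* indexes processed by the vacuum phase */
    int cleaned;    /* indexes processed by the cleanup phase */
} DemoState;

/* Shared steps, standing in for launching and waiting for workers. */
static void demo_launch_workers(DemoState *s) { s->launched++; }
static void demo_wait_for_workers(DemoState *s) { s->waited++; }

/* Vacuum phase only: no for_cleanup flag, no branching. */
static void
demo_parallel_vacuum_indexes(DemoState *s, int nindexes)
{
    int idx;

    assert(nindexes > 0);

    demo_launch_workers(s);
    for (idx = 0; idx < nindexes; idx++)
        s->vacuumed++;
    demo_wait_for_workers(s);
}

/* Cleanup phase only: same skeleton, different per-index work. */
static void
demo_parallel_cleanup_indexes(DemoState *s, int nindexes)
{
    int idx;

    assert(nindexes > 0);

    demo_launch_workers(s);
    for (idx = 0; idx < nindexes; idx++)
        s->cleaned++;
    demo_wait_for_workers(s);
}
```

The few duplicated lines are the cost of dropping the repeated
if (!for_cleanup) / else blocks from a single combined function.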

>
> 4.
> * of index scans performed. So we don't use maintenance_work_mem memory for
> * the TID array, just enough to hold as many heap tuples as fit on one page.
> *
> + * Lazy vacuum supports parallel execution with parallel worker processes. In
> + * parallel lazy vacuum, we perform both index vacuuming and index cleanup with
> + * parallel worker processes. Individual indexes are processed by one vacuum
>
> Spacing after the "." is not uniform: the previous comment uses 2
> spaces and the newly added one uses 1 space.
>

Fixed.

The code has been fixed in my local repository. After incorporating all
the comments I have received so far, I'll submit an updated version of
the patch.

Regards,

--
Masahiko Sawada
