From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-11-13 03:42:57
Message-ID: CAFiTN-vY2pp7ujO+UPZ=9HX=8Ltuf+24BWmGu9PvgZBB2q3XcA@mail.gmail.com
Lists: pgsql-hackers
On Tue, Nov 12, 2019 at 5:31 PM Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> On Tue, 12 Nov 2019 at 20:29, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Tue, Nov 12, 2019 at 4:04 PM Masahiko Sawada
> > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > >
> > > On Mon, 11 Nov 2019 at 17:57, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > >
> > > > On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > I realized that v31-0006 patch doesn't work fine so I've attached the
> > > > > updated version patch that also incorporated some comments I got so
> > > > > far. Sorry for the inconvenience. I'll apply your 0001 patch and also
> > > > > test the total delay time.
> > > > >
> > > > While reviewing the 0002 patch, I had a doubt about how we are
> > > > dividing maintenance_work_mem:
> > > >
> > > > +prepare_index_statistics(LVShared *lvshared, Relation *Irel, int nindexes)
> > > > +{
> > > > +    /* Compute the new maintenance_work_mem value for index vacuuming */
> > > > +    lvshared->maintenance_work_mem_worker =
> > > > +        (nindexes_mwm > 0) ? maintenance_work_mem / nindexes_mwm :
> > > > +                             maintenance_work_mem;
> > > > +}
> > > > Is it fair to consider only the number of indexes that use
> > > > maintenance_work_mem, or do we need to consider the number of
> > > > workers as well? My point is: suppose there are 10 indexes that
> > > > will use maintenance_work_mem but we are launching just 2 workers;
> > > > then what is the point of dividing maintenance_work_mem by 10?
> > > >
> > > > IMHO the calculation should be like this:
> > > > lvshared->maintenance_work_mem_worker = (nindexes_mwm > 0) ?
> > > > maintenance_work_mem / Min(nindexes_mwm, nworkers) :
> > > > maintenance_work_mem;
> > > >
> > > > Am I missing something?
> > >
> > > No, I think you're right. On the other hand, I think that dividing it
> > > by the number of indexes that will use maintenance_work_mem makes
> > > sense when the parallel degree > the number of such indexes. Suppose
> > > the table has 2 indexes and there are 10 workers; then we should
> > > divide maintenance_work_mem by 2 rather than 10, because at most 2
> > > indexes that use maintenance_work_mem can be processed in parallel
> > > at a time.
> > >
> > Right, that's the reason I suggested dividing by Min(nindexes_mwm, nworkers).
>
> Thanks! I'll fix it in the next version of the patch.
>
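Just to make the arithmetic of the fix concrete: with maintenance_work_mem =
1GB, nindexes_mwm = 10, and only 2 workers launched, dividing by nindexes_mwm
caps each worker at ~100MB even though at most 2 of those indexes are ever
vacuumed concurrently, whereas dividing by Min(nindexes_mwm, nworkers) = 2
gives each worker 512MB. A minimal sketch of the computation (illustrative
only, not the final patch; nworkers here stands for the number of workers
actually launched):

    /*
     * Cap the divisor at the number of launched workers: at most
     * Min(nindexes_mwm, nworkers) indexes that consume
     * maintenance_work_mem can be vacuumed at the same time.
     */
    lvshared->maintenance_work_mem_worker =
        (nindexes_mwm > 0)
        ? maintenance_work_mem / Min(nindexes_mwm, nworkers)
        : maintenance_work_mem;
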
One more comment:
+lazy_vacuum_indexes(LVRelStats *vacrelstats, Relation *Irel,
+                    int nindexes, IndexBulkDeleteResult **stats,
+                    LVParallelState *lps)
+{
+    ....
+    if (ParallelVacuumIsActive(lps))
+    {
+        lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
+                                                stats, lps);
+    }
+
+    for (idx = 0; idx < nindexes; idx++)
+    {
+        /*
+         * Skip indexes that we have already vacuumed during parallel index
+         * vacuuming.
+         */
+        if (ParallelVacuumIsActive(lps) && !IndStatsIsNull(lps->lvshared, idx))
+            continue;
+
+        lazy_vacuum_index(Irel[idx], &stats[idx], vacrelstats->dead_tuples,
+                          vacrelstats->old_live_tuples);
+    }
+}
In this function, if ParallelVacuumIsActive(lps), we first perform a
parallel vacuum of all the indexes for which parallel vacuum is
supported, and once that is over we vacuum the remaining indexes
sequentially. But my question is: inside
lazy_parallel_vacuum_or_cleanup_indexes, we wait for all the workers
to finish their job and only then start the sequential vacuuming.
Shouldn't we start that immediately, as soon as the leader's
participation in the parallel vacuum is over?
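
Something like the following rough sketch is what I have in mind. This is
just to illustrate the ordering; I'm assuming the launch/participate/wait
steps inside lazy_parallel_vacuum_or_cleanup_indexes can be separated as
shown, that lps carries the ParallelContext as lps->pcxt, and the helper
name do_leader_parallel_index_vacuum is hypothetical:

    if (ParallelVacuumIsActive(lps))
    {
        /* Launch workers and let the leader join the parallel index vacuum. */
        LaunchParallelWorkers(lps->pcxt);
        do_leader_parallel_index_vacuum(vacrelstats, Irel, nindexes, stats, lps);

        /*
         * While the workers are still busy, let the leader vacuum the
         * indexes that do not support parallel vacuum ...
         */
        for (idx = 0; idx < nindexes; idx++)
        {
            if (IndStatsIsNull(lps->lvshared, idx))
                lazy_vacuum_index(Irel[idx], &stats[idx],
                                  vacrelstats->dead_tuples,
                                  vacrelstats->old_live_tuples);
        }

        /* ... and only then wait for the workers to finish. */
        WaitForParallelWorkersToFinish(lps->pcxt);
    }

That way the leader overlaps the sequential index vacuuming with the
workers' remaining parallel work instead of blocking on them first.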
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com