From: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] Block level parallel vacuum |
Date: | 2019-11-11 07:35:22 |
Message-ID: | CAFiTN-trDu37W9u76oTQ+RGUVFoQ3zEvHdRPhXE3S_sWpTzkyw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> On Mon, 11 Nov 2019 at 15:06, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Mon, Nov 11, 2019 at 9:57 AM Masahiko Sawada
> > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > >
> > > On Fri, 8 Nov 2019 at 18:48, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > On Tue, Oct 29, 2019 at 12:37 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > >
> > > > > I realized that v31-0006 patch doesn't work fine so I've attached the
> > > > > updated version patch that also incorporated some comments I got so
> > > > > far. Sorry for the inconvenience. I'll apply your 0001 patch and also
> > > > > test the total delay time.
> > > > >
> > > >
> > > > + /*
> > > > + * Generally index cleanup does not scan the index when index
> > > > + * vacuuming (ambulkdelete) was already performed. So we perform
> > > > + * index cleanup with parallel workers only if we have not
> > > > + * performed index vacuuming yet. Otherwise, we do it in the
> > > > + * leader process alone.
> > > > + */
> > > > + if (vacrelstats->num_index_scans == 0)
> > > > + lazy_parallel_vacuum_or_cleanup_indexes(vacrelstats, Irel, nindexes,
> > > > + stats, lps);
> > > >
> > > > Today, I was thinking about this point where this check will work for
> > > > most cases, but still, exceptions are there like for brin index, the
> > > > main work is done in amvacuumcleanup function. Similarly, I think
> > > > there are few more indexes like gin, bloom where it seems we take
> > > > another pass over-index in the amvacuumcleanup phase. Don't you think
> > > > we should try to allow parallel workers for such cases? If so, I
> > > > don't have any great ideas on how to do that, but what comes to my
> > > > mind is to indicate that via stats (
> > > > IndexBulkDeleteResult) or via an indexam API. I am not sure if it is
> > > > acceptable to have indexam API for this.
> > > >
> > > > Thoughts?
> > >
> > > Good point. gin and bloom do a certain heavy work during cleanup and
> > > during bulkdelete as you mentioned. Brin does it during cleanup, and
> > > hash and gist do it during bulkdelete. There are three types of index
> > > AM just inside postgres code. An idea I came up with is that we can
> > > control parallel vacuum and parallel cleanup separately. That is,
> > > adding a variable amcanparallelcleanup and we can do parallel cleanup
> > > on only indexes of which amcanparallelcleanup is true. IndexBulkDelete
> > > can be stored locally if both amcanparallelvacuum and
> > > amcanparallelcleanup of an index are false because only the leader
> > > process deals with such indexes. Otherwise we need to store it in DSM
> > > as always.
> > >
> > IIUC, amcanparallelcleanup will be true for those indexes which does
> > heavy work during cleanup irrespective of whether bulkdelete is called
> > or not e.g. gin?
>
> Yes, I guess that gin and brin set amcanparallelcleanup to true (gin
> might set amcanparallevacuum to true as well).
>
> > If so, along with an amcanparallelcleanup flag, we
> > need to consider vacrelstats->num_index_scans right? So if
> > vacrelstats->num_index_scans == 0 then we need to launch parallel
> > worker for all the indexes who support amcanparallelvacuum and if
> > vacrelstats->num_index_scans > 0 then only for those who has
> > amcanparallelcleanup as true.
>
> Yes, you're right. But this won't work fine for brin indexes who don't
> want to participate in parallel vacuum but always want to participate
> in parallel cleanup.
Yeah, right.
>
> After more thoughts, I think we can have a ternary value: never,
> always, once. If it's 'never' the index never participates in parallel
> cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
> index always participates regardless of vacrelstats->num_index_scan. I
> guess gin, brin and bloom use 'always'. Finally if it's 'once' the
> index participates in parallel cleanup only when it's the first time
> (that is, vacrelstats->num_index_scan == 0), I guess btree, gist and
> spgist use 'once'.
Yeah, this make sense to me.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2019-11-11 07:51:11 | Re: 'Invalid lp' during heap_xlog_delete |
Previous Message | Ashutosh Sharma | 2019-11-11 07:30:00 | Re: tableam vs. TOAST |