Re: [HACKERS] Block level parallel vacuum

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-11-12 08:55:19
Message-ID: CAA4eK1KfAXWNWtJi+gj9Yehr4tMptKk1dM+eOE=cBQqj3yeR4Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 12, 2019 at 7:43 AM Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> On Mon, 11 Nov 2019 at 19:29, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, Nov 11, 2019 at 12:26 PM Masahiko Sawada
> > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > >
> > > After more thoughts, I think we can have a ternary value: never,
> > > always, once. If it's 'never' the index never participates in parallel
> > > cleanup. I guess hash indexes use 'never'. Next, if it's 'always' the
> > > index always participates regardless of vacrelstats->num_index_scan. I
> > > guess gin, brin and bloom use 'always'. Finally if it's 'once' the
> > > index participates in parallel cleanup only when it's the first time
> > > (that is, vacrelstats->num_index_scan == 0), I guess btree, gist and
> > > spgist use 'once'.
> > >
> >
> > I think this 'once' option is confusing especially because it also
> > depends on 'num_index_scans' which the IndexAM has no control over.
> > It might be that the option name is not good, but I am not sure.
> > Another thing is that for brin indexes, we don't want bulkdelete to
> > participate in parallelism.
>
> I thought brin should set amcanparallelvacuum to false and
> amcanparallelcleanup to 'always'.
>

In that case, it is better to name the variable as amcanparallelbulkdelete.

> > Do we want to have separate variables for
> > ambulkdelete and amvacuumcleanup which decides whether the particular
> > phase can be done in parallel?
>
> You mean adding variables to ambulkdelete and amvacuumcleanup as
> function arguments?
>

No, I mean two separate variables: amcanparallelbulkdelete (bool) and
amcanparallelvacuumcleanup (uint16).
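
As a rough illustration of the two-variable idea, here is a minimal C sketch. It assumes the uint16 field carries the never/once/always values proposed earlier in the thread and that 'once' means num_index_scans == 0. The struct, the enum, and the helper function are hypothetical names made up for this sketch; only the two field names come from the discussion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding of the never/once/always idea (not from the thread). */
enum SketchParallelCleanup
{
    SKETCH_CLEANUP_NEVER = 0,   /* index never joins parallel cleanup */
    SKETCH_CLEANUP_ONCE = 1,    /* only when no bulkdelete pass has run */
    SKETCH_CLEANUP_ALWAYS = 2   /* regardless of earlier index scans */
};

/* Hypothetical stand-in for the relevant part of an index AM descriptor. */
typedef struct SketchIndexAm
{
    bool     amcanparallelbulkdelete;      /* can bulkdelete run in parallel? */
    uint16_t amcanparallelvacuumcleanup;   /* one of SketchParallelCleanup */
} SketchIndexAm;

/* Decide whether the cleanup phase may run in a parallel worker. */
static bool
sketch_cleanup_in_parallel(const SketchIndexAm *am, int num_index_scans)
{
    switch (am->amcanparallelvacuumcleanup)
    {
        case SKETCH_CLEANUP_ALWAYS:
            return true;
        case SKETCH_CLEANUP_ONCE:
            return num_index_scans == 0;
        default:
            return false;
    }
}
```

Under these assumptions, brin as described above would be {false, SKETCH_CLEANUP_ALWAYS}, while an index using 'once' would take part in parallel cleanup only when num_index_scans is 0.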

>
> > Another possibility could be to just
> > have one variable (say uint16 amparallelvacuum) which will tell us all
> > the options but I don't think that will be a popular approach
> > considering all the other methods and variables exposed. What do you
> > think?
>
> Adding only one variable that can have flags would also be a good
> idea, instead of having multiple variables for each option. For
> instance FDW API uses such interface (see eflags of BeginForeignScan).
>

Yeah, maybe something like amparallelvacuumoptions. The options can be:

VACUUM_OPTION_NO_PARALLEL 0
    # vacuum (neither bulkdelete nor vacuumcleanup) can't be performed
    # in parallel
VACUUM_OPTION_NO_PARALLEL_CLEANUP 1
    # vacuumcleanup cannot be performed in parallel (hash index will
    # set this flag)
VACUUM_OPTION_PARALLEL_BULKDEL 2
    # bulkdelete can be done in parallel (indexes nbtree, hash, gin,
    # gist, spgist, bloom will set this flag)
VACUUM_OPTION_PARALLEL_COND_CLEANUP 3
    # vacuumcleanup can be done in parallel if bulkdelete is not
    # performed (indexes nbtree, brin, hash, gin, gist, spgist, bloom
    # will set this flag)
VACUUM_OPTION_PARALLEL_CLEANUP 4
    # vacuumcleanup can be done in parallel even if bulkdelete is
    # already performed (indexes gin, brin, and bloom will set this flag)

Does something like this make sense? If we all agree on this, then I
think we can summarize the part of the discussion related to this API
and get feedback from a broader audience.
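
To make the single-variable idea concrete, here is a minimal C sketch of how a caller might interpret such options. It is written under several assumptions that are not part of the proposal: the options are given distinct bit values so they can be OR-ed together (the list above numbers them 0 to 4), "bulkdelete is not performed" is taken to mean num_index_scans == 0, and VACUUM_OPTION_NO_PARALLEL_CLEANUP takes precedence when it is combined with a cleanup flag. The two helper functions are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values are an assumption of this sketch, not the values listed above. */
#define VACUUM_OPTION_NO_PARALLEL            0
#define VACUUM_OPTION_NO_PARALLEL_CLEANUP    (1 << 0)
#define VACUUM_OPTION_PARALLEL_BULKDEL       (1 << 1)
#define VACUUM_OPTION_PARALLEL_COND_CLEANUP  (1 << 2)
#define VACUUM_OPTION_PARALLEL_CLEANUP       (1 << 3)

/* Hypothetical helper: may bulkdelete run in a parallel worker? */
static bool
sketch_can_parallel_bulkdelete(uint16_t options)
{
    return (options & VACUUM_OPTION_PARALLEL_BULKDEL) != 0;
}

/* Hypothetical helper: may vacuumcleanup run in a parallel worker? */
static bool
sketch_can_parallel_cleanup(uint16_t options, int num_index_scans)
{
    /* Assumed precedence: an explicit "no parallel cleanup" wins. */
    if (options & VACUUM_OPTION_NO_PARALLEL_CLEANUP)
        return false;

    /* Unconditional cleanup: allowed even after bulkdelete has run. */
    if (options & VACUUM_OPTION_PARALLEL_CLEANUP)
        return true;

    /* Conditional cleanup: only when no bulkdelete pass has happened. */
    if (options & VACUUM_OPTION_PARALLEL_COND_CLEANUP)
        return num_index_scans == 0;

    return false;
}
```

With this encoding, an index AM could advertise, for example, VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP in a single amparallelvacuumoptions field, and the vacuum code would call the helpers once per phase.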

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
