From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] Block level parallel vacuum |
Date: | 2019-01-15 06:59:15 |
Message-ID: | CAD21AoBMRNQJAUwWzFO6wb2u25cMkAu48Q86iv_xVcp884LG0w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Dec 28, 2018 at 11:43 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Dec 20, 2018 at 3:38 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, Dec 18, 2018 at 1:29 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > Attached the updated patches. I scaled back the scope of this patch.
> > > The patch now includes only feature (a), that is it execute both index
> > > vacuum and cleanup index in parallel. It also doesn't include
> > > autovacuum support for now.
> > >
> > > The PARALLEL option works alomst same as before patch. In VACUUM
> > > command, we can specify 'PARALLEL n' option where n is the number of
> > > parallel workers to request. If the n is omitted the number of
> > > parallel worekrs would be # of indexes -1.
> > >
> >
> > I think for now this is okay, but I guess in furture when we make
> > heapscans also parallel or maybe allow more than one worker per-index
> > vacuum, then this won't hold good. So, I am not sure if below text in
> > docs is most appropriate.
> >
> > + <term><literal>PARALLEL <replaceable
> > class="parameter">N</replaceable></literal></term>
> > + <listitem>
> > + <para>
> > + Execute index vacuum and cleanup index in parallel with
> > + <replaceable class="parameter">N</replaceable> background
> > workers. If the parallel
> > + degree <replaceable class="parameter">N</replaceable> is omitted,
> > + <command>VACUUM</command> requests the number of indexes - 1
> > processes, which is the
> > + maximum number of parallel vacuum workers since individual
> > indexes is processed by
> > + one process. The actual number of parallel vacuum workers may
> > be less due to the
> > + setting of <xref linkend="guc-max-parallel-workers-maintenance"/>.
> > + This option can not use with <literal>FULL</literal> option.
> >
> > It might be better to use some generic language in docs, something
> > like "If the parallel degree N is omitted, then vacuum decides the
> > number of workers based on number of indexes on the relation which is
> > further limited by max-parallel-workers-maintenance".
>
> Thank you for the review.
>
> I agreed your concern and the text you suggested.
>
> > I think you
> > also need to mention in some way that you consider storage option
> > parallel_workers.
>
> Added.
>
> >
> > Few assorted comments:
> > 1.
> > +lazy_begin_parallel_vacuum_index(LVState *lvstate, bool for_cleanup)
> > {
> > ..
> > +
> > + LaunchParallelWorkers(lvstate->pcxt);
> > +
> > + /*
> > + * if no workers launched, we vacuum all indexes by the leader process
> > + * alone. Since there is hope that we can launch workers in the next
> > + * execution time we don't want to end the parallel mode yet.
> > + */
> > + if (lvstate->pcxt->nworkers_launched == 0)
> > + return;
> >
> > It is quite possible that the workers are not launched because we fail
> > to allocate memory, basically when pcxt->nworkers is zero. I think in
> > such cases there is no use for being in parallel mode. You can even
> > detect that before calling lazy_begin_parallel_vacuum_index.
>
> Agreed. we can stop preparation and exit parallel mode if
> pcxt->nworkers got 0 after InitializeParallelDSM() .
>
> >
> > 2.
> > static void
> > +lazy_vacuum_all_indexes_for_leader(LVState *lvstate,
> > IndexBulkDeleteResult **stats,
> > + LVTidMap *dead_tuples, bool do_parallel,
> > + bool for_cleanup)
> > {
> > ..
> > + if (do_parallel)
> > + lazy_begin_parallel_vacuum_index(lvstate, for_cleanup);
> > +
> > + for (;;)
> > + {
> > + IndexBulkDeleteResult *r = NULL;
> > +
> > + /*
> > + * Get the next index number to vacuum and set index statistics. In parallel
> > + * lazy vacuum, index bulk-deletion results are stored in the shared memory
> > + * segment. If it's already updated we use it rather than setting it to NULL.
> > + * In single vacuum, we can always use an element of the 'stats'.
> > + */
> > + if (do_parallel)
> > + {
> > + idx = pg_atomic_fetch_add_u32(&(lvshared->nprocessed), 1);
> > +
> > + if (lvshared->indstats[idx].updated)
> > + r = &(lvshared->indstats[idx].stats);
> > + }
> >
> > It is quite possible that we are not able to launch any workers in
> > lazy_begin_parallel_vacuum_index, in such cases, we should not use
> > parallel mode path, basically as written we can't rely on
> > 'do_parallel' flag.
> >
>
> Fixed.
>
> Attached new version patch.
>
Rebased.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
v11-0001-Add-parallel-option-to-VACUUM-command.patch | application/octet-stream | 73.0 KB |
v11-0002-Add-P-option-to-vacuumdb-command.patch | application/octet-stream | 5.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Haribabu Kommi | 2019-01-15 07:02:38 | Re: Pluggable Storage - Andres's take |
Previous Message | Dmitry Dolgov | 2019-01-15 06:58:55 | Re: Pluggable Storage - Andres's take |