Re: parallel vacuum comments

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: parallel vacuum comments
Date: 2021-12-16 13:38:12
Message-ID: CAA4eK1LiTabc-uNph4yKH8_3mEfafTJGmCC0jQx9YdnaPtVDkg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 16, 2021 at 6:13 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Dec 16, 2021 at 1:56 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Wed, Dec 15, 2021 at 1:33 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > I've attached an updated patch. The patch incorporated several changes
> > > from the last version:
> > >
> > > * Rename parallel_vacuum_begin() to parallel_vacuum_init()
> > > * Unify the terminology; use "index bulk-deletion" and "index cleanup"
> > > instead of "index vacuum" and "index cleanup".
> > >
> >
> > I am not sure it is a good idea to do this as part of the main patch
> > as the intention of that is to just refactor parallel vacuum code. I
> > suggest doing this as a separate patch.
>
> Okay.
>
> > Also, can we move the common
> > code to be shared between vacuumparallel.c and vacuumlazy.c as a
> > separate patch?
>
> You mean vac_tid_reaped() and vac_cmp_itemptr() etc.? If so, do both
> vacuumparallel.c and vacuumlazy.c have the same functions?
>

Why that would be required? I think both can call the common exposed
function like the one you have in your patch bulkdel_one_index or if
we directly move lazy_vacuum_one_index as part of common code. Similar
for cleanup function.

> >
> > Few other comments and questions:
> > ============================
> > 1. /* Outsource everything to parallel variant */
> > - parallel_vacuum_process_all_indexes(vacrel, true);
> > + LVSavedErrInfo saved_err_info;
> > +
> > + /*
> > + * Outsource everything to parallel variant. Since parallel vacuum will
> > + * set the error context on an error we temporarily disable setting our
> > + * error context.
> > + */
> > + update_vacuum_error_info(vacrel, &saved_err_info,
> > + VACUUM_ERRCB_PHASE_UNKNOWN,
> > + InvalidBlockNumber, InvalidOffsetNumber);
> > +
> > + parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, vacrel->old_live_tuples);
> > +
> > + /* Revert to the previous phase information for error traceback */
> > + restore_vacuum_error_info(vacrel, &saved_err_info);
> >
> > Is this change because you want a separate error callback for parallel
> > vacuum? If so, I suggest we can discuss this as a separate patch from
> > the refactoring patch.
>
> Because it seems natural to me that the leader and worker use the same
> error callback.
>
> Okay, I'll remove that change in the next version patch.
>
> > 2. Is introducing bulkdel_one_index/cleanup_one_index related to new
> > error context, or "Unify the terminology" task? Is there any other
> > reason for the same?
>
> Because otherwise both vacuumlazy.c and vacuumparallel.c will have the
> same functions.
>
> >
> > 3. Why did you introduce
> > parallel_vacuum_bulkdel_all_indexes()/parallel_vacuum_cleanup_all_indexes()?
> > Is it because of your task "Unify the terminology"?
>
> This is because parallel bulk-deletion and cleanup require different
> numbers of inputs (num_table_tuples etc.) and the caller
> (vacuumlazy.c) cannot set them directly to ParallelVacuumState.
>

oh, yeah, the other possibility could be to have a common structure
that can be used for both cases. I am not sure if that is better than
what you have.

> >
> > 4.
> > @@ -3086,7 +2592,6 @@ lazy_cleanup_one_index(Relation indrel,
> > IndexBulkDeleteResult *istat,
> > ivinfo.report_progress = false;
> > ivinfo.estimated_count = estimated_count;
> > ivinfo.message_level = elevel;
> > -
> > ivinfo.num_heap_tuples = reltuples;
> >
> > This seems like an unrelated change.
>
> Yes, but I think it's an unnecessary break so we can change it
> together. Should it be done in a separate patch?
>

Isn't this just spurious line removal which shouldn't be part of any patch?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2021-12-16 14:54:30 Re: logical decoding and replication of sequences
Previous Message osumi.takamichi@fujitsu.com 2021-12-16 13:21:22 RE: Failed transaction statistics to measure the logical replication progress