From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de> |
Subject: | Re: parallel vacuum comments |
Date: | 2021-11-24 01:36:41 |
Message-ID: | CAD21AoD=2gFBzzLg0-dV+_OCdGuxX-7yJcQC4mo-f2S=TR10Ng@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Nov 22, 2021 at 6:35 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Nov 16, 2021 at 11:23 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > I've incorporated these comments and attached an updated patch.
> >
>
> Review comments:
> ================
> 1.
> index_can_participate_parallel_vacuum()
> {
> ..
> + /*
> + * Not safe, if the index supports parallel cleanup conditionally,
> + * but we have already processed the index (for bulkdelete). See the
> + * comments for option VACUUM_OPTION_PARALLEL_COND_CLEANUP to know
> + * when indexes support parallel cleanup conditionally.
> + */
> + if (num_index_scans > 0 &&
> + ((vacoptions & VACUUM_OPTION_PARALLEL_COND_CLEANUP) != 0))
> ..
> }
>
> IIRC, we do this to avoid the need to invoke worker when parallel
> cleanup doesn't need to scan the index which means the work required
> to be performed by a worker would be minimal. If so, maybe we can
> write that in comments here or with
> VACUUM_OPTION_PARALLEL_COND_CLEANUP.
Right. Will add the comments.
> If the above understanding is correct then is it correct to check
> num_index_scans here? AFAICS, this value is incremented in
> parallel_vacuum_all_indexes irrespective of whether it is invoked for
> bulk delete or clean up. OTOH, previously, this was done based on
> first_time variable which was in turn set based on
> vacrel->num_index_scans and that is incremented in
> lazy_vacuum_all_indexes(both in serial and parallel case).
You're right. That's wrong to increment num_index_scans also when
vacuumcleanup. It should be incremented only when bulkdelete. Perhaps,
the caller (i.g., table AM) should pass num_index_scans to parallel
vacuum code? I initially thought that ParallelVacuumState can have
num_index_scans and increment it only when parallel bulk-deletion. But
if we do that we will end up having the same thing in two places:
ParallelVacuumState and LVRelState. It would be clearer if we maintain
num_index_scan in LVRelState and pass it to parallel index vacuum when
calling to parallel index bulk-deletion or cleanup. On the other hand,
the downside would be that there is a possibility that a table AM
passes the wrong num_index_scans. Probably it’s also a valid argument
that since if a table AM is capable of parallel index vacuum, it’s
better to outsource index bulkdelete/cleanup to parallel index vacuum
through a whole vacuum operation, it’d be better to have
ParallelVacuumState maintain num_index_scans.
>
> 2. The structure ParallelVacuumState contains both PVIndVacStatus and
> PVIndStats. Considering PVIndVacStatus is already present in
> PVIndStats, does ParallelVacuumState need to have both?
"PVIndVacStatus status” of ParallelVacuumState is used by the worker
in the error callback function,
parallel_index_vacuum_error_callback(), in order to know the status of
the index vacuum that the worker is working on. I think that without
PVIndVacStatus, the worker needs to have the index of the PVIndStats
array in order to get the status by like
errinfo->indstats[idx]->status. Do you prefer to do that?
> 3. Why ParallelVacuumCtl is declared in vacuum.h? It appears to be
> only used in one function begin_parallel_vacuum, can't we just declare
> in vacuumparallel.c?
ParallelVacuumCtl is a struct to begin the parallel vacuum and
therefore is expected to be passed by table AM. If we declare it in
vacuumparallel.c a table AM (e.g., vacuumlazy.c) cannot use it, no?
> As it is only required for one function and it is
> not that the number of individual parameters will be too huge, can't
> we do without having that structure.
Yes, we can do that without having that structure. I was a bit
concerned that there are already 7 arguments.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Dunstan | 2021-11-24 02:00:53 | Re: Mop-up from Test::More version change patch |
Previous Message | Andres Freund | 2021-11-24 01:32:25 | Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations |