From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Subject: | Re: parallel vacuum comments |
Date: | 2021-11-01 20:57:01 |
Message-ID: | CAH2-WzmFGpJ1Oep+P6LKWLYi1ZYXpQT8pAXLSpYF4_4OR6p5-w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Nov 1, 2021 at 5:47 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> For discussion, I've written a patch only for adding some tests to
> parallel vacuum. The test includes the reported case where small
> indexes are not processed by the leader process as well as cases where
> different kinds of indexes (i.g., different amparallelvacuumoptions)
> are vacuumed or cleaned up.
I started looking at this because I want to commit something like it
alongside a fix to bug #17245.
While I tend to favor relatively heavy assertions (e.g., the
heap_page_is_all_visible() related asserts I added to
lazy_scan_prune()), the idea of having a whole shared memory area just
for assertions seems a bit too much, even to me. I tried to simplify
it by just adding a new field to LVSharedIndStats, which seemed more
natural. It took me at least 15 minutes before I realized that I was
actually repeating exactly the same mistake that led to bug #17245 in
the first place. I somehow forgot that
parallel_stats_for_idx()/IndStatsIsNull() will return NULL for any
index that has already been deemed too small to be worth processing in
parallel. Even after all that drama!
Rather than inventing PARALLEL_VACUUM_KEY_INDVAC_CHECK (just for
assert-enabled builds), we should invent PARALLEL_VACUUM_STATS -- a
dedicated shmem area for the array of LVSharedIndStats (no more
storing LVSharedIndStats entries at the end of the LVShared space in
an ad-hoc, type unsafe way). There should be one array element for
each and every index -- even those indexes where parallel index
vacuuming is unsafe or not worthwhile (unsure if avoiding parallel
processing for "not worthwhile" indexes actually makes sense, BTW). We
can then get rid of the bitmap/IndStatsIsNull() stuff entirely. We'd
also add new per-index state fields to LVSharedIndStats itself. We
could directly record the status of each index (e.g., parallel unsafe,
amvacuumcleanup processing done, ambulkdelete processing done)
explicitly. All code could safely subscript the LVSharedIndStats array
directly, using idx style integers. That seems far more robust and
consistent.
I think that this PARALLEL_VACUUM_STATS refactoring is actually the
simplest way to comprehensively test parallel VACUUM. I will still
need to add tests for my fix to bug #17245, but they won't be truly
general tests. I'll have to make sure that one of the assertions in
nbtdedup.c fails when the tests are run without the fix in place, or
something like that.
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Frost | 2021-11-01 21:00:31 | Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers) |
Previous Message | Thomas Munro | 2021-11-01 20:56:10 | Re: Portability report: ninja |