From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Parallel heap vacuum
Date: 2025-03-26 20:00:38
Message-ID: CAAKRu_aS_k751daj0QfO+5ZgLDz8LP1Wo_b5=RRKt6=f6OStLg@mail.gmail.com
Lists: pgsql-hackers
On Mon, Mar 24, 2025 at 7:58 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> You're right. I've studied the read stream code and figured out how to
> use it. In the attached patch, we end the read stream at the end of
> phase 1 and start a new read stream, as you suggested.
I've started looking at this patch set some more.
In heap_vac_scan_next_block(), if we are in the SKIP_PAGES_THRESHOLD
codepath, run out of unskippable blocks in the current chunk, and go
back to get another chunk (goto retry), but we are near the memory
limit and so can't get another block
(!dead_items_check_memory_limit()), could we get an infinite loop?
Or even incorrectly encroach on another worker's block? I ask because
of this math:
end_block = next_block +
    vacrel->plvstate->scanworker->pbscanwork.phsw_chunk_remaining + 1;
If vacrel->plvstate->scanworker->pbscanwork.phsw_chunk_remaining is 0
and we are in the goto retry loop, it seems like we could keep
incrementing next_block even when we shouldn't.
I just want to make sure that the skip-pages optimization works with
the parallel block assignment and with the low-memory read stream
wind-down.
I also think you do not need to expose
table_block_parallelscan_skip_pages_in_chunk() in the table AM. It is
only called in heap-specific code and the logic seems very
heap-related. If other table AMs want a way to skip some blocks, they
could easily implement it themselves.
On another topic, I think it would be good to have a comment above
this code in parallel_lazy_scan_gather_scan_results(), stating why we
are very sure it is correct.
Assert(TransactionIdIsValid(data->NewRelfrozenXid));
Assert(MultiXactIdIsValid(data->NewRelminMxid));

if (TransactionIdPrecedes(data->NewRelfrozenXid,
                          vacrel->scan_data->NewRelfrozenXid))
    vacrel->scan_data->NewRelfrozenXid = data->NewRelfrozenXid;
if (MultiXactIdPrecedesOrEquals(data->NewRelminMxid,
                                vacrel->scan_data->NewRelminMxid))
    vacrel->scan_data->NewRelminMxid = data->NewRelminMxid;
if (data->nonempty_pages < vacrel->scan_data->nonempty_pages)
    vacrel->scan_data->nonempty_pages = data->nonempty_pages;
vacrel->scan_data->skippedallvis |= data->skippedallvis;
Parallel relfrozenxid advancement sounds scary, and scary things are
best accompanied by comments. Even though the way this works is
intuitive, it is worth pointing out that this part is important to get
right, so future programmers know to be careful here.
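Something along these lines, as a starting point (wording is mine, not
from the patch):

    /*
     * Merge per-worker scan results.  Each worker computed NewRelfrozenXid
     * and NewRelminMxid only over the blocks it scanned, so the value that
     * is safe for the whole relation is the oldest one reported by any
     * worker.  Taking the preceding value here guarantees the final
     * NewRelfrozenXid/NewRelminMxid is not newer than any XID/MXID still
     * present in the table; getting this wrong could let relfrozenxid be
     * advanced past an unfrozen XID, which is a data-corruption hazard.
     */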
One thing I was wondering about is whether there are any implications
of different workers having different values in their GlobalVisState.
GlobalVisState can be updated during vacuum, so even if the workers
start out with the same values, those values could diverge. It is
probably okay, since the GlobalVisState just controls which tuples are
removable: some workers may remove fewer tuples than they strictly
could, which should be harmless.
And if it is okay for each worker to have a different GlobalVisState,
then maybe you shouldn't have a GlobalVisState in shared memory at all.
If you look at GlobalVisTestFor(), it just returns the address of a
backend-local global variable. So it seems like it might be better to
let each parallel worker use its own backend-local GlobalVisState,
rather than putting one in shared memory and copying it from one worker
to the others during initialization. I'm not sure. At the very least,
there should be a comment explaining why you've done it the way you
have.
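Concretely, what I had in mind is that each worker would just do what
the leader already does in heap_vacuum_rel(): call GlobalVisTestFor()
itself during its own initialization (sketch; the exact struct and
field the workers use in the patch may differ):

    /*
     * Sketch: each worker builds its own backend-local visibility test
     * state instead of copying one out of shared memory.
     * GlobalVisTestFor() only returns a pointer to this backend's
     * GlobalVisState, so there is nothing meaningful to share.
     */
    vacrel->vistest = GlobalVisTestFor(rel);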
- Melanie