From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel heap vacuum |
Date: | 2024-12-11 08:07:02 |
Message-ID: | CAD21AoAtw4DUrxz=E1WWZ7gbc=8rMY_nxZ7XxV90xmZDaK8Fig@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Dec 9, 2024 at 2:11 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> Hi,
>
> Thanks for working on this. I took a quick look at this today, to do
> some basic review. I plan to do a bunch of testing, but that's mostly to
> get a better idea of what kind of improvements to expect - the initial
> results look quite nice and sensible.
Thank you for reviewing the patch!
> A couple basic comments:
>
> 1) I really like the idea of introducing "compute_workers" callback to
> the heap AM interface. I faced a similar issue with calculating workers
> for index builds, because right now plan_create_index_workers is doing
> that the logic works for btree, but really not brin etc. It didn't occur
> to me we might make this part of the index AM ...
Thanks.
>
> 2) I find it a bit weird vacuumlazy.c needs to include optimizer/paths.h
> because it really has nothing to do with planning / paths. I realize it
> needs the min_parallel_table_scan_size, but it doesn't seem right. I
> guess it's a sign this bit of code (calculating parallel workers based
> on log of relation size) should in some "shared" location.
True. The same is actually true also for vacuumparallel.c. It includes
optimizer/paths.h to use min_parallel_index_scan_size.
>
> 3) The difference in naming ParallelVacuumState vs. PHVState is a bit
> weird. I suggest ParallelIndexVacuumState and ParallelHeapVacuumState to
> make it consistent and clear.
With the patch, since ParallelVacuumState is no longer dedicated for
parallel index vacuuming we cannot rename them in this way. Both
parallel table scanning/vacuuming and parallel index vacuuming can use
the same ParallelVacuumState instance. The heap-specific necessary
data for parallel heap scanning and vacuuming are stored in PHVState.
>
> 4) I think it would be good to have some sort of README explaining how
> the parallel heap vacuum works, i.e. how it's driven by FSM. Took me a
> while to realize how the workers coordinate which blocks to scan.
+1. I will add README in the next version patch.
>
> 5) Wouldn't it be better to introduce the scan_stats (grouping some of
> the fields in a separate patch)? Seems entirely independent from the
> parallel part, so doing it separately would make it easier to review.
> Also, maybe reference the fields through scan_stats->x, instead of
> through vacrel->scan_stats->x, when there's the pointer.
Agreed.
> 6) Is it a good idea to move NewRelfrozenXID/... to the scan_stats?
> AFAIK it's not a statistic, it's actually a parameter affecting the
> decisions, right?
Right. It would be better to move them to a separate struct or somewhere.
>
> 7) I find it a bit strange that heap_vac_scan_next_block() needs to
> check if it's a parallel scan, and redirect to the parallel callback. I
> mean, shouldn't the caller know which callback to invoke? Why should the
> serial callback care about this?
do_lazy_scan_heap(), the sole caller of heap_vac_scan_next_block(), is
called in serial vacuum and parallel vacuum cases. I wanted to make
heap_vac_scan_next_block() workable in both cases. I think it also
makes sense to have do_lazy_scan_heap() calls either function
depending on parallel scan being enabled.
>
> 8) It's not clear to me why do_lazy_scan_heap() needs to "advertise" the
> current block. Can you explain?
The workers' current block numbers are used to calculate the minimum
block number where we've scanned so far. In serial scan case, we
vacuum FSM of the particular block range for every
VACUUM_FSM_EVERY_PAGES pages . On the other hand, in parallel scan
case, it doesn't make sense to vacuum FSM in that way because we might
not have processed some blocks in the block range. So the idea is to
calculate the minimum block number where we've scanned so far and
vacuum FSM of the range of consecutive already-scanned blocks.
>
> 9) I'm a bit confused why the code checks IsParallelWorker() in so many
> places. Doesn't that mean the leader can't participate in the vacuum
> like a regular worker?
I used '!isParallelWorker()' for some jobs that should be done only by
the leader process. For example, checking failsafe mode, vacuuming FSM
etc.
>
> 10) I'm not quite sure I understand the comments at the end of
> do_lazy_scan_heap - it says "do a cycle of vacuuming" but I guess that
> means "index vacuuming", right?
It means both index vacuuming and heap vacuuming.
> And then it says "pause without invoking
> index and heap vacuuming" but isn't the whole point of this block to do
> that cleanup so that the TidStore can be discarded? Maybe I just don't
> understand how the work is divided between the leader and workers ...
The comment needs to be updated. But what the patch does is that when
the memory usage of the shared TidStore reaches the limit, worker
processes exit after updating the shared statistics, and then the
leader invokes (parallel) index vacuuming and parallel heap vacuuming.
Since the different number workers could be used for parallel heap
scan, parallel index vacuuming, and parallel heap vacuuming, the
leader process waits for all workers to finish at end of each phase.
> 11) Why does GlobalVisState need to move to snapmgr.h? If I undo this
> the patch still builds fine for me.
Oh, I might have missed something. I'll check if it's really necessary.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Gustafsson | 2024-12-11 08:13:35 | Re: Serverside SNI support in libpq |
Previous Message | Andrei Lepikhov | 2024-12-11 08:04:01 | Re: Expand applicability of aggregate's sortop optimization |