From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | 'Masahiko Sawada' <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: Parallel heap vacuum |
Date: | 2024-11-13 11:10:43 |
Message-ID: | TYAPR01MB5692772248611889010B6823F55A2@TYAPR01MB5692.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Dear Sawada-san,
> TidStoreBeginIterateShared() is designed for multiple parallel workers
> to iterate a shared TidStore. During an iteration, parallel workers
> share the iteration state and iterate the underlying radix tree while
> taking appropriate locks. Therefore, it's available only for a shared
> TidStore. This is required to implement the parallel heap vacuum,
> where multiple parallel workers do the iteration on the shared
> TidStore.
>
> On the other hand, TidStoreBeginIterate() is designed for a single
> process to iterate a TidStore. It accepts even a shared TidStore as
> you mentioned, but during an iteration there is no inter-process
> coordination such as locking. When it comes to parallel vacuum,
> supporting TidStoreBeginIterate() on a shared TidStore is necessary to
> cover the case where we use only parallel index vacuum but not
> parallel heap scan/vacuum. In this case, we need to store dead tuple
> TIDs on the shared TidStore during heap scan so parallel workers can
> use it during index vacuum. But it's not necessary to use
> TidStoreBeginIterateShared() because only one (leader) process does
> heap vacuum.
Okay, thanks for the description. I felt it is OK to keep.
I read 0001 again and here are comments.
01. vacuumlazy.c
```
+#define LV_PARALLEL_SCAN_SHARED 0xFFFF0001
+#define LV_PARALLEL_SCAN_DESC 0xFFFF0002
+#define LV_PARALLEL_SCAN_DESC_WORKER 0xFFFF0003
```
I checked other DMS keys used for parallel work, and they seems to have name
like PARALEL_KEY_XXX. Can we follow it?
02. LVRelState
```
+ BlockNumber next_fsm_block_to_vacuum;
```
Only the attribute does not have comments Can we add like:
"Next freespace map page to be checked"?
03. parallel_heap_vacuum_gather_scan_stats
```
+ vacrel->scan_stats->vacuumed_pages += ss->vacuumed_pages;
```
Note that `scan_stats->vacuumed_pages` does not exist in 0001, it is defined
in 0004. Can you move it?
04. heap_parallel_vacuum_estimate
```
+
+ heap_parallel_estimate_shared_memory_size(rel, nworkers, &pscan_len,
+ &shared_len, &pscanwork_len);
+
+ /* space for PHVShared */
+ shm_toc_estimate_chunk(&pcxt->estimator, shared_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+ /* space for ParallelBlockTableScanDesc */
+ pscan_len = table_block_parallelscan_estimate(rel);
+ shm_toc_estimate_chunk(&pcxt->estimator, pscan_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+ /* space for per-worker scan state, PHVScanWorkerState */
+ pscanwork_len = mul_size(sizeof(PHVScanWorkerState), nworkers);
+ shm_toc_estimate_chunk(&pcxt->estimator, pscanwork_len);
+ shm_toc_estimate_keys(&pcxt->estimator, 1);
```
I feel pscan_len and pscanwork_len are calclated in heap_parallel_estimate_shared_memory_size().
Can we remove table_block_parallelscan_estimate() and mul_size() from here?
05. Idea
Can you update documentations?
06. Idea
AFAICS pg_stat_progress_vacuum does not contain information related with the
parallel execution. How do you think adding an attribute which shows a list of pids?
Not sure it is helpful for users but it can show the parallelism.
Best regards,
Hayato Kuroda
FUJITSU LIMITED
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2024-11-13 11:53:01 | Re: logical replication: restart_lsn can go backwards (and more), seems broken since 9.4 |
Previous Message | Alvaro Herrera | 2024-11-13 11:03:05 | Re: Enable data checksums by default |