From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Masahiko Sawada' <sawada(dot)mshk(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: RE: Parallel heap vacuum
Date: 2025-04-18 09:49:01
Message-ID: OSCPR01MB1496624D9C7A9669A0128999BF5BF2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Lists: pgsql-hackers
Dear Sawada-san,
Thanks for updating the patch. I have been reviewing it, and below are my comments so far.
01.
Sorry if I am forgetting something, but is there a reason that
parallel_vacuum_compute_workers() is mandatory? My fresh eyes felt that the caller
could check the callback and regard the worker count as zero when the API is NULL.
02. table_parallel_vacuum_estimate
We can assert that state is not NULL.
03. table_parallel_vacuum_initialize
Same as 02. Also, I think we should ensure here that table_parallel_vacuum_estimate()
has already been called, and the current assertion might not be enough because
callers could have set an arbitrary value. Other functions, like table_rescan_tidrange(),
check an internal flag of the data structure, which looks good to me. How do you
feel about checking that pcxt->estimator has a non-zero value?
04. table_parallel_vacuum_initialize_worker
Same as 02. Also, can we ensure that table_parallel_vacuum_initialize() has
already been done?
05. table_parallel_vacuum_collect_dead_items
Same as 02. Also, can we ensure that table_parallel_vacuum_initialize_worker()
has already been done?
06. table_parallel_vacuum_initialize_worker
The comment atop the function needs to be updated.
07.
While playing with the patch, I found that more parallel vacuum workers can be
launched than the table has pages:
```
postgres=# CREATE TABLE var (id int);
CREATE TABLE
postgres=# INSERT INTO var VALUES (generate_series(1, 10000));
INSERT 0 10000
postgres=# VACUUM (parallel 80, verbose) var ;
INFO: vacuuming "postgres.public.var"
INFO: launched 80 parallel vacuum workers for collecting dead tuples (planned: 80)
INFO: finished vacuuming "postgres.public.var": index scans: 0
pages: 0 removed, 45 remain, 45 scanned (100.00% of total), 0 eagerly scanned
...
```
IIUC the minimum chunk size is one page, so this situation can hurt performance.
How do you feel about capping the value with rel->rd_rel->relpages in heap_parallel_vacuum_compute_workers()?
The value is not always up to date, but it seems like a good candidate.
Best regards,
Hayato Kuroda
FUJITSU LIMITED