From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | 'Masahiko Sawada' <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: Parallel heap vacuum |
Date: | 2024-07-05 09:51:22 |
Message-ID: | OSBPR01MB25522CC3B4F443AF24AC620DF5DF2@OSBPR01MB2552.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
Dear Sawada-san,
> The parallel vacuum we have today supports only for index vacuuming.
> Therefore, while multiple workers can work on different indexes in
> parallel, the heap table is always processed by the single process.
> I'd like to propose $subject, which enables us to have multiple
> workers running on the single heap table. This would be helpful to
> speedup vacuuming for tables without indexes or tables with
> INDEX_CLEANUP = off.
Sounds great. IIUC, vacuuming is still one of the main weak points of Postgres.
> I've attached a PoC patch for this feature. It implements only
> parallel heap scans in lazyvacuum. We can extend this feature to
> support parallel heap vacuum as well in the future or in the same
> patch.
Before diving deeper, I tested your PoC and found one unclear point.
When vacuuming was requested with parallel > 0 on almost the same workload
as yours, only the first page was scanned and cleaned up.
When parallel was set to zero, I got:
```
INFO: vacuuming "postgres.public.test"
INFO: finished vacuuming "postgres.public.test": index scans: 0
pages: 0 removed, 2654868 remain, 2654868 scanned (100.00% of total)
tuples: 120000000 removed, 480000000 remain, 0 are dead but not yet removable
removable cutoff: 752, which was 0 XIDs old when operation ended
new relfrozenxid: 739, which is 1 XIDs ahead of previous value
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 344.639 MB/s, avg write rate: 344.650 MB/s
buffer usage: 2655045 hits, 2655527 misses, 2655606 dirtied
WAL usage: 1 records, 1 full page images, 937 bytes
system usage: CPU: user: 39.45 s, system: 20.74 s, elapsed: 60.19 s
```
This shows that all pages were scanned and the dead tuples were removed.
However, when parallel was set to one, I got another result:
```
INFO: vacuuming "postgres.public.test"
INFO: launched 1 parallel vacuum worker for table scanning (planned: 1)
INFO: finished vacuuming "postgres.public.test": index scans: 0
pages: 0 removed, 2654868 remain, 1 scanned (0.00% of total)
tuples: 12 removed, 0 remain, 0 are dead but not yet removable
removable cutoff: 752, which was 0 XIDs old when operation ended
frozen: 0 pages from table (0.00% of total) had 0 tuples frozen
index scan not needed: 0 pages from table (0.00% of total) had 0 dead item identifiers removed
avg read rate: 92.952 MB/s, avg write rate: 0.845 MB/s
buffer usage: 96 hits, 660 misses, 6 dirtied
WAL usage: 1 records, 1 full page images, 937 bytes
system usage: CPU: user: 0.05 s, system: 0.00 s, elapsed: 0.05 s
```
It looks like only one page was scanned and only 12 tuples were removed,
which seems very strange to me...
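As a quick sanity check on the two logs above, here is a minimal sketch in Python that extracts the page counts from the `pages:` line of the VACUUM VERBOSE output. The helper name `scan_coverage` is hypothetical, and this is not part of the PoC patch or the attached script.

```python
import re

# Matches the "pages:" line printed by VACUUM (VERBOSE), for example:
# "pages: 0 removed, 2654868 remain, 1 scanned (0.00% of total)"
PAGES_RE = re.compile(
    r"pages:\s+(\d+) removed,\s+(\d+) remain,\s+(\d+) scanned"
)


def scan_coverage(log_text):
    """Return (remain, scanned) page counts found in the log text."""
    m = PAGES_RE.search(log_text)
    if m is None:
        raise ValueError("no 'pages:' line found")
    return int(m.group(2)), int(m.group(3))


# The two "pages:" lines copied from the logs above.
serial = "pages: 0 removed, 2654868 remain, 2654868 scanned (100.00% of total)"
parallel = "pages: 0 removed, 2654868 remain, 1 scanned (0.00% of total)"

for label, line in (("parallel 0", serial), ("parallel 1", parallel)):
    remain, scanned = scan_coverage(line)
    print(f"{label}: scanned {scanned} of {remain} pages")
```

With parallel set to zero the scanned count equals the remaining page count, while with one worker it is 1 out of 2654868.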
The attached script reproduces my test. IIUC it is almost the same as yours,
except that the instance was restarted before vacuuming.
Can you reproduce this and look into the cause? I can provide further
information if needed.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/
Attachment | Content-Type | Size |
---|---|---|
parallel_test.sh | application/octet-stream | 758 bytes |