From: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie> |
Subject: | Emit fewer vacuum records by reaping removable tuples during pruning |
Date: | 2023-11-13 22:28:50 |
Message-ID: | CAAKRu_bgvb_k0gKOXWzNKWHt560R0smrGe3E8zewKPs8fiMKkw@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
When there are no indexes on the relation, we can set would-be dead
items LP_UNUSED and remove them during pruning. This saves us a vacuum
WAL record, reducing WAL volume (and time spent writing and syncing
WAL).
See this example:
drop table if exists foo;
create table foo(a int) with (autovacuum_enabled=false);
insert into foo select i from generate_series(1,10000000)i;
update foo set a = 10;
\timing on
vacuum foo;
On my machine, the attached patch set provides a 10% speedup for vacuum
for this example -- and a 40% decrease in WAL bytes emitted.
Admittedly, this case is probably unusual in the real world. On-access
pruning would usually preclude it: throw a SELECT * FROM foo before the
vacuum and the patch has no performance benefit, because the SELECT has
already pruned the pages.
However, it has no downside as far as I can tell. And, IMHO, it is a
code clarity improvement. This change means that lazy_vacuum_heap_page()
is only called when we are actually doing a second pass and reaping dead
items. I found it quite confusing that lazy_vacuum_heap_page() was
called by lazy_scan_heap() to set dead items unused in a block that we
just pruned.
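As an illustration of the record-count difference described above, here is a toy model in C. It is not PostgreSQL code: ToyPage, toy_prune(), toy_reap_dead_items() and toy_vacuum_one_page() are hypothetical names invented for this sketch, and the model assumes at most one prune record and one vacuum record per page. It counts records only, not WAL bytes, and it collapses index vacuuming and the second heap pass into a single call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_ITEMS_PER_PAGE 8

/* Toy line-pointer states, loosely named after PostgreSQL's LP_* flags. */
typedef enum { TOY_LP_UNUSED, TOY_LP_NORMAL, TOY_LP_DEAD } ToyLpState;

/* Hypothetical, heavily simplified heap page. */
typedef struct
{
    ToyLpState  items[TOY_ITEMS_PER_PAGE];
    bool        tuple_removable[TOY_ITEMS_PER_PAGE];
} ToyPage;

/* Fill the page with normal items; the first nremovable are removable. */
static void
toy_page_init(ToyPage *page, int nremovable)
{
    assert(page != NULL);
    for (int i = 0; i < TOY_ITEMS_PER_PAGE; i++)
    {
        page->items[i] = TOY_LP_NORMAL;
        page->tuple_removable[i] = (i < nremovable);
    }
}

/*
 * Prune removable tuples.  If mark_unused_now is set, their items go
 * straight to TOY_LP_UNUSED; otherwise they become TOY_LP_DEAD and are
 * counted in *ndead_left.  Returns the number of records emitted (0 or 1).
 */
static int
toy_prune(ToyPage *page, bool mark_unused_now, int *ndead_left)
{
    int         nchanged = 0;

    *ndead_left = 0;
    for (int i = 0; i < TOY_ITEMS_PER_PAGE; i++)
    {
        if (page->items[i] != TOY_LP_NORMAL || !page->tuple_removable[i])
            continue;
        if (mark_unused_now)
            page->items[i] = TOY_LP_UNUSED;
        else
        {
            page->items[i] = TOY_LP_DEAD;
            (*ndead_left)++;
        }
        nchanged++;
    }
    return nchanged > 0 ? 1 : 0;
}

/* Second pass: dead items become unused.  Returns records emitted (0 or 1). */
static int
toy_reap_dead_items(ToyPage *page)
{
    int         nreaped = 0;

    for (int i = 0; i < TOY_ITEMS_PER_PAGE; i++)
    {
        if (page->items[i] == TOY_LP_DEAD)
        {
            page->items[i] = TOY_LP_UNUSED;
            nreaped++;
        }
    }
    return nreaped > 0 ? 1 : 0;
}

/* Vacuum one page and return the total number of records emitted. */
static int
toy_vacuum_one_page(ToyPage *page, bool rel_has_indexes,
                    bool reap_during_pruning)
{
    /* Items can skip the dead state only when no index can point at them. */
    bool        mark_unused_now = !rel_has_indexes && reap_during_pruning;
    int         ndead_left;
    int         nrecords = toy_prune(page, mark_unused_now, &ndead_left);

    if (ndead_left > 0)
        nrecords += toy_reap_dead_items(page);
    return nrecords;
}
```

In this model, an index-less page with removable tuples costs two records when reap_during_pruning is false and one when it is true; a page in a relation with indexes costs two records either way.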
I think it also makes it clear that we should update the visibility map
(VM) in lazy_scan_prune(). With this change, every call to
lazy_scan_prune() is followed by a check of whether to update the VM,
and most of the state communicated back to lazy_scan_heap() from
lazy_scan_prune() exists to inform that decision. I didn't move the VM
update in this patch set because I would need to pass
all_visible_according_to_vm to lazy_scan_prune() and that change didn't
seem worth the improvement in code clarity in lazy_scan_heap().
I am planning to add a VM update into the freeze record, at which point
I will move the VM update code into lazy_scan_prune(). This will then
allow us to consolidate the free space map update code for the prune and
no-prune cases and make lazy_scan_heap() short and sweet.
Note that (on principle) this patch set is on top of the bug fix I
proposed in [1].
- Melanie
Attachment | Content-Type | Size |
---|---|---|
v1-0003-Set-would-be-dead-items-LP_UNUSED-while-pruning.patch | application/x-patch | 16.6 KB |
v1-0001-Release-lock-on-heap-buffer-before-vacuuming-FSM.patch | application/x-patch | 2.1 KB |
v1-0002-Indicate-rel-truncation-unsafe-in-lazy_scan-no-pr.patch | application/x-patch | 7.0 KB |