From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: maintenance_work_mem = 64kB doesn't work for vacuum
Date: 2025-03-10 09:53:23
Message-ID: CAApHDvps_sLPtBVZLyi--bmcjDNwqfg2eApQk9muYG-UrEi_nA@mail.gmail.com
Lists: pgsql-hackers
On Mon, 10 Mar 2025 at 17:22, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> Regarding that patch, we need to note that the lpdead_items is a
> counter that is not reset in the entire vacuum. Therefore, with
> maintenance_work_mem = 64kB, once we collect at least one lpdead item,
> we perform a cycle of index vacuuming and heap vacuuming for every
> subsequent block even if they don't have a lpdead item. I think we
> should use vacrel->dead_items_info->num_items instead.
OK, I didn't study the code enough to realise that. My patch was only
intended as a rough indication of what I had in mind. Please feel free
to proceed with your own patch using the correct field.
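To make the distinction concrete for anyone following along, here is a
tiny standalone model of why checking the cumulative counter misbehaves.
This is not PostgreSQL code; the variable names only mirror the fields
mentioned above, and the "memory always full" assumption is just a stand-in
for the 64kB setting:

/*
 * Toy model (not PostgreSQL code) of the problem described above:
 * "lpdead_items" is a cumulative counter for the whole vacuum, while
 * "num_items" counts only the dead items currently buffered and is
 * reset after each index-vacuum cycle.
 */
#include <stdio.h>
#include <stdbool.h>

int
main(void)
{
    long lpdead_items = 0;   /* never reset during the vacuum */
    long num_items = 0;      /* reset after each index-vacuum cycle */
    int  cycles_cumulative = 0;
    int  cycles_per_cycle = 0;

    /* Pretend only block 0 contains dead line pointers. */
    for (int blkno = 0; blkno < 100; blkno++)
    {
        bool block_has_dead_items = (blkno == 0);

        if (block_has_dead_items)
        {
            lpdead_items++;
            num_items++;
        }

        /* Memory is "full" after every block with such a tiny limit. */
        bool memory_exhausted = true;

        /* Cumulative check: fires on every block once any dead item was seen. */
        if (memory_exhausted && lpdead_items > 0)
            cycles_cumulative++;

        /* Per-cycle check: fires only while dead items are actually buffered. */
        if (memory_exhausted && num_items > 0)
        {
            cycles_per_cycle++;
            num_items = 0;   /* index/heap vacuuming empties the store */
        }
    }

    printf("cycles with cumulative counter: %d\n", cycles_cumulative);
    printf("cycles with per-cycle count:    %d\n", cycles_per_cycle);
    return 0;
}

With the cumulative check this reports a cycle for all 100 blocks; with
the per-cycle count it reports just one.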
While playing with parallel vacuum, I also wondered if there should be
some heuristic that avoids parallel vacuum when maintenance_work_mem is
set far too low, unless the user specifically asked for it in the
VACUUM command.
Take the following case as an example:
set maintenance_work_mem=64;
create table aa(a int primary key, b int unique);
insert into aa select a,a from generate_Series(1,1000000) a;
delete from aa;
-- try a vacuum with no parallelism
vacuum (verbose, parallel 0) aa;
system usage: CPU: user: 0.53 s, system: 0.00 s, elapsed: 0.57 s
If I did the following instead:
vacuum (verbose) aa;
The vacuum goes parallel and takes a very long time because it launches
a parallel worker to do only about one page's worth of tuples per
cycle. I see the following message 4425 times:
INFO: launched 1 parallel vacuum worker for index vacuuming (planned: 1)
and the vacuum takes about 30 seconds to complete:
system usage: CPU: user: 14.00 s, system: 0.81 s, elapsed: 30.86 s
Shouldn't the code in parallel_vacuum_compute_workers() try to pick a
sensible number of workers based on the available memory and the table
size when the user does not explicitly specify how many workers they
want?
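Something along the lines of the sketch below, perhaps. This is purely
illustrative: only parallel_vacuum_compute_workers() is a real function;
the helper name, parameters, and the thresholds here are made up and not
taken from the PostgreSQL source.

/*
 * Hypothetical heuristic only (names and thresholds invented for the
 * sake of discussion): fall back to a serial vacuum when
 * maintenance_work_mem is so small relative to the table that each
 * index-vacuum cycle would cover only a handful of heap pages.
 */
#include <stdio.h>

static int
suggest_parallel_workers(int nworkers_requested, int nworkers_computed,
                         long maintenance_work_mem_kb, long rel_pages)
{
    /* Respect an explicit PARALLEL option in the VACUUM command. */
    if (nworkers_requested >= 0)
        return nworkers_requested;

    /*
     * With very little memory relative to the table size, the per-cycle
     * overhead of launching workers dominates; use no workers.  The
     * 1024 kB / 1000 page thresholds are purely illustrative.
     */
    if (maintenance_work_mem_kb < 1024 && rel_pages > 1000)
        return 0;

    return nworkers_computed;
}

int
main(void)
{
    /* 64 kB of memory, ~4425 heap pages, no explicit PARALLEL option */
    printf("suggested workers: %d\n",
           suggest_parallel_workers(-1, 1, 64, 4425));
    return 0;
}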
David