Re: Options for more aggressive space reclamation in vacuuming?

From: Paul Smith <paul(at)pscs(dot)co(dot)uk>
To: Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>,"pgsql-admin" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Options for more aggressive space reclamation in vacuuming?
Date: 2023-05-14 23:11:27
Message-ID: 1881c887d18.2923.9bfe8bcc586ac955e423c0e3d5444448@pscs.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

An autovacuum will not release much (if any) space back to the OS. That's
perfectly normal. It just marks it as available for new dará to occupy.

So, if data is being added/removed at a similar rare, the table size will
reach an equilibrium with lots of dead tuples that are being reused

If autovacuum is happening too infrequently, then there'll be far too many
dead tuples. If it's happening too frequently, then there'll be too much
load on the IO system

Personally, I'd say 0.1 is still too high for a billion+ row table. That's
still 100+ million dead tuples. That's a lot, plus it'll be slow to
process. I'd use 0.001 (1+ million dead tuples), or maybe even lower.

Depending on the delete rate, you could even set the scale_factor to 0, and
the autovacuum_vacuum_threshold to a number appropriate for an autovacuum
every few days.

Paul

On 14 May 2023 23:20:19 Wells Oliver <wells(dot)oliver(at)gmail(dot)com> wrote:
> Yeah, We've tuned them down to a 0.1 factor and they are indeed being
> auto-vacuumed (though perhaps not frequently enough). I feel like I am just
> seeing the autovacuuming releasing a _lot_ less space than a full vacuum.
> Of course, I know that this is generally the case, but I am surprised by
> the factor... Nothing we can't live with, I was just wondering if there
> wasn't some clever solution.
>
> Perhaps lowering the auto vacuum factor even lower would be useful here too.
>
>
> On Sun, May 14, 2023 at 3:14 PM Paul Smith <paul(at)pscs(dot)co(dot)uk> wrote:
> Have you tuned the autovacuum settings? We've found the defaults to not be
> great for large tables
>
> Eg autovacuum_vacuum_scale_factor defaults to 0.2, so 20% of the tuples
> need to be deleted/updated before a vacuum us triggered. If there are a
> billion rows, that means there needs to be 200 million deleted rows to
> trigger the autovacuum
>
> https://www.postgresql.org/docs/current/routine-vacuuming.html#AUTOVACUUM
>
> Maybe try changing autovacuum_vacuum_scale_factor to 0.001 or something for
> that large table
>
>
> Paul
> On 14 May 2023 22:03:41 Wells Oliver <wells(dot)oliver(at)gmail(dot)com> wrote:
>> I think our use-case here is nothing special: just very large tables that
>> are deleted from and re-inserted with some frequency. They store monthly
>> data and exist as partitions. As an example, July of 2022 (summer months
>> are peak) had 1,630,558,336 rows. We delete from this and insert daily. We
>> rarely get new, relevant data once the month is over. The n_dead_tup from
>> pg_stat_all_tables here was over 7m rows, and clearing that out gave us
>> back nearly 50GB, and the file size estimate on this partition was ~200 GB.
>> These tables get auto-vacuumed but clearly it's not, well, aggressively
>> reclaiming space.
>>
>> Given that they are partitions, VACUUM FULL locks the parent table, so new
>> writes/deletes can't happen, and anyway people like to run queries against
>> these tables quite frequently.
>>
>>
>> Glad to add any detail I can.
>>
>>
>> On Sun, May 14, 2023 at 11:32 AM Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>>
>> On Sun, May 14, 2023 at 1:46 PM Wells Oliver <wells(dot)oliver(at)gmail(dot)com> wrote:
>>
>> Noticed over the last few days how much more space VACUUM FULL reclaims
>> (for obvious reasons) and I am wondering if there are any solutions for a
>> more fruitful VACUUM without the locks of FULL?
>>
>>
>>
>> There are lots of theories which are all compatible with the minimal amount
>> of info given here. If the space is freed from the indexes, then just
>> reindexing (which can be done concurrently in recent versions) could
>> reclaim the space. If it is from the table itself then something like
>> pg_squeeze (https://github.com/cybertec-postgresql/pg_squeeze/) might help.
>> Or you can implement your own squeezing, by using a CTE with a ctid scan to
>> delete tuples from the end of the table and reinsert them in the front,
>> then use regular vacuum to truncate the now-free space off the end of the
>> table.
>>
>>
>> I would think the more fruitful VACUUM would be to prevent the bloat from
>> occuring in the first place but without knowing the root cause I can't
>> offer anything but platitudes there.
>>
>>
>> Cheers,
>>
>>
>> Jeff
>>
>>
>>
>> --
>>
>> Wells Oliver
>> wells(dot)oliver(at)gmail(dot)com
>
>
>
> --
> Wells Oliver
> wells(dot)oliver(at)gmail(dot)com

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Peter Geoghegan 2023-05-15 00:34:05 Re: Options for more aggressive space reclamation in vacuuming?
Previous Message Ron 2023-05-14 22:43:07 Re: Options for more aggressive space reclamation in vacuuming?