Re: Schedule pg_repack job with pg_cron

From: Ron Johnson <ronljohnsonjr(at)gmail(dot)com>
To: Pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: Re: Schedule pg_repack job with pg_cron
Date: 2024-08-07 19:39:33
Message-ID: CANzqJaDnbxuMCV1vF+GfuA=bFWuMo+45cnQK1H4AW3YBqi23eg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Wed, Aug 7, 2024 at 3:29 PM Rui DeSousa <rui(dot)desousa(at)icloud(dot)com> wrote:

> On Aug 7, 2024, at 9:24 AM, jacob ndinkwa <jndinkwa(at)gmail(dot)com> wrote:
>
> To schedule a pg_repack job using pg_cron on an Amazon RDS for PostgreSQL
> instance, you need to follow a few steps. However, it’s important to note
> that pg_cron is only supported on certain versions of Amazon RDS, and
> pg_repack is also a separate extension that must be installed and enabled.
>
>
> Is scheduling pg_repack just a bad idea and just introducing just more
> bloat? Why not just tune auto vacuum?
>
> 80/20 rule… most schemas are going to have their large/hot tables, etc and
> data has a natural life cycle. If you have a heathy application then bloat
> is not an issue as free space is used by new tuples. Each database has a
> data flow to it depending on the maturity and nature of the
> application/database. Exiting tuples make room for new tuples, etc.
>
> If your have to vacuum full / pg_repack your tables on a scheduled bases
> then I think there is something very wrong with your application.
>
> Pg_repack will do more harm in the long run. i.e. the entire time
> pg_repack is running xmin is frozen thus creating more bloat everywhere
> else!
>
> Bloat is overrated; especially in a transaction system where all your data
> access patterns should be well defined and not doing full table scans. Just
> focus on identifying bloated indexes periodically and rebuilding those.
> There should be no need to vacuum full tables under normal circumstances.
>

Part of a properly-maintained system is *regularly* archive/purging
(whether that be dropping date-based partitions, or deleting old data from
unpartitioned tables or tables partitioned by something other than a date).

For example, I gave a list of tables (all intertwined via FK constraints)
to the application support people, and they returned the list stating how
many weeks or months of data to retain in each table. Every Saturday night
a cron job goes through and deletes the old data from, and then "manually"
vacuum-analyzes them.

No bloat...

--
Death to America, and butter sauce.
Iraq lobster!

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Teja Jakkidi 2024-08-07 20:47:26 Statement_timeout effect on replication user
Previous Message Rui DeSousa 2024-08-07 19:29:15 Re: Schedule pg_repack job with pg_cron