Re: Reg data purging/archiving

From: Ron Johnson <ronljohnsonjr(at)gmail(dot)com>
To: Pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: Re: Reg data purging/archiving
Date: 2024-04-05 20:39:06
Message-ID: CANzqJaBqFaDq6b7Uepy6+bHKbVCpMz6bECOX9Yr4aOVitiOB8g@mail.gmail.com
Lists: pgsql-admin

On Fri, Apr 5, 2024 at 3:14 PM srinivasan s <srinioracledba7(at)gmail(dot)com>
wrote:

> Dear all,
>
> I hope this message finds you well.
>
> I would greatly appreciate your input and suggestions regarding the
> removal of data from our largest table, which currently exceeds 23TB and
> contains several years' worth of data. Our objective is to retain only the
> data from the past two years in this table. It is important to note that
> this table is not partitioned.
>

> Considering the critical nature of our business database, it is imperative
> that the purging process does not adversely affect the ongoing production
> performance. Therefore, I am seeking the best approach to accomplish this
> task without compromising system performance.
>
> One potential solution I am considering is to create a new table and
> gradually transfer the data from the large table to the new one in smaller
> batches. Once the data transfer is complete, we can then rename the tables
> accordingly and perform a final cutover. However, I am uncertain if this
> approach will yield the desired results, and I believe it is necessary to
> conduct thorough testing before proceeding.
>

Is the table ever updated? An INSERT-only table would make such a scheme
much easier to implement.
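
For concreteness, here is a minimal sketch of that copy-and-swap approach,
assuming the table really is insert-only and is named big_table with a
timestamp column created_at (both names are illustrative, not from your
schema):

CREATE TABLE big_table_new (LIKE big_table INCLUDING ALL);

-- Copy the rows you want to keep in small date-range batches, repeating
-- with successive ranges so no single statement runs too long:
INSERT INTO big_table_new
SELECT * FROM big_table
WHERE created_at >= DATE '2022-05-01'
  AND created_at <  DATE '2022-06-01';

-- Final cutover in one short transaction during a quiet window:
BEGIN;
LOCK TABLE big_table IN ACCESS EXCLUSIVE MODE;
INSERT INTO big_table_new         -- pick up rows added since the last batch
SELECT * FROM big_table
WHERE created_at >= DATE '2024-04-01';
ALTER TABLE big_table RENAME TO big_table_old;
ALTER TABLE big_table_new RENAME TO big_table;
COMMIT;

Note that LIKE ... INCLUDING ALL copies columns, defaults, constraints, and
indexes, but never foreign-key constraints, so any of those would have to be
recreated by hand before the swap.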

> If any of you have suggestions or ideas on how to efficiently purge data
> older than two years from such a massive table without impacting production
> performance, I would greatly appreciate your insights.
>

1. What PG version are you running?
2. How many indices?
3. Does an index on the relevant date field exist? That also would make
such a scheme much easier to implement.
4. How much downtime can you get?
5. If nothing else, chip away at the old data, a few "table days" every
calendar day (see the sketch below). Naturally, a supporting index is
required, and a vacuum+analyze is needed after each round of deletes.
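
A minimal sketch of option 5, again with illustrative names (big_table,
created_at):

-- The supporting index; CONCURRENTLY builds it without blocking writes,
-- though it takes longer and can't run inside a transaction block:
CREATE INDEX CONCURRENTLY IF NOT EXISTS big_table_created_at_idx
    ON big_table (created_at);

-- Each calendar day, delete a few of the oldest "table days":
DELETE FROM big_table
WHERE created_at < (SELECT min(created_at) + interval '3 days'
                    FROM big_table);

-- Then reclaim the dead tuples and refresh planner statistics:
VACUUM ANALYZE big_table;

Bear in mind that plain VACUUM makes the space reusable within the table but
does not shrink the files on disk; returning that space to the OS would take
VACUUM FULL, pg_repack, or the copy-and-swap approach above.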
