Re: time taking deletion on large tables

From: Rui DeSousa <rui(at)crazybean(dot)net>
To: Atul Kumar <akumar14871(at)gmail(dot)com>
Cc: pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: Re: time taking deletion on large tables
Date: 2020-12-03 18:45:33
Message-ID: 35E33498-8CD9-4D81-A641-42343342D337@crazybean.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-docs pgsql-in-general pgsql-performance

> On Dec 3, 2020, at 9:45 AM, Atul Kumar <akumar14871(at)gmail(dot)com> wrote:
>
> The feed_posts table has over 50 Million rows.
>
> When I m deleting all rows of a certain type that are over 60 days old.
>
> When I try to do a delete like this: it hangs for an entire day, so I
> need to kill it with pg_terminate_backend(pid).

Delete the records in batches. I have used this approach many times successfully for large tables that are highly active on live production systems.

You’ll have to find the correct batch size to use for your dataset while keeping the run time short; i.e. 30 seconds. Then repeatedly call the function using a script — I’ve used a perl script with the DBI module to accomplish it.

i.e.

create or replace function purge_feed_post (_purge_date date, _limit int default 5000)
returns int
as
$$
declare
_rowcnt int;
begin
create temp table if not exists purge_feed_post_set (
feed_post_id int
)
;

/* Identify records to be purged */
insert into purge_feed_post_set (
feed_post_id
)
select feed_post_id
from feed_posts
where created_at < _purge_date
order by created_at
limit _limit
;

/* Remove old records */
delete from feed_posts using purge_feed_post_set
where feed_posts.feed_post_id = purge_feed_post_set.feed_post_id
;

get diagnostics _rowcnt = ROW_COUNT;

delete from purge_feed_post_set;

return _rowcnt;
end;
$$ language plpgsql
set search_path = public
;

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Henry Francisco Garcia Cortez 2020-12-03 19:41:58 autovacuum_work_mem and maintenance_work_mem
Previous Message Yambu 2020-12-03 18:28:04 Moving a table/index to different tablespace

Browse pgsql-docs by date

  From Date Subject
Next Message William Sescu (Suva) 2020-12-04 08:15:43 Schemaverse pointing to a dead link
Previous Message Andrew Dunstan 2020-12-03 17:00:44 Re: time taking deletion on large tables

Browse pgsql-in-general by date

  From Date Subject
Next Message Piyush 2021-09-06 10:59:04 Pg Server-side-cursors
Previous Message Andrew Dunstan 2020-12-03 17:00:44 Re: time taking deletion on large tables

Browse pgsql-performance by date

  From Date Subject
Next Message Nagaraj Raj 2020-12-03 19:58:15 Temporarily disable not null constraints
Previous Message Andrew Dunstan 2020-12-03 17:00:44 Re: time taking deletion on large tables