Re: Massive delete from a live production DB

From: Bill Moran <wmoran(at)potentialtech(dot)com>
To: Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
Cc: PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Massive delete from a live production DB
Date: 2011-05-12 15:24:23
Message-ID: 20110512112423.8d132b1a.wmoran@potentialtech.com
Lists: pgsql-general

In response to Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>:

> Hi
>
> Been reading some old threads (pre-9.x) and it seems that the
> consensus is to avoid doing massive deletes from a table, as it'll
> create so much dead, unrecoverable space that a VACUUM FULL would
> be needed. Etc.
>
> Instead, we might as well do a dump/restore. Faster, cleaner.
>
> This is all well and good, but what about a situation where the
> database is in production and cannot be brought down for this
> operation or even a cluster?
>
> Any ideas on what I could do without losing all the live updates? I
> need to get rid of about 11% of a 150-million-row table, with each
> row being roughly 1 to 5 KB in size...

Have you considered the following process (rough SQL sketch after the list):

1) SELECT the rows you want to keep into a new table (time-consuming)
2) Start outage
3) Pull over any new rows that might have been added between 1 & 2
4) Drop the old table
5) Rename the new table to the old name
6) Any other steps required to make the new table exactly like
   the old one (e.g. indexes, foreign keys, serials, etc)
7) End outage window
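
To make that concrete, here's a rough, untested SQL sketch of steps 1
and 3-5. The table name ("links"), the keep-condition, and the
assumption that new rows always get a larger "id" than anything copied
in step 1 are all hypothetical; adjust them for your schema.

-- Step 1: copy the rows you want to keep.  Long-running, but the old
-- table stays fully available while it runs.
CREATE TABLE links_new AS
    SELECT * FROM links
    WHERE created_at >= now() - interval '2 years';  -- hypothetical keep-condition

-- Step 3 (inside the outage window): catch rows added since step 1,
-- assuming ids only ever increase.
INSERT INTO links_new
    SELECT * FROM links
    WHERE id > (SELECT max(id) FROM links_new)
      AND created_at >= now() - interval '2 years';

-- Steps 4 and 5: swap the tables.  DDL is transactional in PostgreSQL,
-- so the detach/drop/rename can go in one transaction.  If other tables
-- have foreign keys pointing at "links", the DROP needs CASCADE and
-- those constraints must be re-created in step 6.
BEGIN;
-- detach the id sequence so it isn't dropped along with the old table
ALTER SEQUENCE links_id_seq OWNED BY NONE;
DROP TABLE links;
ALTER TABLE links_new RENAME TO links;
COMMIT;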

Because steps 3 through 6 are very fast, your outage window is very short.
Not a perfect, zero-downtime solution, but possibly helpful.
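
For step 6, note that CREATE TABLE AS does not carry over indexes,
constraints, column defaults, or privileges, so those have to be put
back by hand.  Again, the names below (links, links_id_seq, webapp) are
hypothetical:

-- re-create the primary key and any indexes the old table had
ALTER TABLE links ADD PRIMARY KEY (id);
CREATE INDEX links_created_at_idx ON links (created_at);

-- restore the serial behaviour: re-own the sequence and point the
-- column default back at it
ALTER SEQUENCE links_id_seq OWNED BY links.id;
ALTER TABLE links ALTER COLUMN id SET DEFAULT nextval('links_id_seq');

-- re-grant whatever the application roles had on the old table
GRANT SELECT, INSERT, UPDATE, DELETE ON links TO webapp;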

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
