Re: Getting the list of foreign keys (for deleting data from the database)

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Mario Splivalo <mario(at)splivalo(dot)hr>, pgsql-sql(at)postgresql(dot)org
Subject: Re: Getting the list of foreign keys (for deleting data from the database)
Date: 2015-08-02 15:25:07
Message-ID: 55BE3653.8020207@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On 08/02/2015 08:04 AM, Mario Splivalo wrote:
> I have a large, in-house built, ERP system that I need to clean up from
> old/stale data.
>
> As all the tables are FK-related I could do 'DELETE FROM' from the
> top-most table (invoices, or stock documents, or whatever) to remove all
> data from all the related tables, but that is, of course, extremely slow
> (The datadir is around 20GB in size, and I need to remove 4/5 of the
> data from the database - fiscal years 2014, 2013, 2012 and 2011 - only
> 2015 should remain).

I have an answer of sorts below.

I do have some questions in the meantime though.

What is the purpose of an ERP that has no history?

In particular how do you do the P(lan) part without reference to the past?

>
> Instead of doing DELETE FROM table WHERE date_created < '2015-01-01' I
> was thinking of doing something like this:
>
> SELECT foo_drop_all_constraints();
> SELECT * FROM table INTO table_copy WHERE date_created >= '2015-01-01';
> DROP TABLE table;
> ALTER TABLE table_copy RENAME TO table;
> SELECT foo_restore_all_constraints();
>
> Of course, this is simple if I have only one table, but when there is
> over 400 tables that are 'linked' with foreign keys, things get a bit
> complicated.
>
> Suppose I have a table_detail that has column table_id which is FK
> pointing to table(id), I would need to do something like this:
>
> SELECT foo_drop_all_constraints();
> SELECT * FROM table INTO table_copy WHERE date_created >= '2015-01-01';
> SELECT table_detail.* INTO table_detail_copy FROM table_detail JOIN
> table_copy ON table_detail.table_id = table_copy.id
> DROP TABLE table;
> DROP TABLE table_copy
> ALTER TABLE table_copy RENAME TO table;
> ALTER TABLE table_detail_copy TO table_detail;
> SELECT foo_restore_all_constraints();
>
> Now, what am I asking is - is there a tool which would help me find all
> the _detail tables? I know I could query pg_constraints and similar
> views but before I go onto hacking into those I'm wondering if there is
> something that could aid me in doing so.

My guess is for this case it will be less resource intensive to just do
the DELETE(s), in smaller batches then a year, then to replicate the
referential integrity in your own code.

If this is going to be a regular(yearly) thing I would look at partitioning:

http://www.postgresql.org/docs/9.4/static/ddl-partitioning.html

>
> Of course, if this is not the best approach I'd appreciate different
> views/opinions.
>
> Mario
>
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Mario Splivalo 2015-08-02 15:44:11 Re: Getting the list of foreign keys (for deleting data from the database)
Previous Message Thomas Kellerer 2015-08-02 15:20:08 Re: Getting the list of foreign keys (for deleting data from the database)