efficiently migrating 'old' data from one table to another

From: Jonathan Vanasco <postgres(at)2xlp(dot)com>
To: pgsql-general general <pgsql-general(at)postgresql(dot)org>
Subject: efficiently migrating 'old' data from one table to another
Date: 2017-01-12 20:06:14
Message-ID: 0CF75CDE-93CC-4E46-9D0A-0A91C4F9C4F0@2xlp.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I'm just wondering if there's a more efficient way of handling a certain periodic data migration.

We have a pair of tables with this structure:

table_a__live
column_1 INT
column_2 INT
record_timestamp TIMESTAMP

table_a__archive
column_1 INT
column_2 INT
record_timestamp TIMESTAMP

periodically, we must migrate items that are 'stale' from `table_a__live ` to `table_a__archive`. The entries are copied over to the archive, then deleted.

The staleness is calculated based on age-- so we need to use INTERVAL. the "live" table can have anywhere from 100k to 20MM records.

the primary key on `table_a__live` is a composite of column_1 & column_2,

In order to minimize scanning the table, we opted to hint migrations with a dedicated column:

ALTER TABLE table_a__live ADD is_migrate BOOLEAN DEFAULT NULL;
CREATE INDEX idx_table_a__live_migrate ON table_a__live(is_migrate) WHERE is_migrate IS NOT NULL;

so our migration is then based on that `is_migrate` column:

BEGIN;
UPDATE table_a__live SET is_migrate = TRUE WHERE record_timestamp < transaction_timestamp() AT TIME ZONE 'UTC' - INTERVAL '1 month';
INSERT INTO table_a__archive (column_1, column_2, record_timestamp) SELECT column_1, column_2, record_timestamp FROM table_a__live WHERE is_migrate IS TRUE;
DELETE FROM table_a__live WHERE is_migrate IS TRUE;
COMMIT;

The inserts & deletes are blazing fast, but the UPDATE is a bit slow from postgres re-writing all the rows.

can anyone suggest a better approach?

I considered copying everything to a tmp table then inserting/deleting based on that table -- but there's a lot of disk-io on that approach too.

fwiw we're on postgres9.6.1

Responses

Browse pgsql-general by date

  From Date Subject
Next Message btober@computer.org 2017-01-12 20:19:19 Re: efficiently migrating 'old' data from one table to another
Previous Message Jonathan Vanasco 2017-01-12 17:09:49 Re: temporarily disable autovacuum on a database or server ?