Re: Archiving Data to Another DB?

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Don Seiler <don(at)seiler(dot)us>, pgsql-general(at)postgresql(dot)org
Subject: Re: Archiving Data to Another DB?
Date: 2018-04-11 17:58:10
Message-ID: ed6050c8-bd18-87c9-9522-ea1f2c07a5c8@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 04/11/2018 09:15 AM, Don Seiler wrote:
> Let's say I have two DBs: main (9.6.6) and archive (10.2).
>
> I have a table in main where I want to archive data older then 60 days.
> For various reasons, the table is not partitioned, so for now we must
> use DELETE. The destination table in the archive DB is partitioned with
> the new Pg10 partitioning.
>
> My initial plan was to have a stored procedure on the archive DB use
> postgres_fdw to do an INSERT INTO / SELECT to select the data remotely
> and insert into the local archive table. It would then issue a single
> DELETE command to remotely remove the data from the main DB. However I
> found that doing this resulted in the main DB calling thousands (perhaps
> millions if it's one-per-row) of individual DELETE statements based on a
> ctid column. Aside from WAL behavior concerns, it is flooding my
> postgresql server logs since I log any DML.
>
> On top of that, I'm told that a remote DELETE wouldn't be transactional,
> so if I were to compare inserted rows vs deleted rows and found a
> mismatch, I couldn't just rollback the DELETE. I plan to verify this
> with a small test case later but for now I'll assume this to be true.

That would be worth testing:

https://www.postgresql.org/docs/9.6/static/postgres-fdw.html

"F.33.3. Transaction Management

During a query that references any remote tables on a foreign server,
postgres_fdw opens a transaction on the remote server if one is not
already open corresponding to the current local transaction. The remote
transaction is committed or aborted when the local transaction commits
or aborts. Savepoints are similarly managed by creating corresponding
remote savepoints.

..."

>
> Right now I'm thinking of falling back to the far-less-elegant method of
> dumping the data to a flat file via COPY, running psql to connect to the
> archive DB remotely and running a COPY to load the data (or maybe
> transferring the flat file to the archive DB to load it there,
> offloading that part of the workload), then deleting the data from the
> main DB. I could capture the rows dumped in a control table and compare
> the rows deleted against that and then rollback the delete if necessary.

I may be missing something, but why not reverse your original set up?
Assuming transactional behavior works as expected something like:

1) Setup postgres_fdw in main database.

2) Create FOREIGN TABLE pointing to table in archive database.

3) INSERT INTO/SELECT from main table to archive table.

4) DELETE FROM main table.

>
> Like I said, not elegant, but I don't want to risk losing data that
> wasn't successfully archived to the archive DB. I'm very interested to
> hear what others might be doing for tasks like this.
>
> Don.
>
> --
> Don Seiler
> www.seiler.us <http://www.seiler.us>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Don Seiler 2018-04-11 18:13:31 Re: Archiving Data to Another DB?
Previous Message Rob Sargent 2018-04-11 17:37:08 Re: Archiving Data to Another DB?