From: Andrew - Supernews <andrew+nonews(at)supernews(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Modifying COPY TO
Date: 2005-02-26 04:07:41
Message-ID: slrnd1vtgd.2dej.andrew+nonews@trinity.supernews.net
Lists: pgsql-hackers
On 2005-02-25, "Dave Held" <dave(dot)held(at)arrayservicesgrp(dot)com> wrote:
> A possibility that I would like to pursue is to keep the staging data
> from the previous day, do a COPY TO, import the new data into
> another staging table with a COPY FROM, then export the fresh
> data with another COPY TO. Then, I can write a fast C/C++
> program to do a line-by-line comparison of each record, isolating
> the ones that have changed from the previous day. I can then
> emit those records in a change file that should be relatively small
> and easy to update.
I have an application that does something like this, but rather than use an
external program, I do the comparison in the database itself:
- import data from external system into a temporary table
- compare the temporary table against the live data (a full outer join
  is a convenient way of doing this; I create an index on the temp table
  first)
- perform insert/update/delete for each record that was added, changed
or removed
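
A minimal sketch of the compare step, assuming a hypothetical key column id and
a single data column payload on tables named live_data and staging (the real
schema and the pl/pgsql wrapper are not shown in the post):

```sql
-- Index the freshly loaded temp table before joining, as described above.
CREATE INDEX ON staging (id);
ANALYZE staging;

-- The full outer join classifies every key in one pass:
-- present only in staging -> insert, only in live_data -> delete,
-- in both but different -> update.
SELECT COALESCE(s.id, l.id) AS id,
       CASE WHEN l.id IS NULL THEN 'insert'
            WHEN s.id IS NULL THEN 'delete'
            ELSE 'update' END AS action
FROM staging s
FULL OUTER JOIN live_data l ON l.id = s.id
WHERE l.id IS NULL
   OR s.id IS NULL
   OR s.payload IS DISTINCT FROM l.payload;

-- The corresponding DML, one statement per action:
INSERT INTO live_data (id, payload)
SELECT s.id, s.payload
FROM staging s LEFT JOIN live_data l ON l.id = s.id
WHERE l.id IS NULL;

UPDATE live_data l
SET payload = s.payload
FROM staging s
WHERE l.id = s.id
  AND s.payload IS DISTINCT FROM l.payload;

DELETE FROM live_data l
WHERE NOT EXISTS (SELECT 1 FROM staging s WHERE s.id = l.id);
```

IS DISTINCT FROM (rather than <>) is used so that NULL payloads compare
sanely; with many data columns the same test can be applied row-wise by
comparing the two row values directly.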
In my case the compare/update is done in a pl/pgsql function. My data is only
2-3 million rows, a bit smaller than yours, but I have to update hourly,
not daily, and can spend no more than 5-10 minutes on each update (currently
I can do it in 5: 2 to load the data, 3 to do the compare/update).
--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services