Re: COPY from .csv File and Remove Duplicates

From: Rich Shepard <rshepard(at)appl-ecosys(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: COPY from .csv File and Remove Duplicates
Date: 2011-08-12 00:00:32
Message-ID: alpine.LNX.2.00.1108111656050.14240@salmo.appl-ecosys.com
Lists: pgsql-general

On Thu, 11 Aug 2011, David Johnston wrote:

> If you have duplicates with matching real keys, inserting into a staging
> table and then moving new records to the final table is your best option
> (in general it is better to do a two-step with a staging table since you
> can readily use PostgreSQL to perform any intermediate translations). As
> for the import itself,

David,

I presume what you call a staging table is what I refer to as a copy of
the main table, but with no key attribute.

Writing the SELECT statement to delete from the staging table those rows
that already exist in the main table is where I'm open to suggestions.
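For illustration, a minimal sketch of such a statement, assuming the two tables share columns named a, b, and c (hypothetical names; substitute the real column list). In PostgreSQL, a DELETE with a USING clause performs the anti-join directly:

```sql
-- Remove from staging any row whose (a, b, c) values already
-- exist in the main table. Table and column names are assumed.
DELETE FROM staging s
USING main m
WHERE m.a = s.a
  AND m.b = s.b
  AND m.c = s.c;
```

One caveat: with plain `=`, NULLs never compare equal, so rows containing NULLs would survive the delete. If any of the compared columns are nullable, `m.a IS NOT DISTINCT FROM s.a` is the NULL-safe comparison.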

> In this case I would just import the data to a staging table without any
> kind of artificial key, just the true key,

There is no true key, only an artificial key so I can ensure that rows are
unique. That key exists in the main table with the 50K rows; there is no key
column in the .csv file.
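Given that the .csv carries no key column, one way to sketch the whole two-step (table names, column names, and the file path here are all assumptions): COPY into the keyless staging table, then insert only rows not already present, letting the main table's serial column assign the artificial key on the way in.

```sql
-- COPY the keyless .csv into the staging table (path is hypothetical).
COPY staging (a, b, c) FROM '/tmp/data.csv' WITH CSV;

-- Insert only rows not already in main. DISTINCT also collapses
-- duplicates within the .csv itself; the serial key column on
-- main is filled in automatically.
INSERT INTO main (a, b, c)
SELECT DISTINCT s.a, s.b, s.c
FROM staging s
WHERE NOT EXISTS (
  SELECT 1 FROM main m
  WHERE m.a = s.a AND m.b = s.b AND m.c = s.c
);
```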

Thanks,

Rich
