From: Lee Kindness <lkindness(at)csl(dot)co(dot)uk>
To: Patrick Welche <prlw1(at)newn(dot)cam(dot)ac(dot)uk>
Cc: Lee Kindness <lkindness(at)csl(dot)co(dot)uk>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Bulkloading using COPY - ignore duplicates?
Date: 2001-12-13 15:44:31
Message-ID: 15384.52447.737930.610518@elsick.csl.co.uk
Lists: pgsql-hackers
Patrick Welche writes:
> On Thu, Dec 13, 2001 at 01:25:11PM +0000, Lee Kindness wrote:
> > That's what I'm currently doing as a workaround - a SELECT DISTINCT
> > from a temporary table into the real table with the unique index on
> > it. However this takes absolute ages - say 5 seconds for the copy
> > (which is the ballpark figure I'm aiming toward and can achieve with
> > Ingres) plus another 30ish seconds for the SELECT DISTINCT.
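For concreteness, that workaround looks something like the following
sketch, using the same illustrative table as the example further down;
a plain DISTINCT assumes the duplicate rows are identical in full, not
just in the indexed columns:

CREATE TEMP TABLE tmp (p1 INT, p2 INT, other1 INT, other2 INT);
COPY tmp FROM 'file';
-- collapse fully-identical rows while moving into the real table
INSERT INTO tab SELECT DISTINCT * FROM tmp;
DROP TABLE tmp;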
> Then your column really isn't unique,
That's another discussion entirely ;) - it's spat out by a real-time
system which doesn't have the time or resources to check this. Further
precision loss later in the data's life adds more duplicates...
> so how about dropping the unique index, import the data, fix the
> duplicates, recreate the unique index - just as another possible
> work around ;)
This is just going to be the same(ish) time, no?
CREATE TABLE tab (p1 INT, p2 INT, other1 INT, other2 INT);
COPY tab FROM 'file';
-- keep one row per (p1, p2) pair; relies on the table having OIDs,
-- which is the default
DELETE FROM tab WHERE oid NOT IN (SELECT min(oid) FROM tab
                                  GROUP BY p1, p2);
CREATE UNIQUE INDEX tab_idx ON tab USING BTREE(p1, p2);
or am I missing something?
Thanks, Lee.
--
Lee Kindness, Senior Software Engineer, Concept Systems Limited.
http://services.csl.co.uk/ http://www.csl.co.uk/ +44 131 5575595