From: | Lee Kindness <lkindness(at)csl(dot)co(dot)uk> |
---|---|
To: | Hannu Krosing <hannu(at)tm(dot)ee> |
Cc: | Lee Kindness <lkindness(at)csl(dot)co(dot)uk>, Patrick Welche <prlw1(at)newn(dot)cam(dot)ac(dot)uk>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Bulkloading using COPY - ignore duplicates? |
Date: | 2001-12-13 14:56:52 |
Message-ID: | 15384.49588.776088.349200@elsick.csl.co.uk |
Lists: | pgsql-hackers |
Hannu Krosing writes:
> Lee Kindness wrote:
> > The majority of database systems out there handle this situation in
> > one manner or another (MySQL ignores or replaces; Ingres ignores;
> > Oracle ignores or logs; others...). Indeed PostgreSQL currently checks
> > for duplicates in the COPY code but throws an elog(ERROR) rather than
> > ignoring the row, or passing the error back up the call chain.
> I guess postgresql will be able to do it once savepoints get
> implemented.
This is encouraging to hear. I can see how this would make the code
changes relatively minimal and more manageable - the changes to the
current code are simply over my head!
Are savepoints relatively high up on the TODO list, once 7.2 is out the
door?
> > My use of PostgreSQL is very time critical, and sadly this issue alone
> > may force an evaluation of Oracle's performance in this respect!
> Can't you clean the duplicates _outside_ postgresql, say
> cat dumpfile | sort | uniq | psql db -c 'copy mytable from stdin'
This is certainly a possibility; however, it really just moves the
processing elsewhere. The combined time is still about the same.
We've done a lot of investigation with approaches like this, and
also with techniques that assume the duplicates are local to one
another (which is a non-starter). None improve the situation.
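
For illustration only, here is a minimal sketch of the kind of external
pre-filter discussed above. Note that `sort | uniq` removes only lines
that are identical in full, whereas a duplicate-key error from COPY is
triggered by the key columns alone. The sketch below assumes
tab-delimited input with the unique key in the first column; the
function name, its parameters and the delimiter are all hypothetical
choices, not anything taken from PostgreSQL. It keeps the first row seen
for each key and holds every key in memory, so it is subject to the same
caveat: the work is moved elsewhere, not removed.

```python
from typing import Iterable, Iterator, Sequence


def drop_duplicate_keys(
    lines: Iterable[str],
    key_columns: Sequence[int] = (0,),  # assumed: key is the first column
    delimiter: str = "\t",              # assumed: tab-delimited COPY text
) -> Iterator[str]:
    """Yield each line whose key has not been seen before.

    The first occurrence of a key wins; later rows with the same key are
    skipped silently, which mimics an "ignore duplicates" mode.
    """
    seen = set()
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        key = tuple(fields[i] for i in key_columns)
        if key in seen:
            continue  # duplicate key: drop the row
        seen.add(key)
        yield line


# Small demonstration on in-memory rows (id, value).
rows = ["1\ta\n", "2\tb\n", "1\tc\n"]
print(list(drop_duplicate_keys(rows)))  # -> ['1\ta\n', '2\tb\n']
```

Wrapped in a small script that reads standard input and writes standard
output, a filter like this could take the place of `sort | uniq` in the
pipeline quoted above, feeding `psql db -c 'copy mytable from stdin'`.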
I'm not going to compare the time of just using INSERTs rather than
COPY...
Thanks for your response, Lee Kindness.
--
Lee Kindness, Senior Software Engineer, Concept Systems Limited.
http://services.csl.co.uk/ http://www.csl.co.uk/ +44 131 5575595