Re: Bulkloading using COPY - ignore duplicates?

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc:	Lee Kindness <lkindness(at)csl(dot)co(dot)uk>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Jim Buttafuoco <jim(at)buttafuoco(dot)net>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Bulkloading using COPY - ignore duplicates?
Date:	2002-01-02 22:18:27
Message-ID:	7632.1010009907@sss.pgh.pa.us
Lists:	pgsql-hackers

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> I think we can allow something like:
> COPY FROM '/tmp/x' WITH ERRORS 2

> Yes, I realize we need subtransactions or something, but we should add
> it to the TODO list if it is a valid request, right?

Well, I don't like that particular API in any case. Why would I think
that 2 errors are okay and 3 are not, if I'm loading a
many-thousand-line COPY file? Wouldn't it matter *what* the errors
are, at least as much as how many there are? "Discard duplicate rows"
is one thing, but "ignore bogus data" (e.g., unrecognizable timestamps)
is not the same animal at all.

As someone already remarked, the correct, useful form of such a feature
is to echo the rejected lines to some sort of output file that I can
look at afterwards. How many errors there are is not the issue.
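
A minimal sketch of that idea, written as a client-side loader in Python rather than as anything inside COPY itself. The function name `load_with_rejects`, the two-column tab-separated input (integer key, timestamp), and the layout of the reject output are all assumptions made for illustration; none of them come from the thread. The point it shows is that each rejected line is echoed with its line number and the *kind* of error, so duplicates and bogus data can be told apart afterwards instead of merely counted.

```python
import io
from datetime import datetime


def load_with_rejects(lines, reject_out):
    """Load tab-separated (key, timestamp) lines.

    Hypothetical helper: accepted rows are returned as a dict keyed by the
    integer key; every rejected line is written to reject_out as
    "<line number>\t<reason>\t<original line>".
    """
    loaded = {}
    for lineno, line in enumerate(lines, start=1):
        raw = line.rstrip("\n")
        fields = raw.split("\t")
        try:
            if len(fields) != 2:
                raise ValueError("expected 2 fields")
            key = int(fields[0])
            ts = datetime.strptime(fields[1], "%Y-%m-%d %H:%M:%S")
        except ValueError as exc:
            # Bogus data: record why it failed, not just that it failed.
            reject_out.write(f"{lineno}\tbad data: {exc}\t{raw}\n")
            continue
        if key in loaded:
            # Duplicate key: a different class of rejection, labeled as such.
            reject_out.write(f"{lineno}\tduplicate key\t{raw}\n")
            continue
        loaded[key] = ts
    return loaded
```

Passing an open file as `reject_out` gives the "output file that I can look at afterwards"; an `io.StringIO()` works the same way for a quick trial. No error limit appears anywhere, which matches the argument above that the count is not the interesting quantity.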

regards, tom lane

In response to

Re: Bulkloading using COPY - ignore duplicates? at 2002-01-02 22:02:26 from Bruce Momjian

Responses

Re: Bulkloading using COPY - ignore duplicates? at 2002-01-02 23:40:20 from Bruce Momjian
