From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Emmanuel Cecchet <Emmanuel(dot)Cecchet(at)asterdata(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Selena Deckelmann <selenamarie(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: COPY enhancements |
Date: | 2009-10-08 23:30:50 |
Message-ID: | 1255044650.6335.15.camel@ebony |
Lists: | pgsql-hackers |
On Thu, 2009-10-08 at 18:23 -0400, Bruce Momjian wrote:
> Dimitri Fontaine wrote:
> > Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > > It will be best to have the ability to have a specific rejection reason
> > > for each row rejected. That way we will be able to tell the difference
> > > between uniqueness violation errors, invalid date format on col7, value
> > > fails check constraint on col22, etc.
> >
> > In case that helps, what pgloader does is log to two files, named
> > after the table name (not scalable to a server-side solution):
> > table.rej --- lines it could not load, straight from source file
> > table.rej.log --- errors as given by the server, plus pgloader comment
> >
> > The pgloader comment is necessary for associating each log line with
> > the source file line: as it operates by dichotomy, the server always
> > reports the error on line 1.
> >
> > The idea of having two error files could be kept though; the aim is to
> > be able to fix the setup and then COPY the table.rej file again when it
> > turns out the errors are not in the file content. Or to load it into
> > another table, with all columns as text or bytea, and then clean the
> > data from a procedure.
>
> What would be _cool_ would be to add the ability to have comments in the
> COPY files, like \#, and then the copy data lines and errors could be
> adjacent. (Because of the way we control COPY escaping, adding \# would
> not be a problem. We have \N for null, for example.)

That was my idea also until I heard Dimitri's two file approach.
Having a pristine data file and a matching error file means you can
potentially just resubmit the error file again. Often you need to do
things like trap RI (referential integrity) errors and then resubmit
them at a later time once the master rows have entered the system.
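
To make the two file idea concrete, here is a minimal sketch of loading by
dichotomy with a reject file and a matching error log. It is only an
illustration, not pgloader's actual code: load_with_rejects and copy_batch
are hypothetical names, and copy_batch is assumed to load a list of rows
atomically (nothing is kept if any row fails), for example one COPY per
transaction.

```python
import io


def load_with_rejects(rows, copy_batch, rej, log, first_line=1):
    """Load rows with copy_batch, bisecting failed batches to isolate bad rows.

    rows       -- raw source lines, without trailing newlines
    copy_batch -- hypothetical callable; assumed atomic, raises on any error
    rej        -- writable text file: rejected lines, straight from the source
    log        -- writable text file: one error message per rejected line
    first_line -- source line number of rows[0], used to annotate the log
    Returns the number of rows loaded.
    """
    if not rows:
        return 0
    try:
        copy_batch(rows)
        return len(rows)
    except Exception as exc:
        if len(rows) == 1:
            # A single failing row: keep the data pristine in the reject
            # file and put the reason, with its source line, in the log.
            rej.write(rows[0] + "\n")
            log.write(f"line {first_line}: {exc}\n")
            return 0
        # Otherwise split the batch in half and retry each half separately.
        mid = len(rows) // 2
        left = load_with_rejects(rows[:mid], copy_batch, rej, log, first_line)
        right = load_with_rejects(rows[mid:], copy_batch, rej, log,
                                  first_line + mid)
        return left + right


# Usage with a stand-in loader that rejects any row containing "bad".
loaded = []


def fake_copy(batch):
    for row in batch:
        if "bad" in row:
            raise ValueError(f"invalid input: {row}")
    loaded.extend(batch)  # only reached when the whole batch is clean


rej_file, log_file = io.StringIO(), io.StringIO()
count = load_with_rejects(["1,a", "bad,b", "3,c", "4,bad"],
                          fake_copy, rej_file, log_file)
```

Because the reject file holds only untouched source lines, it can be fed
back through the same function later, for instance once the missing master
rows exist, while the log keeps the per-row reasons separate.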
--
Simon Riggs www.2ndQuadrant.com