Re: error-tolerant COPY FROM

From: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
To: "Joolz" <joolz(at)arbodienst-limburg(dot)nl>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: error-tolerant COPY FROM
Date: 2005-02-04 13:15:56
Message-ID: E74473BD-76AE-11D9-8DF0-000D933565E8@mail.nih.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I use a trigger on tables with foreign key references to either ignore
the insert row or insert an appropriate matching row in the referenced
table, if it does not exist. In the function, I just raise a notice
that I am doing this. This is a simple example:
create or replace function tgf_insert_gene_id() returns trigger as $$
declare
gene_id_chk integer;
begin
select into gene_id_chk gene_id from g_main where
gene_id=NEW.gene_id;
if (NOT FOUND) THEN
BEGIN
INSERT into g_main (gene_id,name) values
(NEW.gene_id,'INSERTED FOR FOREIGN KEY CHECK');
RAISE NOTICE 'Inserting a gene ID to meet foreign key
requirements';
END;
END IF;
return NEW;
end;$$ language plpgsql;

Just make a trigger for the tables of interest (I have several tables
that reference gene_id in g_main) so they can all use the same trigger.

This function simply looks up the proposed key to the foreign key table
(gene_id) in the foreign key table. If it is (NOT FOUND) then it
inserts the key into the foreign table (g_main) the gene_id about to be
inserted and then raises a notice (which goes into the log if you have
logging set up to log notices). Then, the function returns and,
because we did the insert, foreign key checks work fine. You could
just as easily choose to NOT insert a record that doesn't meet foreign
key checks, or even do an insert into a separate table instead that
contains all of the rows that don't match a foreign key.

Hope this helps....
Sean

On Feb 4, 2005, at 7:32 AM, Joolz wrote:

> Hello everyone,
>
> I'm building a postgresql db which will have to get lots of data
> from "the outside" (customers, that is). The db has lots of
> constraints, and I'm sure that our customers will offer lots of
> invalid information. We receive the information in csv format. My
> first thought was to read them into the database with COPY, but
> "COPY stops operation at the first error."
>
> What I need is an import where all valid lines from the csv files
> are read into the db, and I also get a logfile for all invalid
> lines, stating the line number plus the pg error message so I can
> see which constraint was violated.
>
> I can't think of a direct, elegant solution for this, does anyone
> have any suggestions? Thanks a lot!
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Csaba Nagy 2005-02-04 13:23:52 Re: error-tolerant COPY FROM
Previous Message Joolz 2005-02-04 13:06:19 Re: error-tolerant COPY FROM