Save failed records into auxiliary table

From: Il Mimo di Creta <mimo(dot)creta(at)gmail(dot)com>
To: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Save failed records into auxiliary table
Date: 2020-11-12 09:00:44
Message-ID: CAPDeFrecDbyXrKkU6FCZnvcE1sq1bj27cZoP657Sh4nLVE1tgA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hello Everyone,

I am writing to ask help about a use case I have to set up: if anyone can
provide me with any suggestions, I would be really grateful.

This is my use case:
I have a table, with a primary key composed of two columns, each one, of
course, with a not null constraint.
Each night a bunch of data will be loaded from an external ETL, which is
out of my control.
Since I know there might be data quality issues, such as duplicated records
or invalid records with null values on not-null columns, I would like to
save such records, and only such records, in an auxiliary table.

Currently, I tried two solutions:
1) Trigger on insert: but I could not have the insert into the auxiliary
table working, because the constraints violation triggers a rollback
2) Rule on insert, with a "DO INSTEAD" condition, which works, but, in case
of invalid records, copies all the row of the transaction in the auxiliary
table and not only the invalid ones.

This is what I did:

Let's assume that my table is actually name "mytable", with pk composed of
(col1,col2).

The auxiliary table is mytable_failures, which has the same columns of
mytable, no constraints and an additional column "fail_reason".

These are the rules:
CREATE or REPLACE RULE insert_mytable_nulls AS
ON INSERT TO mytable
where (length(trim(NEW.col1))=0 OR
length(trim(NEW.col2))=0 OR
NEW.col1 IS NULL OR
NEW.col2 IS NULL)
DO INSTEAD
insert into mytable_failures values(NEW.*,'col1 and col2 cannot be null');

CREATE or REPLACE RULE insert_mytable_pkey AS
ON INSERT TO mytable
where (select true from mytable where col1=NEW.col1 and col2=NEW.col2)
DO INSTEAD
insert into mytable_failures values(NEW.*,'Primary Key violation');

If I execute the following transaction batch:

INSERT INTO mytable(
col1, col2, col3, col3, last_update)
VALUES ('UK', 'FB00004', 'en', now());
INSERT INTO mytable(
col1, col2, col3, col3, last_update)
VALUES ('UK', 'FB00005', 'en', now());
INSERT INTO mytable(
col1, col2, col3, col3, last_update)
VALUES ('UK', 'FB00004', 'en', now());

1) the first two rows are correctly inserted in mytable and the third
discarded
2) incorrectly, all the three rows in the mytable_failures.

Thank you for all the help you can provide

Mimo

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Juan José Santamaría Flecha 2020-11-12 11:03:10 Re: Two postgres masters
Previous Message Yambu 2020-11-12 07:51:40 Two postgres masters