Re: Using the database to validate data

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Jon Lapham <lapham(at)jandr(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: Using the database to validate data
Date: 2015-07-23 20:48:36
Message-ID: 55B15324.4000205@aklaver.com
Lists: pgsql-general

On 07/23/2015 12:04 PM, Jon Lapham wrote:
> On 07/23/2015 03:02 PM, Adrian Klaver wrote:
>> http://pgloader.io/
>
> Ok, thanks, I'll look into pgloader's data validation abilities.
>
> However, my naive understanding of pgloader is that it is used to
> quickly load data into a database, which is not what I am looking to do.
> I want to validate data integrity *before* putting it into the database.
> If there is a problem with any part of the data, I don't want any of it
> in the database.

I misunderstood; I thought you just wanted information on the rows that
did not get in. pgloader handles that by writing the rejected rows to a
*.dat file and the Postgres log messages explaining why they were
rejected to a *.log file.
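
For example (a minimal sketch; the file and table names here are made
up, and the exact WITH options depend on your data), a command file
along these lines:

    -- validate.load (hypothetical)
    LOAD CSV
         FROM 'input.csv'
         INTO postgresql:///mydb?target_table
         WITH skip header = 1,
              fields terminated by ',';

run as

    pgloader --root-dir /tmp/pgloader validate.load

should leave the rejected rows and the reasons for rejection under the
--root-dir as the *.dat and *.log files mentioned above.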

<Thinking out loud, not tested>

I could still see making use of this via the --before <file_name>
option, where file_name contains a CREATE TEMPORARY TABLE some_table
script that mimics the permanent table. pgloader would then load
against the temporary table, write out any errors, and the table would
be dropped at the end of the session. This would not put the data into
the permanent table on complete success, though; that would require
some magic in AFTER LOAD EXECUTE that I have not come up with yet :)
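
A rough sketch of that idea, with placeholder names throughout:

    -- before.sql: scratch table shaped like the permanent one.
    -- LIKE copies column definitions, NOT NULL and CHECK constraints
    -- (and, with INCLUDING ALL, indexes and defaults), but not foreign
    -- keys, so FK validation would not be exercised here.
    CREATE TEMPORARY TABLE some_table
        (LIKE permanent_table INCLUDING ALL);

with the command file's INTO target pointed at some_table and the whole
thing run as

    pgloader --before before.sql validate.load

One open question is whether a temporary table created by the --before
script is visible on the connection(s) pgloader uses for the actual
COPY; if not, a plain (non-temporary) scratch table dropped afterwards
would do the same job.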

</Thinking out loud, not tested>
>
> -Jon
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
