Re: COPY in Java?

From: Sam Gendler <sgendler(at)ideasculptor(dot)com>
To: Thomas Kellerer <spam_eater(at)gmx(dot)net>
Cc: PG-JDBC Mailing List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: COPY in Java?
Date: 2016-07-19 00:08:25
Message-ID: CAEV0TzBcouyDTLwjLc-LgxXKn94WHu4jjeN_OCcUMJ+qd4eRyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

On Mon, Jul 18, 2016 at 2:16 PM, Thomas Kellerer <spam_eater(at)gmx(dot)net> wrote:

>
>
> I assume the first problem could be solved by implementing your own Reader
> that "massages" each line while it reads the source file.
>
> But I don't think there is an efficient way to "catch" the bad rows
> through the Copy API.
>

I've done exactly this. A long time ago, but so long as you don't mind
redoing the entire COPY without the bad rows, it works just fine. If you
get lots of failures in any given batch, you'll be better off doing
individual inserts in large transactions and catching the failures there,
or writing a more sophisticated reader which catches the bad rows and
prevents them from even being attempted.

>
> But you might be interested in http://pgloader.io/ which is a tool
> similar to SQL*Loader it supports more formats then COPY does.
> I have not worked with it however

Is Kettle still around (java-based ETL tool)? It always had its
difficulties, in my experience, but for basic bulk inserts with redirection
of bad rows to a log, it works very well. Not the fastest to execute
relative to implementing a custom COPY reader, but if you aren't already
dealing with overloaded infrastructure and don't need to absolutely
maximize efficiency, it will get you up and running really quickly (once
you are familiar with the tool and over the learning curve, of course) and
isn't database-specific, so much more portable. I found it really easy to
write Java/Groovy/jython code which could instantiate the kettle tools and
run them in the context of other tools/scripts, making it quite modular.
It was all a long time ago or I'd provide more detail. Sorry. And none of
that should discredit the pgloader.io suggestion. I just have no
experience of it.

--sam

In response to

Browse pgsql-jdbc by date

  From Date Subject
Next Message John R Pierce 2016-07-19 02:11:42 Re: COPY in Java?
Previous Message Thomas Kellerer 2016-07-18 21:16:49 Re: COPY in Java?