Re: performance of loading CSV data with COPY is 50 times faster than Perl::DBI

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Matthias Apitz <guru(at)unixarea(dot)de>, pgsql-general(at)postgresql(dot)org
Subject: Re: performance of loading CSV data with COPY is 50 times faster than Perl::DBI
Date: 2020-01-31 18:32:27
Message-ID: 52a2d5e7-8443-25fb-af28-67f460a5e61e@aklaver.com
Lists: pgsql-general

On 1/31/20 10:24 AM, Matthias Apitz wrote:
>
> Hello,
>
> For ages, we have transferred data between different DBMSs (Informix,
> Sybase, Oracle, and now PostgreSQL) with our own tool, based on
> Perl::DBI, which produces a CSV-like export in a common format, i.e. an
> export from Oracle can be loaded into Sybase and vice versa. Export and
> import are done row by row, for some tables millions of rows.
>
> We produced a special version of the tool to export the rows into a
> format that PostgreSQL's COPY command understands, and found that
> importing the same data into PostgreSQL with COPY is about 50 times
> faster than with Perl::DBI: 2.5 minutes vs. 140 minutes for around 6
> million rows into an empty table without indexes.
>
> How can COPY do this so fast?

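Well, for one thing, COPY is a single statement, so it does everything
in a single transaction, which is both good and bad. The good is that it
is fast; the bad is that a single error will roll back the entire
operation. If your DBI loader runs with AutoCommit on (the DBI default),
every row is instead committed as its own transaction.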
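
To illustrate the all-or-nothing behaviour, here is a minimal sketch. It
uses Python's built-in sqlite3 module purely as a stand-in so that it
runs without a server; it is not PostgreSQL and not COPY. The table t
and the helper names are made up for the example.

```python
import sqlite3

def make_demo_db():
    """Create an in-memory stand-in database (hypothetical table t)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.commit()
    return conn

def load_all_or_nothing(conn, rows):
    """Insert all rows in one transaction; one bad row undoes the whole batch."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany("INSERT INTO t (id, name) VALUES (?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False

def count_rows(conn):
    return conn.execute("SELECT count(*) FROM t").fetchone()[0]

if __name__ == "__main__":
    conn = make_demo_db()
    load_all_or_nothing(conn, [(1, "a"), (2, "b"), (3, "c")])
    # The second batch contains a NULL name, so none of its rows are kept.
    load_all_or_nothing(conn, [(4, "d"), (5, None), (6, "f")])
    print(count_rows(conn))
```

The second batch fails on its middle row, and the row before it is
discarded along with it, which is the trade-off described above.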

COPY also uses its own sub-protocol for transferring data, streaming the
rows in bulk instead of issuing one INSERT per row. For all the
details see:

https://www.postgresql.org/docs/12/protocol-flow.html#PROTOCOL-COPY
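
As a rough sketch of what an export for COPY's default text format might
look like (tab-separated columns, \N for NULL, backslash escapes for
special characters), something like the following could work. The
function names are made up for the example, and the psycopg2 call in the
trailing comment assumes a hypothetical table my_table; it is not
executed here.

```python
import io

def copy_escape(value):
    """Render one column value for COPY's text format."""
    if value is None:
        return "\\N"  # NULL marker in text format
    text = str(value)
    # Escape the backslash first, then the characters COPY treats specially.
    return (text.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n")
                .replace("\r", "\\r"))

def rows_to_copy_text(rows):
    """Serialize rows into a file-like object, one tab-separated line per row."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join(copy_escape(v) for v in row))
        buf.write("\n")
    buf.seek(0)
    return buf

if __name__ == "__main__":
    data = rows_to_copy_text([(1, "plain", None), (2, "tab\there", "x")])
    print(data.getvalue(), end="")
    # Assumption: with psycopg2 and an open cursor cur, the buffer could be
    # streamed to the server in a single statement, e.g.
    #   cur.copy_expert("COPY my_table FROM STDIN", data)
```

The whole buffer then travels to the server as one stream, rather than
as one statement and one round trip per row.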

>
> matthias
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
