Re: COPY from STDIN vs file with large CSVs

From: bricklen <bricklen(at)gmail(dot)com>
To: Wells Oliver <wells(dot)oliver(at)gmail(dot)com>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: COPY from STDIN vs file with large CSVs
Date: 2020-01-08 20:24:17
Message-ID: CAGrpgQ-zFD2aBz_+w4V5UWMH1wGOhjefy7bPz4S-VTcr52dusg@mail.gmail.com
Lists: pgsql-admin

On Wed, Jan 8, 2020 at 8:55 AM Wells Oliver <wells(dot)oliver(at)gmail(dot)com> wrote:

> I have a CSV that's ~30GB. Some 400m rows. Would there be a meaningful
> performance difference to run COPY from STDIN using: cat f.csv | psql "COPY
> .. FROM STDIN WITH CSV" versus just doing "COPY ... FROM 'f.csv' WITH CSV"?
>

If you're looking to speed up the loading - and your disk subsystem is
decent - consider running your CSV through the "split" command to break
it into smaller CSV files. You can then load those in parallel using
multiple psql sessions, as in the sketch below.
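
A minimal sketch of that approach (the database name "mydb", the table
name "mytable", and the chunk size are placeholders; it also assumes
the file has no header row):

  # Split the file into chunks of ~50m lines each.
  # NOTE: split works on raw lines, so this is only safe if no
  # quoted CSV field contains an embedded newline.
  split --lines=50000000 f.csv chunk_

  # Load the chunks in parallel, one psql session per chunk.
  # \copy reads the file client-side, so it does not require
  # server filesystem access the way COPY ... FROM 'file' does.
  for f in chunk_*; do
      psql -d mydb -c "\copy mytable FROM '$f' WITH CSV" &
  done
  wait

Tune the number of concurrent sessions to your core count and I/O
capacity rather than launching all chunks at once.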
