Re: COPY FROM - to avoid WAL generation

From: Jeff Ross <jross(at)openvistas(dot)net>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: COPY FROM - to avoid WAL generation
Date: 2018-08-21 15:26:11
Message-ID: d69dd58e-f27b-5878-f4b3-351a60ae8fd8@openvistas.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 8/21/18 9:00 AM, Ravi Krishna wrote:

> In a recent thread of mine I learned something very interesting. If a
> table is created and data is loaded via COPY FROM within the same
> transaction, then PG will be smart enough to not generate WAL logs
> because all it needs to do is to track the status of the transaction
> and let the data load go to the new data file created for the table. 
> If committed, the table is released for other sessions, if rolledback,
> vaccum will delete the data file later on.
>
> I tested it as follows for a table with 50 milllion rows. No indexes.
>
> Case 1
>   - create the table first.
>   - in a separate transaction load the 50 million rows.
>
> Took 3 min 22 seconds
>
> Case 2
>   - start transaction
>   - create table
>   - load 50 million rows
>   - commit transaction
>
> Took: 3 min 16 seconds.
>
> Am I missing anything?
>
Have you looked into pg_bulkload?

https://github.com/ossc-db/pg_bulkload

Docs are here:

http://ossc-db.github.io/pg_bulkload/index.html

Jeff

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Andres Freund 2018-08-21 15:31:28 Re: COPY FROM - to avoid WAL generation
Previous Message Ravi Krishna 2018-08-21 15:00:03 COPY FROM - to avoid WAL generation