Re: Faster data load

From: Jeff Ross <jross(at)openvistas(dot)net>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Faster data load
Date: 2024-09-05 22:21:09
Message-ID: 640e21b1-a841-40ae-a5eb-e6ee548520a1@openvistas.net
Lists: pgsql-general

On 9/5/24 14:14, Lok P wrote:

> Hi,
>
> We have a requirement to create approximately 50 billion rows in a
> partitioned table (~1 billion rows per partition, 200+ GB daily
> partitions) for a performance test. We are currently using the
> 'insert into <target table_partition> select .. from
> <source_table_partition> or <some transformed query>;' method. We
> dropped all indexes and constraints first and then did the load, but
> it still takes 2-3 hours to populate one partition. Is there a faster
> way to achieve this?
>
> A few teammates suggested using the COPY command with a file load
> instead, which would be faster. So I wanted to understand how it
> differs behind the scenes from an INSERT ... SELECT, given that the
> latter deals only with the SQL engine.
>
> Additionally, when we tried to create indexes on one partition after
> the data load, it took 30+ minutes. Is there any way to make that faster?
>
> Is there any way to run these steps in parallel and utilize the full
> database resources?
>
> It's postgres 15.4
>
> Regards
> Lok
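
To make the COPY route the question asks about concrete, here is a minimal
sketch; the table name, file path, index column, and setting values are
illustrative assumptions, not from the thread, and would need tuning:

```sql
-- Hypothetical names: target_tab_p20240905 and /data/part_20240905.csv
-- are placeholders for illustration only.

-- COPY streams rows through one command rather than going through the
-- full INSERT ... SELECT execution path; with indexes and constraints
-- dropped it is typically the fastest built-in load method.
COPY target_tab_p20240905 FROM '/data/part_20240905.csv' WITH (FORMAT csv);

-- Postgres 15 can parallelize a single CREATE INDEX (B-tree builds).
-- These session settings raise the memory and worker limits for the
-- build; the values are examples to tune, not recommendations.
SET maintenance_work_mem = '4GB';
SET max_parallel_maintenance_workers = 8;
CREATE INDEX ON target_tab_p20240905 (id);
```

Separate partitions can also be loaded and indexed from separate sessions
concurrently, which is the usual way to use the whole machine for this.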

Try pg_bulkload to load the data--it takes a little setup, but it is
very fast.  Do pay attention to the caveats; for a performance test
they probably won't be relevant.

https://github.com/ossc-db/pg_bulkload?tab=readme-ov-file
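
As a rough idea of the setup involved, pg_bulkload is driven by a small
control file and a command-line run. This sketch follows the project's
sample control files; the table, path, and database names are made up,
and the exact option names should be checked against the README for the
version in use:

```
# sample.ctl -- hypothetical control file for pg_bulkload
OUTPUT = public.target_tab_p20240905   # target table
INPUT = /data/part_20240905.csv        # source data file
TYPE = CSV
DELIMITER = ","
WRITER = DIRECT                        # direct load path; this is where
                                       # the documented caveats apply

# Run against the target database, e.g.:
#   pg_bulkload sample.ctl -d mydb
```

The DIRECT writer bypasses much of the normal write path, which is why
it is fast and also why the caveats in the README matter for production
use.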

Jeff
