From: | "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Faster data load |
Date: | 2024-09-08 17:45:39 |
Message-ID: | 20240908174539.teeanjwrthkjm4ti@hjp.at |
Lists: pgsql-general
On 2024-09-06 01:44:00 +0530, Lok P wrote:
> We have a requirement to create approx 50 billion rows in a partitioned
> table (~1 billion rows per partition, 200+ GB daily partitions) for a
> performance test. We are currently using the 'insert into <target table_partition>
> select ... from <source_table_partition> or <some transformed query>;' method.
> We have dropped all indexes and constraints first and are then doing the load.
> Still it's taking 2-3 hours to populate one partition.
That seems quite slow. Is the table very wide or does it have a large
number of indexes?
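For reference, the pattern you describe boils down to something like this
(the partition names here are made up; indexes and constraints on the
target are assumed to have been dropped beforehand):

    -- hypothetical daily partitions; one INSERT ... SELECT per partition
    INSERT INTO target_tab_p2024_09_06
    SELECT * FROM source_tab_p2024_09_06;   -- or some transformed query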
> Is there a faster way to achieve this?
>
> A few teammates are suggesting to use the copy command and load from a
> file instead, which will be faster.
I doubt that.
I benchmarked several strategies for populating tables 5 years ago and
(for my test data and on our hardware at the time - YMMV) a simple
INSERT ... SELECT was more than twice as fast as 8 parallel COPY
operations (and about 8 times as fast as a single COPY).
Details will have changed since then (I should rerun that benchmark on
a current system), but I'd be surprised if COPY became that much faster
relative to INSERT ... SELECT.
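For comparison, a COPY-based load of the same partition would look roughly
like this (file path and format are placeholders, and the data would have
to be exported to that file first); running COPY "in parallel" just means
splitting the input into several files and issuing one such command per
session:

    -- hypothetical pre-exported file, e.g. produced with COPY ... TO
    COPY target_tab_p2024_09_06
    FROM '/data/target_p2024_09_06.csv' WITH (FORMAT csv);

So you pay the extra cost of writing the data out and reading it back in,
which is exactly what INSERT ... SELECT avoids.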
hp
--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"