From: | Ron Johnson <ronljohnsonjr(at)gmail(dot)com> |
---|---|
To: | pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Faster data load |
Date: | 2024-09-05 21:45:12 |
Message-ID: | CANzqJaCxizjVfdiDvxeGyrWUisefQ3y9Z3DVdA5rEU4PC5SLJw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thu, Sep 5, 2024 at 4:14 PM Lok P <loknath(dot)73(at)gmail(dot)com> wrote:
> Hi,
>
> We are having a requirement to create approx 50 billion rows in a
> partition table(~1 billion rows per partition, 200+gb size daily
> partitions) for a performance test. We are currently using ' insert into
> <target table_partition> select.. From <source_table_partition> or <some
> transformed query>;' method . We have dropped all indexes and constraints
> First and then doing the load. Still it's taking 2-3 hours to populate one
> partition.
>
At three hours, that's 92,593 records/second. Seems pretty slow.
How much of that time is taken by <some transformed query>?
How big are the records?
How fast is the hardware?
Is there a faster way to achieve this?
>
Testing is the only way to know for sure.
> Few teammate suggesting to use copy command and use file load instead,
> which will be faster. So I wanted to understand, how different things it
> does behind the scenes as compared to insert as select command? As because
> it only deals with sql engine only.
>
COPY is highly optimized for buffered operation. INSERT... maybe not so
much.
But if the source data is already in a table, that would require piping the
data to stdout and then back into the database.
psql appdb -c "COPY (SELECT ...) TO STDOUT;" | psql appdb -c "COPY
some_table FROM STDOUT;". Use binary mode, so text conversion isn't
required.
Maybe that's faster, maybe not.
Additionally, when we were trying to create indexes post data load on one
> partition, it took 30+ minutes. Any possible way to make it faster?
>
> Is there any way to drive the above things in parallel by utilizing full
> database resources?
>
Put the destination tables in a different tablespace on a different
controller.
> It's postgres 15.4
>
Why not 15.8?
--
Death to America, and butter sauce.
Iraq lobster!
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Ross | 2024-09-05 22:21:09 | Re: Faster data load |
Previous Message | Lok P | 2024-09-05 20:14:00 | Faster data load |