Multiple COPY statements for one table vs one for ~half a billion records

From: Carl L <cllewellyno(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Multiple COPY statements for one table vs one for ~half a billion records
Date: 2024-04-04 18:03:56
Message-ID: CAPtGvF9i5XunrgFUWYrCLnmnD0akdLKBQLdO1qsz9C5nz0m3ZQ@mail.gmail.com
Lists: pgsql-general

Hi there,

I have around half a billion records generated by a backend that is split
into 80 threads (one per core). Each thread performs a COPY from memory
(FROM STDIN BINARY) into Postgres, i.e. 80 COPY statements are running
concurrently against one table. I can see each of the Postgres processes
sitting at around 15% CPU usage.
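
For readers unfamiliar with what each thread sends, here is a minimal sketch of a COPY ... FROM STDIN BINARY payload built in memory. The two-column (int4, text) layout, the function name encode_binary_copy, and the UTF-8 encoding are assumptions for illustration only; the actual table schema is not given in this message.

```python
import struct

# Fixed 11-byte signature that opens every PostgreSQL binary COPY stream.
PGCOPY_SIGNATURE = b"PGCOPY\n\xff\r\n\x00"


def encode_binary_copy(rows):
    """Encode rows of (int4, text) pairs as a binary COPY payload.

    The two-column layout is hypothetical, not the poster's schema.
    """
    out = bytearray()
    out += PGCOPY_SIGNATURE
    out += struct.pack("!i", 0)  # 32-bit flags field: nothing set
    out += struct.pack("!i", 0)  # header extension length: none
    for record_id, label in rows:
        out += struct.pack("!h", 2)  # number of fields in this tuple
        out += struct.pack("!ii", 4, record_id)  # int4: byte length, then value
        if label is None:
            out += struct.pack("!i", -1)  # NULL: length -1 and no data bytes
        else:
            data = label.encode("utf-8")  # assumes a UTF-8 client encoding
            out += struct.pack("!i", len(data)) + data
    out += struct.pack("!h", -1)  # file trailer: a field count of -1
    return bytes(out)


payload = encode_binary_copy([(1, "a"), (2, None)])
```

In a setup like the one described, each worker would stream a payload of this shape over its own connection.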

These are all also in the same transaction - I am the only one connected,
so it's not an issue to hold a big transaction.

I can see that many of the Postgres processes have the wait event "LWLock:
BufferContent", which I assume means that they are waiting for each other
before they can write to the table. Would it therefore be more efficient
to combine all of these into a single COPY statement?
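
One way to quantify the waits described above is to sample pg_stat_activity while the load runs and tally the results. The sketch below is an assumption about how that could be done, not the method used in this message: the filter on COPY queries and the helper name summarize_wait_events are hypothetical, and the rows are expected to come from whatever client library executes the query.

```python
from collections import Counter

# Query to run from a separate session while the load is in progress.
# pg_stat_activity is a standard view; the COPY filter is an assumption.
WAIT_EVENT_QUERY = """
SELECT wait_event_type, wait_event
FROM pg_stat_activity
WHERE backend_type = 'client backend'
  AND query ILIKE 'COPY%'
"""


def summarize_wait_events(rows):
    """Count backends per wait event; a NULL wait_event means not waiting."""
    counts = Counter()
    for wait_event_type, wait_event in rows:
        if wait_event is None:
            counts["running"] += 1
        else:
            counts[f"{wait_event_type}:{wait_event}"] += 1
    return counts.most_common()
```

Sampling this repeatedly gives a rough picture of how many of the 80 backends are blocked on the same lock at a given moment, which is more informative than a single snapshot.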

Thanks!
