Max write throughput for single COPY

From: Joe Wildish <joe(at)lateraljoin(dot)com>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Max write throughput for single COPY
Date: 2023-01-06 21:09:54
Message-ID: 140B5BD1-A9B4-4D4D-B6A9-17185A9BA32C@lateraljoin.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello,

I am trying to speed up the initial logical replication sync process. The database being replicated is dominated by one table that is 750GB (heap). The process quickly boils down to a single COPY writing into the subscriber. We have dropped all indexes and key constraints in the subscriber but we are seeing the sync take >24 hours before we have to kill the process (to avoid so much WAL being reserved on the publisher).

I haven’t done a huge amount of performance tuning at the Linux level before as I’m used to working with cloud-managed installations where you obviously don’t have access to the underlying host. However, in this case, the subscriber instance is not a cloud-managed one.

Can anyone give comment on what might be a reasonable throughput in MB/s for a single COPY operation?

The material I’ve read on I/O talks about saturating the device … I’m pretty sure that a single COPY operation is not capable of doing this. It’s therefore one thing to see the advertised top-line figures about IOPS and throughput, vs what you can actually do with the single COPY. I’d be interested in hearing what other people are able to get as a throughput figure for COPY.

-Joe

Browse pgsql-performance by date

  From Date Subject
Next Message spiral 2023-01-08 12:02:01 Advice on best way to store a large amount of data in postgresql
Previous Message MichaelDBA 2023-01-05 11:46:49 Re: How to analyze of short but heavy intermittent slowdown on BIND on production database (or BIND vs log_lock_waits)