Re: pg_dump performance issues

From: Dominique Devienne <ddevienne(at)gmail(dot)com>
To: Francisco Olarte <folarte(at)peoplecall(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: pg_dump performance issues
Date: 2024-02-22 14:40:23
Message-ID: CAFCRh-8SdPF8fY417WyJ_cmtOKM=3AhaaQSqcwhztu3+N-qtuw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, Feb 22, 2024 at 12:20 PM Dominique Devienne <ddevienne(at)gmail(dot)com>
wrote:

> On Thu, Feb 22, 2024 at 11:41 AM Francisco Olarte <folarte(at)peoplecall(dot)com>
> wrote:
>
>> Dominique:
>> > .... in the 10-12MB/s throughput range.....
>>
>> This has the faint smell of a saturated 100Mbps link in the middle
>> (12*8=96Mbps)
>>
>
> So indeed, I'm not on 1Gb/s as I expected... I'll report back when this is
> fixed.
>

So IT fixing the networking on my Windows machine is slower than I
expected...

Thus I switched to a Linux client, which I verified using iperf3 really has
1Gbps.
And there I can read at 100MB/s, which is almost saturating the network
link.
The Linux dumps are 45MB this time, instead of 100MB on Windows, as if the
former is compressed, and the latter isn't.
Perhaps the (implicit?) compression of the dump is why it's 8x slower than
mine? (I do fewer catalog queries I suspect)
And I see no speedup from parallelizing, unfortunately, which I find
strange, as mentioned earlier. Any idea why?
Note that this is against the v12 server, and the client is still 16.1.

Applying a two-threaded producer-consumer scheme to my dump, to parallelize
reads and writes, I might approach 100MB/s overall.
Before getting into parallelizing reads, which complicate matters, and my
dump container's API is serial anyway (it's SQLite...)

So looks like I'm networking bound on 1Gbps, while Marc perhaps reached
PostgreSQL's (or the disks'?) limits before saturating his 10Gbps
networking.

Thanks, --DD

[ddevienne]$ my_dump ...
COPY: 25x ( 88,073 rows, 56,768,104 bytes) in 0.537s (100.7
MB/s)
DUMP'd 88,023 rows (56,761,583 bytes) from 25 tables in 1.062s (51.0 MB/s)

[ddevienne]$ ll ....db
-rw-r--r-- 1 ddevienne users 58,351,616 Feb 22 15:10 ....db

[ddevienne]$ time $PDGM_POSTGRESQL_DIR/bin/pg_dump --role="..." -Fc -n
'"..."' -f ...-Fc.dump "host=acme dbname=copyperf"
7.561u 0.216s 0:08.46 91.8% 0+0k 0+91056io 0pf+0w
[ddevienne]$ time $PDGM_POSTGRESQL_DIR/bin/pg_dump --role="..." -Fd -n
'"..."' -f ...-Fd.dump "host=acme dbname=copyperf"
7.351u 0.219s 0:08.33 90.7% 0+0k 0+90808io 0pf+0w
[ddevienne]$ time $PDGM_POSTGRESQL_DIR/bin/pg_dump --role="..." -Fd -j2 -n
'"..."' -f ...-Fdj2.dump "host=acme dbname=copyperf"
7.667u 0.228s 0:08.56 92.0% 0+0k 0+90808io 0pf+0w
[ddevienne]$ time $PDGM_POSTGRESQL_DIR/bin/pg_dump --role="..." -Fd -j3 -n
'"..."' -f ...-Fdj3.dump "host=acme dbname=copyperf"
7.964u 0.247s 0:08.71 94.1% 0+0k 0+90808io 0pf+0w
[ddevienne]$ time $PDGM_POSTGRESQL_DIR/bin/pg_dump --role="..." -Fd -j4 -n
'"..."' -f ...-Fdj4.dump "host=acme dbname=copyperf"
7.491u 0.243s 0:08.21 94.1% 0+0k 0+90808io 0pf+0w
[ddevienne]$ du -sm ...-F*
45 ...-Fc.dump
45 ...-Fd.dump
45 ...-Fdj2.dump
45 ...-Fdj3.dump
45 ...-Fdj4.dump

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Greg Sabino Mullane 2024-02-22 14:53:18 Re: Postgresql assistance needed
Previous Message Moreno Andreo 2024-02-22 14:14:53 Partial table duplication via triggger