From: | Dimitrios Apostolou <jimis(at)gmx(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: parallel pg_restore blocks on heavy random read I/O on all children processes |
Date: | 2025-03-20 19:57:45 |
Message-ID: | e11dd37a-3409-175e-040d-c773838b7934@gmx.net |
Lists: | pgsql-performance |
On Thu, 20 Mar 2025, Tom Lane wrote:
> I am betting that the problem is that the dump's TOC (table of
> contents) lacks offsets to the actual data of the database objects,
> and thus the readers have to reconstruct that information by scanning
> the dump file. Normally, pg_dump will back-fill offset data in the
> TOC at completion of the dump, but if it's told to write to an
> un-seekable output file then it cannot do that.
Thanks Tom, this makes sense! As you noticed, I'm piping the output, and
this was a conscious choice.
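To illustrate the point about un-seekable output, here is a minimal sketch, not pg_dump's actual code, of why a writer cannot go back and patch offsets into a header once its output is a pipe. The helper name `can_backfill_offsets` is hypothetical, and the behaviour shown assumes a POSIX system, where seeking on a pipe fails.

```python
import os
import tempfile


def can_backfill_offsets(fd: int) -> bool:
    """Return True if the descriptor supports seeking.

    A writer that wants to rewrite an earlier part of its output
    (for example a table of contents) needs a seekable descriptor.
    Hypothetical helper, for illustration only.
    """
    try:
        os.lseek(fd, 0, os.SEEK_CUR)
        return True
    except OSError:
        # On POSIX, seeking on a pipe fails (ESPIPE).
        return False


# Case 1: a regular file, which can be rewound and patched.
with tempfile.TemporaryFile() as f:
    file_seekable = can_backfill_offsets(f.fileno())

# Case 2: the write end of a pipe, as when the dump is piped onward.
read_end, write_end = os.pipe()
try:
    pipe_seekable = can_backfill_offsets(write_end)
finally:
    os.close(read_end)
    os.close(write_end)

print("regular file seekable:", file_seekable)
print("pipe seekable:", pipe_seekable)
```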
> I don't see an easy way, and certainly no way that wouldn't involve
> redefining the archive format. Can you write the dump to a local
> file rather than piping it immediately?
Unfortunately I don't have enough space for that. I'm still testing, but
the way this is designed to work is to take an uncompressed pg_dump
(unlike the one above, which was compressed for testing purposes) and send
it to a backup server that has its own deduplication and compression.
Further questions:
* Does the same happen in an uncompressed dump? Or maybe the offsets are
pre-filled because they are predictable without compression?
* Should pg_dump print a warning when generating a lower-quality format?
* The seeking pattern in pg_restore seems nonsensical to me: it reads 4K,
jumps 8-12K, and repeats for the whole file, consuming 15K IOPS for an
hour. /Maybe/ there is something to improve there... Where can I read
more about the format?
* Why doesn't it happen in single-process pg_restore?
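For scale, a back-of-the-envelope sketch of the access pattern described above. It assumes the 15K IOPS rate is sustained for the full hour and that every request is a 4K read followed by a skip of 8K to 12K; these are rough figures from observation, not measurements of pg_restore internals.

```python
KIB = 1024
GIB = 1024 ** 3

read_size = 4 * KIB        # bytes fetched per request
iops = 15_000              # observed request rate
duration_s = 60 * 60       # one hour

reads = iops * duration_s
bytes_read = reads * read_size

# Each cycle advances by the read plus the skipped gap (8K or 12K).
span_low = reads * (read_size + 8 * KIB)
span_high = reads * (read_size + 12 * KIB)

print(f"requests issued:   {reads:,}")
print(f"data actually read: {bytes_read / GIB:.1f} GiB")
print(f"file span covered:  {span_low / GIB:.1f} to {span_high / GIB:.1f} GiB")
print(f"fraction of span read: {bytes_read / span_high:.2f} to {bytes_read / span_low:.2f}")
```

Under these assumptions, only a quarter to a third of the bytes in the scanned region are read, yet each read costs a separate random I/O.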
Thank you!
Dimitris