Re: BUG #17846: pg_dump doesn't properly dump with paused WAL replay

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Francisco Reinolds <francisco(dot)reinolds(at)channable(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17846: pg_dump doesn't properly dump with paused WAL replay
Date: 2023-03-20 15:39:44
Message-ID: 941275.1679326784@sss.pgh.pa.us
Lists: pgsql-bugs

[ please keep the mailing list cc'd ]

Francisco Reinolds <francisco(dot)reinolds(at)channable(dot)com> writes:
> On 16-03-2023 16:10, Tom Lane wrote:
>> I really have no idea what's going on there, but can you show the exact
>> pg_dump command(s) being issued? I'm particularly curious whether you
>> are using parallel dump. The same for the failing pg_restore.

> Of course:

> - pg_dump: pg_dump --port 5432 --host localhost --verbose
> --format=directory --jobs=8 --file=<random_directory> --dbname=<dbname>
> - pg_restore: pg_restore --exit-on-error --cluster 13/<cluster_name>
> --dbname=<dbname> --port <port> --format=directory --jobs=8
> --use-list=/tmp/tmpsote5wvm --clean --if-exists <random directory>

Hmm, so the fact that the dump is being done in parallel is very likely
relevant. Perhaps parallelism on the restore is also relevant, not
sure. Can you try running each of those steps not-parallel to see
if the problem goes away?
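
For reference, a minimal sketch of what the serial variants might look
like: the quoted commands with --jobs=8 dropped and every other option
left alone. The dump directory, database name, restore port and cluster
name are elided in the report, so the shell variables and their defaults
below are hypothetical placeholders. The script only prints the commands
for review; it does not run them.

```shell
#!/bin/sh
# Hypothetical placeholders: the report elides these values, so replace
# the defaults before using the printed commands.
DUMP_DIR="${DUMP_DIR:-/tmp/serial_dump}"
DBNAME="${DBNAME:-mydb}"
RESTORE_PORT="${RESTORE_PORT:-5433}"
CLUSTER="${CLUSTER:-13/main}"
# Restore list path as quoted in the report.
USE_LIST="${USE_LIST:-/tmp/tmpsote5wvm}"

# Print the dump command from the report without --jobs, i.e. a serial dump.
serial_dump_cmd() {
    printf '%s\n' "pg_dump --port 5432 --host localhost --verbose --format=directory --file=$DUMP_DIR --dbname=$DBNAME"
}

# Print the restore command from the report without --jobs.
serial_restore_cmd() {
    printf '%s\n' "pg_restore --exit-on-error --cluster $CLUSTER --dbname=$DBNAME --port $RESTORE_PORT --format=directory --use-list=$USE_LIST --clean --if-exists $DUMP_DIR"
}

serial_dump_cmd
serial_restore_cmd
```

Changing one step at a time (serial dump with parallel restore, then the
reverse) should show which side the problem depends on, if either.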

I'm also slightly troubled by the --use-list option, and am wondering
if faulty creation of the restore list could be a contributing
factor. The error looks like missing data row(s) not missing schema
objects; but perhaps if the problematic table(s) are partitioned
then one could lead to the other? Could we see the DDL definition
for the problematic table(s)?
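
One way to check the restore list, sketched under the assumption that it
was derived from "pg_restore --list" output with the kept lines left
verbatim: compare it against the archive's full table of contents and
report any TABLE DATA entries that are not active in it. The function
name and file arguments are hypothetical.

```shell
#!/bin/sh
# Print every TABLE DATA entry that appears in the full TOC ($1) but has no
# identical active line in the restore list ($2). Entries that were deleted
# from the list, or commented out with a leading ';', are reported.
missing_table_data() {
    grep ' TABLE DATA ' "$1" | grep -vxF -f "$2"
}

# Usage (not executed here; paths are placeholders):
#   pg_restore --list "$DUMP_DIR" > /tmp/full.toc
#   missing_table_data /tmp/full.toc /tmp/tmpsote5wvm
```

Empty output means every table's data entry is still selected, which
would point away from the list as the cause.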

>> Also, are all the moving parts (primary server, secondary server,
>> pg_dump, pg_restore) exactly the same PG version?

> So, the version of both the primary and the secondary servers match, 13.8,
> but the server of the instance where we run the backup verifications does
> not, it's currently sitting at 13.6

Hmm. With some unsupported assumptions about your schema, I could
believe that some of the 13.9 bug fixes are relevant, particularly

    * Fix construction of per-partition foreign key constraints while doing
      ALTER TABLE ATTACH PARTITION (Jehan-Guillaume de Rorthais, Álvaro
      Herrera)

      Previously, incorrect or duplicate constraints could be constructed
      for the newly-added partition.

regards, tom lane
