Re: 039_end_of_wal: error in "xl_tot_len zero" test

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Anton Voloshin <a(dot)voloshin(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 039_end_of_wal: error in "xl_tot_len zero" test
Date: 2024-08-29 05:41:36
Message-ID: CA+hUKGKLexHVHm0ZcN6NpKnGAsGxOczca-MGu=GGVdNjsRCsRQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Aug 24, 2024 at 10:43 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> On Sat, Aug 24, 2024 at 10:33 AM Nathan Bossart
> <nathandbossart(at)gmail(dot)com> wrote:
> > I am seeing the exact problem described in this thread on my laptop since
> > commit 490f869. I have yet to do a thorough investigation, but what I've
> > seen thus far does seem to fit the subtle-differences-in-generated-WAL
> > theory. If no one is planning to pick up the fix soon, I will try to.
>
> Sorry for dropping that. It looks like we know approximately how to
> stabilise it, and I'll look at it early next week if you don't beat me
> to it, but please feel free if you would like to.

It fails reliably if you nail down the initial conditions like this:

 $TLI = $node->safe_psql('postgres',
 	"SELECT timeline_id FROM pg_control_checkpoint();");
 
+$node->safe_psql('postgres', "SELECT pg_switch_wal();");
+emit_message($node, 7956);
+
 my $end_lsn;
 my $prev_lsn;

The fix I propose to commit shortly is just the first of those new
lines, to homogenise the initial state. See attached. The previous
idea works too, I think, but this bigger hammer is more obviously
removing variation.
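
As a rough illustration of why the starting position matters, the sketch below converts a textual LSN into its byte offsets within a WAL segment and a WAL page. It is not part of the patch: the helper names (lsn_to_bytes, segment_offset, page_offset, bytes_left_in_page) are hypothetical, the sample LSNs are made up, and the 8192-byte page and 16MB segment sizes are assumed build defaults.

```perl
use strict;
use warnings;

# Assumed defaults; both are configurable for a real build or cluster.
my $WAL_PAGE_SIZE    = 8192;                # XLOG_BLCKSZ
my $WAL_SEGMENT_SIZE = 16 * 1024 * 1024;    # wal_segment_size

# Hypothetical helper: turn an LSN such as "0/3000028" into a byte
# position.  The part before the slash is the high 32 bits, the part
# after it the low 32 bits (needs a Perl with 64-bit integers).
sub lsn_to_bytes
{
	my ($lsn) = @_;
	my ($hi, $lo) = split m{/}, $lsn;
	return (hex($hi) << 32) | hex($lo);
}

# Hypothetical helper: offset of the LSN within its WAL segment.
sub segment_offset
{
	my ($lsn) = @_;
	return lsn_to_bytes($lsn) % $WAL_SEGMENT_SIZE;
}

# Hypothetical helper: offset of the LSN within its WAL page.
sub page_offset
{
	my ($lsn) = @_;
	return lsn_to_bytes($lsn) % $WAL_PAGE_SIZE;
}

# Hypothetical helper: bytes remaining before the next page boundary.
sub bytes_left_in_page
{
	my ($lsn) = @_;
	return $WAL_PAGE_SIZE - page_offset($lsn);
}

# Made-up sample positions, only to exercise the arithmetic.
foreach my $lsn ("0/3000028", "0/2001F00")
{
	printf "%s: segment offset %d, page offset %d, %d bytes left in page\n",
	  $lsn, segment_offset($lsn), page_offset($lsn),
	  bytes_left_in_page($lsn);
}
```

If the test always begins just after a segment switch, these offsets should be the same on every run, which is presumably why the single pg_switch_wal() call is enough to remove the variation.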

Attachment: 0001-Stabilize-039_end_of_wal-test.patch (text/x-patch, 1.3 KB)
