From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Unstable tests for recovery conflict handling |
Date: | 2022-07-26 18:16:11 |
Message-ID: | 20220726181611.4xw3blxigqzsz4d4@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-committers pgsql-hackers |
Hi,
On 2022-07-26 13:57:53 -0400, Tom Lane wrote:
> I happened to notice that while skink continues to fail off-and-on
> in 031_recovery_conflict.pl, the symptoms have changed! What
> we're getting now typically looks like [1]:
>
> [10:45:11.475](0.023s) ok 14 - startup deadlock: lock acquisition is waiting
> Waiting for replication conn standby's replay_lsn to pass 0/33FB8B0 on primary
> done
> timed out waiting for match: (?^:User transaction caused buffer deadlock with recovery.) at t/031_recovery_conflict.pl line 367.
>
> where absolutely nothing happens in the standby log, until we time out:
>
> 2022-07-24 10:45:11.452 UTC [1468367][client backend][2/4:0] LOG: statement: SELECT * FROM test_recovery_conflict_table2;
> 2022-07-24 10:45:11.472 UTC [1468547][client backend][3/2:0] LOG: statement: SELECT 'waiting' FROM pg_locks WHERE locktype = 'relation' AND NOT granted;
> 2022-07-24 10:48:15.860 UTC [1468362][walreceiver][:0] FATAL: could not receive data from WAL stream: server closed the connection unexpectedly
>
> So this is not a case of RecoveryConflictInterrupt doing the wrong thing:
> the startup process hasn't detected the buffer conflict in the first
> place.
I wonder if this, at least partially, could be be due to the elog thing
I was complaining about nearby. I.e. we decide to FATAL as part of a
recovery conflict interrupt, and then during that ERROR out as part of
another recovery conflict interrupt (because nothing holds interrupts as
part of FATAL).
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2022-07-26 18:30:30 | Re: Unstable tests for recovery conflict handling |
Previous Message | Tom Lane | 2022-07-26 17:57:53 | Re: Unstable tests for recovery conflict handling |
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2022-07-26 18:19:06 | Re: optimize lookups in snapshot [sub]xip arrays |
Previous Message | vignesh C | 2022-07-26 18:13:53 | Re: Handle infinite recursion in logical replication setup |