From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Alexander Lakhin <exclusion(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Why is src/test/modules/committs/t/002_standby.pl flaky? |
Date: | 2022-02-13 01:53:10 |
Message-ID: | 20220213015310.rjipagnmt2x7mmqs@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2022-02-12 11:47:20 -0500, Tom Lane wrote:
> Alexander Lakhin <exclusion(at)gmail(dot)com> writes:
> > 11.02.2022 05:22, Andres Freund wrote:
> >> Over in another thread I made some wild unsubstantiated guesses that the
> >> windows issues could have been made much more likely by a somewhat odd bit of
> >> code in PQisBusy():
> >> https://postgr.es/m/1959196.1644544971%40sss.pgh.pa.us
> >> Alexander, any chance you'd try if that changes the likelihood of the problem
> >> occurring, without any other fixes / reverts applied?
>
> > Unfortunately I haven't seen an improvement for the test in question.
Thanks for testing!
> Yeah, that's what I expected, sadly. While I think this PQisBusy behavior
> is definitely a bug, it will not lead to an infinite loop, just to write
> failures being reported in a less convenient fashion than intended.
FWIW, I didn't think it'd end up looping indefinitely, but that there's a
chance it could end up waiting indefinitely. The WaitLatchOrSocket() call
doesn't have a timeout, and if I understand the Windows FD_CLOSE behavior
correctly, you're not guaranteed to get an event from WaitForMultipleObjects()
if FD_CLOSE was already consumed and there isn't any data to read.
ISTM that it's not a great idea for libpqrcv_receive() to do blocking IO at
all. The caller expects it to not block...
> I wonder whether it would help to put a PQconsumeInput call *before*
> the PQisBusy loop, so that any pre-existing EOF condition will be
> detected. If you don't like duplicating code, we could restructure
> the loop as
That does look a bit saner. Even leaving EOF and Windows issues aside, it
seems weird to do a WaitLatchOrSocket() without first having tried to read
more data.
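
To illustrate why the ordering matters, here is a toy model of the two loop shapes. It is only a sketch under stated assumptions: MockConn and the mock_* functions are hypothetical stand-ins, not libpq's PGconn, PQconsumeInput(), PQisBusy() or WaitLatchOrSocket(), and the "close event already consumed" state simply encodes the FD_CLOSE scenario as described above rather than real Winsock behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a connection; NOT libpq's PGconn. */
typedef struct MockConn
{
    bool eof_pending;     /* peer closed; the close event was already consumed */
    bool data_arrived;    /* unread data that completes a result */
    bool event_signaled;  /* would a wait on the socket wake up? */
    bool result_ready;    /* a complete result has been buffered */
    bool failed;          /* EOF has been noticed by a read attempt */
} MockConn;

/* Models a read attempt: it notices EOF even when no event is pending. */
static bool mock_consume_input(MockConn *conn)
{
    assert(conn != NULL);
    if (conn->data_arrived)
    {
        conn->data_arrived = false;
        conn->result_ready = true;
    }
    else if (conn->eof_pending)
    {
        conn->failed = true;
        return false;
    }
    return true;
}

/* Models the "still busy" check: busy until a result or a failure is seen. */
static bool mock_is_busy(const MockConn *conn)
{
    return !conn->result_ready && !conn->failed;
}

/* Models a wait without timeout; false means "would block forever". */
static bool mock_wait(MockConn *conn)
{
    if (!conn->event_signaled)
        return false;
    conn->event_signaled = false;
    return true;
}

typedef enum { LOOP_DONE, LOOP_FAILED, LOOP_STUCK } LoopOutcome;

/* Existing shape: wait first, read only after the wait returns. */
static LoopOutcome wait_then_consume(MockConn *conn)
{
    while (mock_is_busy(conn))
    {
        if (!mock_wait(conn))
            return LOOP_STUCK;
        if (!mock_consume_input(conn))
            return LOOP_FAILED;
    }
    return LOOP_DONE;
}

/* Proposed shape: try to read before deciding whether to wait. */
static LoopOutcome consume_then_wait(MockConn *conn)
{
    for (;;)
    {
        if (!mock_consume_input(conn))
            return LOOP_FAILED;
        if (!mock_is_busy(conn))
            return LOOP_DONE;
        if (!mock_wait(conn))
            return LOOP_STUCK;
    }
}
```

In this model, a connection with eof_pending set and no event signaled gets stuck in wait_then_consume() but reports a failure from consume_then_wait(), while both shapes complete normally when data and an event are present.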
Greetings,
Andres Freund