From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, mikael(dot)kjellstrom(at)gmail(dot)com |
Subject: | Re: Race condition in crash-recovery tests |
Date: | 2019-01-27 02:29:37 |
Message-ID: | 20190127022937.nvocrvsok7nlp4vt@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2019-01-26 20:53:48 -0500, Tom Lane wrote:
> Recently, buildfarm member curculio has started to show a semi-repeatable
> failure in src/test/recovery/t/013_crash_restart.pl:
>
> # aborting wait: program died
> # stream contents: >>psql:<stdin>:8: no connection to the server
> # psql:<stdin>:8: connection to server was lost
> # <<
> # pattern searched for: (?^m:server closed the connection unexpectedly)
>
> # Failed test 'psql query died successfully after SIGKILL'
> # at t/013_crash_restart.pl line 198.
>
> The message this test is looking for is what libpq reports upon getting
> EOF or ECONNRESET from a socket read attempt. The message it's actually
> seeing is what libpq reports if it notices that the PQconn is *already*
> in CONNECTION_BAD state when it's trying to send a new query.
>
> I have no idea why we're seeing this in only one buildfarm member
> and only for the past week or so, as it doesn't appear that any
> related code has changed for months. (Perhaps something changed
> about curculio's host?)
I have no idea why it's just curculio, but I think I know why it only
started recently: curculio doesn't appear to have had TAP tests enabled
before
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=curculio&dt=2019-01-17%2021%3A30%3A02
> just change the test script to accept either message as a successful
> result. I think that 4247db625 made such races more likely, but I
> don't believe it was impossible before.
Sounds right to me - do you want to do the honors or shall I?
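To illustrate what "accept either message" could look like, here is a minimal sketch. It is written in Python rather than the Perl of the actual test, and it is not the change that would go into 013_crash_restart.pl. The names `DISCONNECT_PATTERN` and `query_died` are hypothetical; the message strings are the ones quoted in this thread.

```python
import re

# Hypothetical pattern: matches either libpq message discussed above.
#  - "server closed the connection unexpectedly": EOF/ECONNRESET on a read
#  - "connection to server was lost": connection already bad when sending
DISCONNECT_PATTERN = re.compile(
    r"server closed the connection unexpectedly"
    r"|connection to server was lost",
    re.MULTILINE,
)


def query_died(stderr_text: str) -> bool:
    """Return True if the captured stderr contains either disconnect message."""
    return DISCONNECT_PATTERN.search(stderr_text) is not None


# The stream contents reported in the curculio failure.
already_bad = (
    "psql:<stdin>:8: no connection to the server\n"
    "psql:<stdin>:8: connection to server was lost\n"
)

# Illustrative input containing the message the test originally searched for.
read_failed = "server closed the connection unexpectedly\n"

if __name__ == "__main__":
    for sample in (already_bad, read_failed):
        print(query_died(sample))
```

With an alternation like this, whichever of the two code paths libpq happens to take, the check treats the query as having died as expected.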
> Another idea is to change libpq so that both these cases emit identical
> messages, but I don't really feel that that'd be an improvement. Also,
> since 4247db625 was back-patched, we'd have to back-patch the message
> change as well, which I like even less. People might be relying on
> seeing either message spelling in some situations.
Yea, I don't think that's the way to go.
Greetings,
Andres Freund