Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Subscription tests fail under CLOBBER_CACHE_ALWAYS
Date: 2021-05-20 00:02:14
Message-ID: YKWnBsH0uHMe6Ix+@paquier.xyz
Lists: pgsql-hackers

On Wed, May 19, 2021 at 02:36:03PM -0400, Andrew Dunstan wrote:
> Yeah, this area needs substantial improvement. I have seen similar sorts
> of nasty hangs, where the script is waiting forever for some process
> that hasn't got the shutdown message. At least we probably need some way
> of making sure the END handler doesn't abort early. Maybe
> PostgresNode::stop() needs a mode that handles failure more gracefully.
> Maybe it needs to try shutting down all the nodes and only calling
> BAIL_OUT after trying all of them and getting a failure. But that might
> still leave us work to do on failures occurring pre-END.

For that, we could just make the END block call run_log() directly
as well, as this catches stderr and an error code.  What about making
the shutdown two-phase, by the way?  Trigger an immediate stop, and
if it fails, fall back to an extra kill9() to be on the safe side.
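
To make the two-phase idea a bit more concrete, here is a rough sketch
of the control flow.  It is written in Python rather than the Perl of
PostgresNode.pm, purely for illustration, and it assumes a POSIX
system.  The names stop_two_phase() and stop_all() are hypothetical,
and the run_log() below is only a stand-in that captures stderr and
the exit code; it is not the TAP helper of the same name.

```python
import os
import signal
import subprocess


def run_log(cmd):
    """Stand-in helper: run cmd, returning (exit code, stderr) instead of raising."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, text=True)
    return proc.returncode, proc.stderr


def stop_two_phase(stop_cmd, pid, run=run_log, kill=os.kill):
    """Phase 1: attempt an immediate stop.  Phase 2: SIGKILL if that failed."""
    code, _err = run(stop_cmd)
    if code == 0:
        return "stopped"
    try:
        kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        # The process exited between the failed stop and the kill.
        return "already gone"
    return "killed"


def stop_all(nodes, run=run_log, kill=os.kill):
    """Try every node before reporting, so one failure does not skip the rest.

    nodes maps a node name to a (stop command, postmaster pid) pair.
    """
    return {name: stop_two_phase(cmd, pid, run, kill)
            for name, (cmd, pid) in nodes.items()}
```

A caller would pass something like ["pg_ctl", "-D", datadir, "-m",
"immediate", "stop"] as the stop command, and could decide to bail out
only after stop_all() has returned and any "killed" entries have been
reported.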

Have you seen this being a problem even in cases where the tests all
passed? If yes, it may be worth using the more aggressive flow even
in the case where the tests pass.
--
Michael
