
Re: Race condition in recovery?

From:	Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, hlinnaka <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Race condition in recovery?
Date:	2021-06-08 08:47:06
Message-ID:	CAFiTN-vWAseUExK=j-pBK2wR1phHOQ_Uc=0HND=p5SdNT+WC9w@mail.gmail.com
Lists:	pgsql-hackers

On Tue, Jun 8, 2021 at 11:13 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> # Wait until the node exits recovery.
> $standby->poll_query_until('postgres', "SELECT pg_is_in_recovery() = 'f';")
> or die "Timed out while waiting for promotion";
>
> I will try to generate a version for 9.6 based on this idea and see how it goes

I have adapted the test for 9.6, but I am seeing a crash (both with
and without the fix). I could not figure out the reason. No core file
was generated, even though I changed pg_ctl in PostgresNode.pm to use
"-c" so that core dumps are allowed.
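
As a side note for anyone reproducing this: pg_ctl's "-c" option only
lifts the soft resource limit on core files, so a hard limit of 0, or a
Linux core_pattern that pipes cores to a handler, can still leave no
core file in the data directory. A minimal sketch for checking both,
assuming a Linux host and a POSIX shell; core_dump_status is a
hypothetical helper name and is not part of the test suite:

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): report the settings that
# commonly prevent a core file from appearing even with "pg_ctl -c".
core_dump_status() {
    # "pg_ctl -c" can raise the soft limit only as far as the hard limit.
    echo "soft limit: $(ulimit -S -c)"
    echo "hard limit: $(ulimit -H -c)"
    # On Linux, a pattern starting with '|' hands the core to a program
    # instead of writing a file in the crashing process's directory.
    if [ -r /proc/sys/kernel/core_pattern ]; then
        echo "core_pattern: $(cat /proc/sys/kernel/core_pattern)"
    else
        echo "core_pattern: unavailable"
    fi
}

core_dump_status
```

Run it in the same shell environment that launches the TAP test, since
the limits are inherited by the postmaster from that shell.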

This is the log from the cascading node (025_stuck_on_old_timeline_cascade.log):
-------------
cp: cannot stat
‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_52dW/archives/000000010000000000000003’:
No such file or directory
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
FATAL: could not receive database system identifier and timeline ID
from the primary server: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
--------------

The attached logs are from a run without the fix.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment	Content-Type	Size
9.6-v6-0001-Fix-corner-case-failure-of-new-standby-to-follow-.patch	text/x-patch	6.7 KB
025_stuck_on_old_timeline_cascade.log	text/x-log	3.2 KB
025_stuck_on_old_timeline_primary.log	text/x-log	708 bytes
025_stuck_on_old_timeline_standby.log	text/x-log	1.8 KB
regress_log_025_stuck_on_old_timeline	application/octet-stream	6.1 KB

In response to

Re: Race condition in recovery? at 2021-06-08 05:43:58 from Dilip Kumar

Responses

Re: Race condition in recovery? at 2021-06-08 16:26:02 from Robert Haas
