| From: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
| Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, hlinnaka <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Race condition in recovery? |
| Date: | 2021-06-08 08:47:06 |
| Message-ID: | CAFiTN-vWAseUExK=j-pBK2wR1phHOQ_Uc=0HND=p5SdNT+WC9w@mail.gmail.com |
| Lists: | pgsql-hackers |
On Tue, Jun 8, 2021 at 11:13 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> # Wait until the node exits recovery.
> $standby->poll_query_until('postgres', "SELECT pg_is_in_recovery() = 'f';")
> or die "Timed out while waiting for promotion";
>
> I will try to generate a version for 9.6 based on this idea and see how it goes
I have adapted the test for 9.6, but I am seeing a crash (both with
and without the fix). I could not figure out the reason: no core dump
was generated, even though I changed pg_ctl in PostgresNode.pm to use
"-c" so that it could produce core files.
This is the log from the cascading node (025_stuck_on_old_timeline_cascade.log):
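For anyone trying to reproduce this, here is a minimal sketch of checks that might explain a missing core file. It assumes a Linux host and a POSIX-like shell; the variable names are only illustrative. As far as I know, "pg_ctl -c" only lifts the soft core-size limit up to the hard limit, so a hard limit of 0 would still suppress core files, and a core_pattern beginning with "|" hands the dump to an external handler instead of writing it to the data directory.

```shell
#!/bin/sh
# Soft and hard core-size limits of the shell that launches the test.
# pg_ctl -c can raise the soft limit, but not beyond the hard limit.
soft_limit=$(ulimit -S -c)
hard_limit=$(ulimit -H -c)
echo "core soft limit: $soft_limit"
echo "core hard limit: $hard_limit"

# On Linux, core_pattern controls where the kernel writes core files.
# A value starting with '|' pipes the dump to a helper program.
if [ -r /proc/sys/kernel/core_pattern ]; then
    core_pattern=$(cat /proc/sys/kernel/core_pattern)
else
    core_pattern="unknown"
fi
echo "core_pattern: $core_pattern"
```

If the hard limit is 0 or the pattern is piped, the core file would not appear under tmp_check even with "-c" in effect.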
-------------
cp: cannot stat ‘/home/dilipkumar/work/PG/postgresql/src/test/recovery/tmp_check/data_primary_52dW/archives/000000010000000000000003’: No such file or directory
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
FATAL: could not receive database system identifier and timeline ID from the primary server: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
--------------
The attached logs are from a run without the fix.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
| Attachment | Content-Type | Size |
|---|---|---|
| 9.6-v6-0001-Fix-corner-case-failure-of-new-standby-to-follow-.patch | text/x-patch | 6.7 KB |
| 025_stuck_on_old_timeline_cascade.log | text/x-log | 3.2 KB |
| 025_stuck_on_old_timeline_primary.log | text/x-log | 708 bytes |
| 025_stuck_on_old_timeline_standby.log | text/x-log | 1.8 KB |
| regress_log_025_stuck_on_old_timeline | application/octet-stream | 6.1 KB |