Buildfarm failure on tamandua - "timed out waiting for subscriber to synchronize data"

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Buildfarm failure on tamandua - "timed out waiting for subscriber to synchronize data"
Date: 2024-03-21 09:47:12
Message-ID: OS0PR01MB5716DB29841070E0DABCB9E594322@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Hi,

There is a failure in 040_standby_failover_slots_sync on tamandua:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-03-21%2008%3A28%3A58

The timeout occurs because the table sync worker for the new table is not started
after executing ALTER SUBSCRIPTION REFRESH PUBLICATION.
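
To illustrate how a missing table sync worker turns into this kind of timeout, here is a minimal Python sketch of a poll-with-deadline loop. It is not the buildfarm's actual test code: the function name wait_for_sync, its parameters, and the is_synced callback are hypothetical, and the query in SYNC_QUERY is only an assumption about what such a check might run against pg_subscription_rel. If no worker ever moves the new table's row to a finished state, the check never succeeds and the loop gives up at the deadline.

```python
import time
from typing import Callable

# Assumed, illustrative check: every subscribed relation has reached
# srsubstate 'r' (ready) or 's' (synchronized) in pg_subscription_rel.
SYNC_QUERY = (
    "SELECT count(*) = 0 FROM pg_subscription_rel "
    "WHERE srsubstate NOT IN ('r', 's');"
)


def wait_for_sync(
    is_synced: Callable[[], bool],
    timeout_s: float = 180.0,
    interval_s: float = 0.1,
) -> bool:
    """Poll is_synced() until it returns True or timeout_s elapses.

    is_synced is a hypothetical callback; in practice it would run
    something like SYNC_QUERY on the subscriber and return the result.
    Returns True on success and False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_synced():
            return True
        if time.monotonic() >= deadline:
            # Nothing advanced the relation state in time; a caller would
            # report a synchronization timeout at this point.
            return False
        time.sleep(interval_s)
```

A caller would pass a function that runs the check on the subscriber node. With a table sync worker that never starts, is_synced keeps returning False and the function returns False once the deadline passes.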

We have seen similar failures in other tests as well[1]. AFAICS, they all have
the same cause: a race condition in the logical replication apply worker. The
analysis has been posted on another thread[2], and the fix is also being
reviewed.

[1] https://www.postgresql.org/message-id/OSZPR01MB6310D6F48372F52F1D85E1C5FD609%40OSZPR01MB6310.jpnprd01.prod.outlook.com
[2] https://www.postgresql.org/message-id/flat/CALDaNm1XeB3bF%2BVEJZi%3DBT31PZAL_UVys-26%2BYSv_AxCq0G2eg%40mail.gmail.com#87b153fd7676652746406a6f114eb67b

Best Regards,
Hou Zhijie
