Re: Slow catchup of 2PC (twophase) transactions on replica in LR

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Smith <smithpb2250(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru>, Ajin Cherian <itsajin(at)gmail(dot)com>
Subject: Re: Slow catchup of 2PC (twophase) transactions on replica in LR
Date: 2024-07-30 10:32:06
Message-ID: CALDaNm3YY+bzj+JWJbY+DsUgJ2mPk8OR1ttjVX2cywKr4BUgxw@mail.gmail.com
Lists: pgsql-hackers

On Thu, 25 Jul 2024 at 08:39, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Jul 24, 2024 at 9:13 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> > > I merged these changes, made a few other cosmetic changes, and pushed the patch.
> >
> > There is a CF entry pointing at this thread [1]. Should it be closed?
> >
>
> Yes, closed now. Thanks for the reminder.

I noticed a random test failure in my environment with the 021_twophase test:
[10:37:01.131](0.053s) ok 24 - should be no prepared transactions on subscriber
error running SQL: 'psql:<stdin>:2: ERROR: cannot alter two_phase when logical replication worker is still running
HINT: Try again after some time.'

We can reproduce the issue by adding a delay in apply_worker_exit, as
in the attached Reproduce_random_021_twophase_test_failure.patch.

This is happening because the check here is wrong:
+$node_subscriber->poll_query_until('postgres',
+ "SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type = 'logical replication worker'"

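Here "logical replication worker" should be "logical replication apply worker".
With the current string the query matches no rows, so the poll returns
before the apply worker has actually exited.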
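
To illustrate why the old check passes too early, here is a minimal
simulation in Python. It is only a sketch: the backend_type list is
made-up illustrative data, and no_rows_match is a hypothetical helper
that mimics the exact-match count(*) = 0 query rather than anything in
the test suite.

```python
# Hypothetical snapshot of pg_stat_activity.backend_type values on the
# subscriber while the apply worker is still shutting down (illustrative data).
backend_types = [
    "client backend",
    "logical replication apply worker",
    "checkpointer",
]


def no_rows_match(rows, wanted):
    """Mimic: SELECT count(*) = 0 FROM pg_stat_activity WHERE backend_type = wanted."""
    return sum(1 for bt in rows if bt == wanted) == 0


# The old string matches no row exactly, so the poll condition is already
# true even though the apply worker is still present.
old_check = no_rows_match(backend_types, "logical replication worker")

# With the corrected string the condition stays false until the worker is gone.
new_check = no_rows_match(backend_types, "logical replication apply worker")

print(old_check, new_check)
```

In this sketch the old condition is satisfied immediately, which is
consistent with the test going on to alter two_phase while the worker is
still running.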

The attached patch fixes this.

Regards,
Vignesh

Attachment Content-Type Size
0001-Fix-random-failure-in-021_twophase.patch text/x-patch 1.6 KB
Reproduce_random_021_twophase_test_failure.patch text/x-patch 411 bytes
