From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Skipping logical replication transactions on subscriber side |
Date: | 2022-04-02 00:11:53 |
Message-ID: | 20220402001153.GA3719101@rfd.leadboat.com |
Lists: | pgsql-hackers |

On Fri, Apr 01, 2022 at 09:25:52PM +0900, Masahiko Sawada wrote:
> > On Fri, Apr 1, 2022 at 4:44 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > > src/test/subscription/t/029_on_error.pl has been failing reliably on the five
> > > AIX buildfarm members:
> > >
> > > # poll_query_until timed out executing this query:
> > > # SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'
> > > # expecting this output:
> > > # t
> > > # last actual query output:
> > > # f
> > > # with stderr:
> > > timed out waiting for match: (?^:LOG: done skipping logical replication transaction finished at 0/1D30788) at t/029_on_error.pl line 50.
> > >
> > > I've posted five sets of logs (2.7 MiB compressed) here:
> > > https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing
> Given that "SELECT subskiplsn = '0/0'
> FROM pg_subscription WHERE subname = 'sub'" didn't return true, some
> value was set to subskiplsn even after the unique key error.
>
> So I'm guessing that the apply worker could not get the updated value
> of the subskiplsn or its MySubscription->skiplsn could not match with
> the transaction's finish LSN. Also, given that the test is failing on
> all AIX buildfarm members, there might be something specific to AIX.
>
> Noah, to investigate this issue further, is it possible for you to
> apply the attached patch and run the 029_on_error.pl test? The patch
> adds some logs to get additional information.

Logs attached. I ran this outside the buildfarm script environment. Most
notably, I didn't override PG_TEST_TIMEOUT_DEFAULT like my buildfarm
configuration does, so the total log size is smaller.

Attachment | Content-Type | Size |
---|---|---|
log-subscription-20220401.tar.xz | application/octet-stream | 106.6 KB |