From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Amit Kapila <akapila(at)postgresql(dot)org>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: pgsql: Add a view to show the stats of subscription workers. |
Date: | 2021-11-30 09:31:17 |
Message-ID: | CAA4eK1LNHm+KrQi_JM4nuSGbMwOMf5=sHqgrso0PT1eY64SbTA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-committers |
On Tue, Nov 30, 2021 at 12:46 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Nov 30, 2021 at 3:04 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, Nov 30, 2021 at 9:05 AM Amit Kapila <akapila(at)postgresql(dot)org> wrote:
> > >
> > > Add a view to show the stats of subscription workers.
> > >
> >
> > sidewinder is showing failure in the test added by this patch:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2021-11-30%2005%3A05%3A25
> >
> > I'll investigate and share my findings.
>
> I've looked at the issue. Here is my analysis:
>
> Looking at the result the test actually got, we had two error entries
> for test_tab1 instead of one:
>
> # Failed test 'check the error reported by the apply worker'
> # at t/026_worker_stats.pl line 33.
> # got: 'tap_sub|INSERT|test_tab1|t
> # tap_sub||test_tab1|t'
> # expected: 'tap_sub|INSERT|test_tab1|t'
>
> The possible scenarios are:
>
> The table sync worker for test_tab1 failed due to an error unrelated
> to apply changes:
>
> 2021-11-30 06:24:02.137 CET [18990:2] ERROR: replication origin with
> OID 2 is already active for PID 23706
>
> At this time, the view had one error entry for the table sync worker.
> After retrying table sync, it succeeded:
>
> 2021-11-30 06:24:04.202 CET [28117:2] LOG: logical replication table
> synchronization worker for subscription "tap_sub", table "test_tab1"
> has finished
>
> Then after inserting a row on the publisher, the apply worker inserted
> the row but failed due to violating a unique key violation, which is
> expected:
>
> 2021-11-30 06:24:04.307 CET [4806:2] ERROR: duplicate key value
> violates unique constraint "test_tab1_pkey"
> 2021-11-30 06:24:04.307 CET [4806:3] DETAIL: Key (a)=(1) already exists.
> 2021-11-30 06:24:04.307 CET [4806:4] CONTEXT: processing remote data
> during "INSERT" for replication target relation "public.test_tab1" in
> transaction 721 at 2021-11-30 06:24:04.305096+01
>
> As a result, we had two error entries for test_tab1: the table sync
> worker error and the apply worker error. I didn't expect that the
> table sync worker for test_tab1 failed due to "replication origin with
> OID 2 is already active for PID 23706” error.
>
> Looking at test_subscription_error() in 026_worker_stats.pl, we have
> two checks; in the first check, we wait for the view to show the error
> entry for the given relation name and xid. This check was passed since
> we had the second error (i.g., apply worker error). In the second
> check, we get error entries from pg_stat_subscription_workers by
> specifying only the relation name. Therefore, we ended up getting two
> entries and failed the tests.
>
> To fix this issue, I think that in the second check, we can get the
> error from pg_stat_subscription_workers by specifying the relation
> name *and* xid like the first check does. I've attached the patch.
> What do you think?
>
I have responded to this on the original -hackers thread [1] as I
think the test case might need some more changes.
--
With Regards,
Amit Kapila.
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Gustafsson | 2021-11-30 10:50:16 | pgsql: Add TAP tests for contrib/sslinfo |
Previous Message | Amit Kapila | 2021-11-30 08:46:22 | Re: pgsql: Add a view to show the stats of subscription workers. |