From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Peter Smith <smithpb2250(at)gmail(dot)com> |
Cc: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com> |
Subject: | Re: Single transaction in the tablesync worker? |
Date: | 2021-01-06 05:06:11 |
Message-ID: | CAA4eK1Kv8D8hvN-77Jvk6=FuyPz7H-wbdjCmSyVnhdEqotSHAw@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Jan 5, 2021 at 3:32 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Mon, Jan 4, 2021 at 10:48 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
>
> > 5. Is it possible to write a testcase where we fail (say due to pk
> > violation or some other error) after the initial copy is done, then
> > remove the conflicting row and allow a copy to be completed? If we
> > already have any such test then it is fine.
> >
>
> Causing a PK violation during the initial copy is not a problem to
> test, but doing it after the initial copy is difficult. I have done
> exactly this test scenario before but I thought it cannot be
> automated. E.g., to cause a PK violation error somewhere between
> COPYDONE and SYNCDONE means that the offending insert (the one which
> tablesync will fail to replicate) has to be sent while the tablesync
> is in CATCHUP mode. But AFAIK that can only be achieved using the
> debugger to get the timing right.
>
Yeah, I also can't think of any way to automate such a test. I was
thinking about what could go wrong if we error out at that stage. The
only thing that could be problematic is if we somehow leave the slot
and replication origin used during the copy dangling. If the tablesync
worker is restarted after the error, which will normally be the case,
then it will clean those up. But what if the tablesync worker is never
started again? I think the only way that can happen is if, during
Alter Subscription ... Refresh Publication, we remove the corresponding
subscription rel (see AlterSubscription_refresh; I guess it could
happen if one has dropped the relation from the publication). I haven't
tested this with your patch, but if such a possibility exists then we
need to think about cleaning up the slot and origin when we remove the
subscription rel. What do you think?
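
To make the scenario concrete, here is a rough, untested sketch of the
steps I have in mind. The names pub1, sub1 and tbl are made up for
illustration, and whether anything is actually left behind depends on
how the patch handles this case.

```sql
-- Assumed starting point: the tablesync worker for tbl has errored out
-- somewhere after the initial copy but before reaching SYNCDONE.

-- On the publisher: drop the relation from the publication
-- (pub1 and tbl are hypothetical names).
ALTER PUBLICATION pub1 DROP TABLE tbl;

-- On the subscriber: the refresh removes the subscription rel for tbl,
-- so no tablesync worker will be started for it again
-- (sub1 is a hypothetical name).
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;

-- On the subscriber: confirm that tbl is no longer listed.
SELECT srrelid::regclass, srsubstate FROM pg_subscription_rel;

-- On the publisher: look for a leftover tablesync slot.
SELECT slot_name, active FROM pg_replication_slots;

-- On the subscriber: look for a leftover replication origin.
SELECT roident, roname FROM pg_replication_origin;
```

If the last two queries still show the slot or origin that the
tablesync worker was using, then nothing will ever remove them.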
--
With Regards,
Amit Kapila.