From: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | RE: Deadlock between logrep apply worker and tablesync worker |
Date: | 2023-01-28 04:06:26 |
Message-ID: | OS3PR01MB5718F0BC9F89DEA388FB258494CD9@OS3PR01MB5718.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Friday, January 27, 2023 8:16 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>
> On Fri, Jan 27, 2023 at 3:45 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Mon, 23 Jan 2023 at 10:52, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> > >
> > > IIRC, this is done to prevent concurrent drops of origin drop say by
> > > exposed API pg_replication_origin_drop(). See the discussion in [1]
> > > related to it. If we want we can optimize it so that we can acquire
> > > the lock on the specific origin as mentioned in comments
> > > replorigin_drop_by_name() but it was not clear that this operation
> > > would be frequent enough.
> >
> > Here is an attached patch to lock the replication origin record using
> > LockSharedObject instead of locking pg_replication_origin relation in
> > ExclusiveLock mode. Now tablesync worker will wait only if the
> > tablesync worker is trying to drop the same replication origin which
> > has already been dropped by the apply worker, the other tablesync
> > workers will be able to successfully drop the replication origin
> > without any wait.
> >
>
> There is a code in the function replorigin_drop_guts() that uses the
> functionality introduced by replorigin_exists(). Can we reuse this function for
> the same?
Maybe we can use SearchSysCacheExists1 to check the existence instead of
adding a new function.
One comment about the patch.
@@ -430,23 +445,21 @@ replorigin_drop_by_name(const char *name, bool missing_ok, bool nowait)
...
+ /* Drop the replication origin if it has not been dropped already */
+ if (replorigin_exists(roident))
replorigin_drop_guts(rel, roident, nowait);
If developer pass missing_ok as false, should we report an ERROR here
instead of silently return ?
Best Regards,
Hou zj
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2023-01-28 04:11:38 | Re: Something is wrong with wal_compression |
Previous Message | Andrey Borodin | 2023-01-28 03:57:35 | Re: Something is wrong with wal_compression |