Re: pg_replication_origin_drop API potential race condition

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Petr Jelinek <pjmodos(at)pjmodos(dot)net>, Euler Taveira <euler(at)eulerto(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_replication_origin_drop API potential race condition
Date: 2021-02-09 05:28:53
Message-ID: CAA4eK1JxnWQ6tf8qJ1gHusVbaoBDp_rL-CXYMs2ue2zgawoU0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 9, 2021 at 9:27 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Feb 8, 2021 at 11:30 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> >
> > > +void
> > > +replorigin_drop_by_name(char *name, bool missing_ok, bool nowait)
> > > +{
> > > + RepOriginId roident;
> > > + Relation rel;
> > > +
> > > + Assert(IsTransactionState());
> > > +
> > > + /*
> > > + * To interlock against concurrent drops, we hold ExclusiveLock on
> > > + * pg_replication_origin throughout this function.
> > > + */
> >
> > This comment is now wrong though; should s/throughout.*/till xact commit/
> > to reflect the new reality.
> >
>
> Right, I'll fix in the next version.
>

Fixed in the attached.

> > I do wonder if this is going to be painful in some way, since the lock
> > is now going to be much longer-lived. My impression is that it's okay,
> > since dropping an origin is not a very frequent occurrence. It is going
> > to block pg_replication_origin_advance() with *any* origin, which
> > acquires RowExclusiveLock on the same relation. If this is a problem,
> > then we could use LockSharedObject() in both places (and make it last
> > till end of xact for the case of deletion), instead of holding this
> > catalog-level lock till end of transaction.
> >
>
> IIUC, you are suggesting to use lock for the particular origin instead
> of locking the corresponding catalog table in functions
> pg_replication_origin_advance and replorigin_drop_by_name. If so, I
> don't see any problem with the same
>

I think it won't be that straightforward as we don't have origin_id.
So what we instead need to do is first to acquire a lock on
ReplicationOriginRelationId, get the origin_id, lock the specific
origin and then re-check if the origin still exists. I feel some
similar changes might be required in pg_replication_origin_advance.
Now, we can do this optimization if we want but I am not sure if
origin_drop would be a frequent enough operation that we add such an
optimization. For now, I have added a note in the comments so that if
we find any such use case we can implement such optimization in the
future. What do you think?

--
With Regards,
Amit Kapila.

Attachment Content-Type Size
v6-0001-Make-pg_replication_origin_drop-safe-against-conc.patch application/octet-stream 5.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-02-09 05:55:23 Re: Is Recovery actually paused?
Previous Message Michael Paquier 2021-02-09 05:18:49 Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM