Re: Missing LWLock protection in pgstat_reset_replslot()

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)lists(dot)postgresql(dot)org, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Missing LWLock protection in pgstat_reset_replslot()
Date: 2024-03-07 05:27:28
Message-ID: CAJpy0uBxxS5Yw3TmY=pCk3mwdOsBLF0JX595wbVEhzFJ5NvefA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 6, 2024 at 2:36 PM Bertrand Drouvot
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Wed, Mar 06, 2024 at 10:24:46AM +0530, shveta malik wrote:
> > On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot
> > <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> > Thanks. Can we try to get rid of multiple LwLockRelease in
> > pgstat_reset_replslot(). Is this any better?
> >
> > /*
> > - * Nothing to do for physical slots as we collect stats only for logical
> > - * slots.
> > + * Reset stats if it is a logical slot. Nothing to do for physical slots
> > + * as we collect stats only for logical slots.
> > */
> > - if (SlotIsPhysical(slot))
> > - {
> > - LWLockRelease(ReplicationSlotControlLock);
> > - return;
> > - }
> > -
> > - /* reset this one entry */
> > - pgstat_reset(PGSTAT_KIND_REPLSLOT, InvalidOid,
> > - ReplicationSlotIndex(slot));
> > + if (SlotIsLogical(slot))
> > + pgstat_reset(PGSTAT_KIND_REPLSLOT, InvalidOid,
> > + ReplicationSlotIndex(slot));
> >
> > LWLockRelease(ReplicationSlotControlLock);
> >
>
> Yeah, it's easier to read and probably reduce the pgstat_replslot.o object file
> size a bit for non optimized build.
>
> > Something similar in pgstat_fetch_replslot() perhaps?
>
> Yeah, all of the above done in v3 attached.
>

Thanks for the patch.

For the fix in pgstat_fetch_replslot(), even with the lock in fetch
call, there are chances that the concerned slot can be dropped and
recreated.

--It can happen in a small window in pg_stat_get_replication_slot()
when we are consuming the return values of pgstat_fetch_replslot
(using slotent).

--Also it can happen at a later stage when we have switched to
fetching the next slot (obtained from 'pg_replication_slots' through
view' pg_stat_replication_slots'), the previous one can be dropped.

Ultimately the results of system view 'pg_replication_slots' can still
give us non-existing or re-created slots. But yes, I do not deny that
it gives us better consistency protection.

Do you feel that the lock in pgstat_fetch_replslot() should be moved
to its caller when we are done copying the results of that slot? This
will improve the protection slightly.

thanks
Shveta

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-03-07 05:41:49 Re: Missing LWLock protection in pgstat_reset_replslot()
Previous Message torikoshia 2024-03-07 05:21:13 Re: Add new error_action COPY ON_ERROR "log"