Re: Replication slot stats misgivings

From: Andres Freund <andres(at)anarazel(dot)de>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: Replication slot stats misgivings
Date: 2021-03-21 21:40:11
Message-ID: 20210321214011.ftxuq6egzw77fnpj@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2021-03-21 16:08:00 +0530, Amit Kapila wrote:
> On Sun, Mar 21, 2021 at 2:57 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2021-03-20 10:28:06 +0530, Amit Kapila wrote:
> > > On Sat, Mar 20, 2021 at 9:25 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > This idea is worth exploring to address the complaints, but what do we
> > > > do when we detect that the stats are from a different slot? The entry
> > > > has a mix of stats from the old and new slot. We probably need to reset
> > > > it after we detect that.
> > > >
> > >
> > > What if the user created a slot with the same name after dropping the
> > > slot and it used the same index? I think the chances are low, but it is
> > > still a possibility; maybe that is okay.
> > >
> > > > What if at some frequency (say, whenever we
> > > > run out of indexes) we check whether the slots we are maintaining in
> > > > pgstat.c have some stale slot entry (the entry exists but the actual slot
> > > > is dropped)?
> > > >
> > >
> > > A similar drawback (the user created a slot with the same name after
> > > dropping it) exists with this as well.
> >
> > pgstat_report_replslot_drop() already prevents that, no?
> >
>
> Yeah, normally it would prevent that but what if a drop message is lost?

That already exists as a danger, no? pgstat_recv_replslot() uses
pgstat_replslot_index() to find the slot by name. So if a drop message
is lost we'd potentially accumulate into the stats of an older slot. It'd
probably be a lower risk with what I suggested, because for the initial stats
report slot.c would use something like pgstat_report_replslot_create(),
which the stats collector can use to reset the stats to 0?
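
To make the failure mode concrete, here is a toy model of it. This is not
the pgstat.c code: every name below (ToySlotStats, toy_recv_stats(),
toy_recv_create(), the spill_count field, the array sizes) is made up for
illustration, and the "create" handler is only a sketch of what something
like pgstat_report_replslot_create() could trigger on the collector side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_SLOTS 4     /* illustrative limit, not a real GUC value */
#define TOY_NAME_LEN  64    /* illustrative name buffer size */

/* Hypothetical per-slot stats entry kept by the collector. */
typedef struct ToySlotStats
{
    bool        in_use;
    char        name[TOY_NAME_LEN];
    uint64_t    spill_count;
} ToySlotStats;

static ToySlotStats toy_stats[TOY_MAX_SLOTS];

/* Name-based lookup; optionally claims a free entry. Returns -1 if none. */
static int
toy_slot_index(const char *name, bool create)
{
    int         free_idx = -1;

    assert(name != NULL);
    for (int i = 0; i < TOY_MAX_SLOTS; i++)
    {
        if (toy_stats[i].in_use)
        {
            if (strcmp(toy_stats[i].name, name) == 0)
                return i;
        }
        else if (free_idx < 0)
            free_idx = i;
    }
    if (!create || free_idx < 0)
        return -1;

    /* Zero the entry first, so the copied name stays NUL-terminated. */
    memset(&toy_stats[free_idx], 0, sizeof(ToySlotStats));
    toy_stats[free_idx].in_use = true;
    strncpy(toy_stats[free_idx].name, name, TOY_NAME_LEN - 1);
    return free_idx;
}

/* Regular stats message: accumulate into whatever entry has this name. */
static void
toy_recv_stats(const char *name, uint64_t spills)
{
    int         idx = toy_slot_index(name, true);

    if (idx >= 0)
        toy_stats[idx].spill_count += spills;
}

/* Drop message: free the entry. This is the message that can get lost. */
static void
toy_recv_drop(const char *name)
{
    int         idx = toy_slot_index(name, false);

    if (idx >= 0)
        toy_stats[idx].in_use = false;
}

/* Hypothetical create message: reset counters even if a stale entry exists. */
static void
toy_recv_create(const char *name)
{
    int         idx = toy_slot_index(name, true);

    if (idx >= 0)
        toy_stats[idx].spill_count = 0;
}

/* Read back the counter; returns 0 when there is no entry for the name. */
static uint64_t
toy_get_spills(const char *name)
{
    int         idx = toy_slot_index(name, false);

    return idx < 0 ? 0 : toy_stats[idx].spill_count;
}
```

In this model, if the drop for a slot never arrives and a slot with the
same name is created again, toy_recv_stats() keeps adding to the old
counters. Having the new slot send a create message first makes the
collector start from 0, regardless of whether the earlier drop was seen.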

If we do it right the lossiness will be removed via the shared memory stats
patch... But architecturally the name-based lookup and the unpredictable
number of stats don't fit in super well.

Greetings,

Andres Freund
