From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Introduce XID age and inactive timeout based replication slot invalidation |
Date: | 2024-03-29 12:47:51 |
Message-ID: | Zga4dxUcqLXBtNcf@ip-10-97-1-34.eu-west-3.compute.internal |
Lists: | pgsql-hackers |
Hi,
On Fri, Mar 29, 2024 at 03:03:01PM +0530, Amit Kapila wrote:
> On Fri, Mar 29, 2024 at 11:49 AM Bertrand Drouvot
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >
> > On Fri, Mar 29, 2024 at 09:39:31AM +0530, Amit Kapila wrote:
> > >
> > > Commit message states: "why we can't just update inactive_since for
> > > synced slots on the standby with the value received from remote slot
> > > on the primary. This is consistent with any other slot parameter i.e.
> > > all of them are synced from the primary."
> > >
> > > The inactive_since is not consistent with other slot parameters which
> > > we copy. We don't perform anything related to those other parameters
> > > like say two_phase phase which can change that property. However, we
> > > do acquire the slot, advance the slot (as per recent discussion [1]),
> > > and release it. Since these operations can impact inactive_since, it
> > > seems to me that inactive_since is not the same as other parameters.
> > > It can have a different value than the primary. Why would anyone want
> > > to know the value of inactive_since from primary after the standby is
> > > promoted?
> >
> > I think it can be useful "before" it is promoted and in case the primary is down.
> >
>
> It is not clear to me what is user going to do by checking the
> inactivity time for slots when the corresponding server is down.
Say a failover needs to be done: it could then be useful to know for which
slots activity needs to be resumed (thinking about an external logical decoding
plugin, not about pub/sub here). If one sees a slot that has been inactive for
long "enough", then one can start to reason about what to do with it.
> I thought the idea was to check such slots and see if they need to be
> dropped or enabled again to avoid excessive disk usage, etc.
Yeah that's the case but it does not mean inactive_since can't be useful in other
ways.
Also, say the slot has been invalidated on the primary (due to inactivity timeout),
the primary is down and there is a failover. By keeping the inactive_since from
the primary, one could know when the inactivity that led to the timeout started.
Again, I'm more concerned about external logical decoding plugins than pub/sub here.
> > I agree that tracking the activity time of a synced slot can be useful, why
> > not creating a dedicated field for that purpose (and keep inactive_since a
> > perfect "copy" of the primary)?
> >
>
> We can have a separate field for this but not sure if it is worth it.
OTOH I'm not sure that erasing this information coming from the primary is
useful. I think that 2 fields would be the best option and would be less prone
to misinterpretation.
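
To illustrate the 2 fields idea, here is a minimal standalone sketch. It is not
taken from the patch: the struct, the field name last_local_activity and the
function name are all hypothetical, and a plain int64_t stands in for a
timestamp type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a timestamp (e.g. microseconds); not a PostgreSQL type. */
typedef int64_t SketchTimestamp;

/* Hypothetical view of a synced slot on the standby, holding two separate fields. */
typedef struct SyncedSlotSketch
{
    SketchTimestamp inactive_since;      /* kept as an exact copy of the primary's value */
    SketchTimestamp last_local_activity; /* assumed new field: when the standby last synced the slot */
} SyncedSlotSketch;

/*
 * Record one sync cycle: the primary's value is copied untouched, while the
 * standby's own activity time goes into the separate field.
 */
static void
sketch_record_sync(SyncedSlotSketch *slot,
                   SketchTimestamp remote_inactive_since,
                   SketchTimestamp now)
{
    assert(slot != NULL);
    slot->inactive_since = remote_inactive_since;
    slot->last_local_activity = now;
}
```

With such a layout, reading inactive_since on the standby would always mean "as
seen on the primary", and the local sync activity would not overwrite it.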
> > > Now, the other concern is that calling GetCurrentTimestamp()
> > > could be costly when the values for the slot are not going to be
> > > updated but if that happens we can optimize such that before acquiring
> > > the slot we can have some minimal pre-checks to ensure whether we need
> > > to update the slot or not.
> >
> > Right, but for a very active slot it is likely that we call GetCurrentTimestamp()
> > during almost each sync cycle.
> >
>
> True, but if we have to save a slot to disk each time to persist the
> changes (for an active slot) then probably GetCurrentTimestamp()
> shouldn't be costly enough to matter.
Right, persisting the changes to disk would be even more costly.
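
For illustration, the "minimal pre-checks" idea quoted above could look like the
standalone sketch below. Everything in it is an assumption made for the example:
the struct, the compared fields and the function names are hypothetical, and
sketch_now() is only a counting stand-in for GetCurrentTimestamp().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the values a slot sync would compare. */
typedef struct SlotPositionSketch
{
    uint64_t restart_lsn;
    uint64_t confirmed_flush_lsn;
    uint32_t catalog_xmin;
} SlotPositionSketch;

/* Counts how often the (stand-in) clock is read. */
static int sketch_clock_calls = 0;

/* Stand-in for GetCurrentTimestamp(): returns a fake, increasing value. */
static int64_t
sketch_now(void)
{
    sketch_clock_calls++;
    return (int64_t) sketch_clock_calls;
}

/* Cheap pre-check: true only if the remote values differ from the local ones. */
static bool
sketch_slot_needs_update(const SlotPositionSketch *local,
                         const SlotPositionSketch *remote)
{
    assert(local != NULL && remote != NULL);
    return local->restart_lsn != remote->restart_lsn ||
           local->confirmed_flush_lsn != remote->confirmed_flush_lsn ||
           local->catalog_xmin != remote->catalog_xmin;
}

/*
 * One sync cycle: if nothing changed, return early without reading the clock;
 * otherwise copy the remote values and stamp the time.
 */
static bool
sketch_sync_slot(SlotPositionSketch *local,
                 const SlotPositionSketch *remote,
                 int64_t *inactive_since)
{
    if (!sketch_slot_needs_update(local, remote))
        return false;           /* nothing to do, clock not read */

    *local = *remote;
    *inactive_since = sketch_now();
    return true;
}
```

For a very active slot the pre-check would rarely save anything, since the
values change during almost each sync cycle. It only helps for slots that are
idle on the primary.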
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com