Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2025-02-12 03:55:53
Message-ID: CAA4eK1KmCn_jCTEJSS5FVfJrQzFi2QaroVfgD0VMw_v9i3TGyg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 11, 2025 at 9:39 PM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> On Tue, Feb 11, 2025 at 03:22:49PM +0100, Álvaro Herrera wrote:
> > I find this proposed patch a bit strange and I feel it needs more
> > explanation.
> >
> > When this thread started, Bharath justified his patches saying that a
> > slot that's inactive for a very long time could be problematic because
> > of XID wraparound. Fine, that sounds a reasonable feature. If you
> > wanted to invalidate slots whose xmins were too old, I would support
> > that. He submitted that as his 0004 patch then.
> >
> > However, he also chose to submit 0003 with invalidation based on a
> > timeout. This is far less convincing a feature to me. The
> > justification for the time out seems to be that ... it's difficult to
> > have a one-size-fits-all value because size of disks vary. (???)
> > Or something like that. Really? I mean -- yes, this will prevent
> > problems in toy databases when run in developer's laptops. It will not
> > prevent any problems in production databases. Do we really want a
> > setting that is only useful for toy situations rather than production?
> >
> >
...
> >
> > I'm baffled.
>
> I agree, and I am also baffled because I think this discussion has happened
> at least once already on this thread.
>

Yes, we previously discussed this topic, and Robert seems to prefer a
time-based parameter for invalidating the slot (1)(2) as it is easier
to reason in terms of time. The other point discussed previously was
that there are tools that create a lot of slots and sometimes forget
to clean them up. Bharath has seen this in production, and we now have
the tool pg_createsubscriber, which creates one slot per database, so
if for some reason such slots are not cleaned up on the tool's exit,
such a parameter could save the cluster. See (3)(4).

Also, we previously didn't have a good experience with XID-based
threshold parameters like vacuum_defer_cleanup_age, as mentioned by
Robert (1). AFAICU from the previous discussion, we need a time-based
parameter, and we didn't rule out an XID-age-based parameter as an
additional one.

(1) - https://www.postgresql.org/message-id/CA%2BTgmoZTbaaEjSZUG1FL0mzxAdN3qmXksO3O9_PZhEuXTkVnRQ%40mail.gmail.com
(2) - https://www.postgresql.org/message-id/CA%2BTgmoaRECcnyqxAxUhP5dk2S4HX%3DpGh-p-PkA3uc%2BjG_9hiMw%40mail.gmail.com
(3) - https://www.postgresql.org/message-id/CALj2ACVFV%3DyUa3DXXfJLOtJxUM8qzC_mEECMJ2iekDGPeQLkTw%40mail.gmail.com
(4) - https://www.postgresql.org/message-id/CAA4eK1L3awyzWMuymLJUm8SoFEQe%3DDa9KUwCcAfC31RNJ1xdJA%40mail.gmail.com

--
With Regards,
Amit Kapila.
