From: | Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Introduce XID age and inactive timeout based replication slot invalidation |
Date: | 2025-01-31 12:20:26 |
Message-ID: | CABdArM7D9u1an6gzfArAL32Jn0QQkKs7JffUxcZ9EqzAaGrfvQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jan 31, 2025 at 2:32 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Jan 31, 2025 at 10:40 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > ======
> > src/backend/replication/slot.c
> >
> > ReportSlotInvalidation:
> >
> > 1.
> > +
> > + case RS_INVAL_IDLE_TIMEOUT:
> > + Assert(inactive_since > 0);
> > + /* translator: second %s is a GUC variable name */
> > + appendStringInfo(&err_detail,
> > + _("The slot has remained idle since %s, which is longer than the
> > configured \"%s\" duration."),
> > + timestamptz_to_str(inactive_since),
> > + "idle_replication_slot_timeout");
> > + break;
> > +
> >
> > errdetail:
> >
> > I guess it is no fault of this patch because I see you've only copied
> > nearby code, but AFAICT this function is still having an each-way bet
> > by using a mixture of _() macro which is for strings intended be
> > translated, but then only using them in errdetail_internal() which is
> > for strings that are NOT intended to be translated. Isn't it
> > contradictory? Why don't we use errdetail() here?
> >
>
> Your question is valid and I don't have an answer. I encourage you to
> start a new thread to clarify this.
>
> > errhint:
> >
> > Also, the way the 'hint' is implemented can only be meaningful for
> > RS_INVAL_WAL_REMOVED. This is also existing code that IMO it was
> > always strange, but now that this patch has added another kind of
> > switch (cause) this hint implementation now looks increasingly hacky
> > to me; it is also inflexible -- e.g. if you ever wanted to add
> > different hints. A neater implementation would be to make the code
> > more like how the err_detail is handled, so then the errhint string
> > would only be assigned within the "case RS_INVAL_WAL_REMOVED:"
> >
>
> This makes sense to me.
>
> +
> + case RS_INVAL_IDLE_TIMEOUT:
> + Assert(inactive_since > 0);
> + /* translator: second %s is a GUC variable name */
> + appendStringInfo(&err_detail,
> + _("The slot has remained idle since %s, which is longer than the
> configured \"%s\" duration."),
> + timestamptz_to_str(inactive_since),
> + "idle_replication_slot_timeout");
>
> I think the above message should be constructed on a model similar to
> the following nearby message:"The slot's restart_lsn %X/%X exceeds the
> limit by %llu bytes.". So, how about the following: "The slot's idle
> time %s exceeds the configured \"%s\" duration"?
>
> Also, similar to max_slot_wal_keep_size, we should give a hint in this
> case to increase idle_replication_slot_timeout.
>
> It is not clear why the injection point test is doing
> pg_sync_replication_slots() etc. in the patch. The test should be
> simple such that after creating a new physical or logical slot, enable
> the injection point, then run the manual checkpoint command, and check
> the invalidation status of the slot.
>
Thanks for the review! I have incorporated the above comments. The
test in patch-002 has been optimized as suggested and now completes in
less than a second.
Please find the attached v66 patch set. The base patch(v65-001) is
committed now, so I have rebased the patches.
Thank you, Kuroda-san, for working on patch-002.
--
Thanks,
Nisha
Attachment | Content-Type | Size |
---|---|---|
v66-0001-Introduce-inactive_timeout-based-replication-slo.patch | application/octet-stream | 24.9 KB |
v66-0002-Add-TAP-test-for-slot-invalidation-based-on-inac.patch | application/octet-stream | 5.7 KB |
v66-0003-Add-TAP-test-for-slot-invalidation-based-on-inac.patch | application/octet-stream | 10.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Srinath Reddy | 2025-01-31 12:24:09 | Re: hash_search_with_hash_value is high in "perf top" on a replica |
Previous Message | Alvaro Herrera | 2025-01-31 12:16:55 | Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning |