From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
---|---|
To: | Nisha Moond <nisha(dot)moond412(at)gmail(dot)com> |
Cc: | vignesh C <vignesh21(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Introduce XID age and inactive timeout based replication slot invalidation |
Date: | 2025-02-11 03:19:20 |
Message-ID: | CAHut+PvsvHWoiEkGTP4NfVNsADsy-Jan3Dvp+_GW3gmPDHf5Qw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Nisha.
Some review comments about v74-0001
======
src/backend/replication/slot.c
1.
/* Maximum number of invalidation causes */
-#define RS_INVAL_MAX_CAUSES RS_INVAL_WAL_LEVEL
-
-StaticAssertDecl(lengthof(SlotInvalidationCauses) == (RS_INVAL_MAX_CAUSES + 1),
- "array length mismatch");
+#define RS_INVAL_MAX_CAUSES (lengthof(InvalidationCauses)-1)
The static assert was here to protect against dev mistakes in keeping
the lookup table up-to-date with the enum of slot.h. So it's not a
good idea to remove it...
IMO the RS_INVAL_MAX_CAUSES should be relocated to slot.h where the
enum is defined and where the devs know exactly how many invalidation
types there are. Then this static assert can be put back in to do its
job of ensuring the integrity properly again for this lookup table.
~~~
InvalidatePossiblyObsoleteSlot:
2.
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
+ {
+ /*
+ * Assign the current time here to avoid system call overhead
+ * while holding the spinlock in subsequent code.
+ */
+ now = GetCurrentTimestamp();
+ }
+
I felt that any minuscule benefit gained from having this conditional
'now' assignment is outweighed by the subsequent confusion/doubt
caused by passing around a 'now' to other functions where it may or
may not still be zero depending on different processing. IMO we should
just remove all doubts and always assign it so that "now always means
now".
~~~
3.
+ if (possible_causes & RS_INVAL_IDLE_TIMEOUT)
IMO every bitwise check like this should also be checking
(invalidation_cause == RS_INVAL_NONE). Maybe you omitted it here
because this is the first but I think it will be safer anyhow in case
the code gets shuffled around in future and the extra condition gets
overlooked.
~~~
4.
+ if (possible_causes & RS_INVAL_WAL_REMOVED)
+ {
+ if (initial_restart_lsn != InvalidXLogRecPtr &&
+ initial_restart_lsn < oldestLSN)
+ invalidation_cause = RS_INVAL_WAL_REMOVED;
+ }
+ if (invalidation_cause == RS_INVAL_NONE &&
+ (possible_causes & RS_INVAL_HORIZON))
+ {
+ if (SlotIsLogical(s) &&
+ /* invalid DB oid signals a shared relation */
+ (dboid == InvalidOid || dboid == s->data.database) &&
+ TransactionIdIsValid(initial_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ else if (TransactionIdIsValid(initial_catalog_effective_xmin) &&
+ TransactionIdPrecedesOrEquals(initial_catalog_effective_xmin,
+ snapshotConflictHorizon))
+ invalidation_cause = RS_INVAL_HORIZON;
+ }
+ if (invalidation_cause == RS_INVAL_NONE &&
I suggest adding blank lines where the bit conditions are to delineate
each of the different invalidation checks.
~~~
InvalidateObsoleteReplicationSlots:
5
- Assert(cause != RS_INVAL_HORIZON ||
TransactionIdIsValid(snapshotConflictHorizon));
- Assert(cause != RS_INVAL_WAL_REMOVED || oldestSegno > 0);
- Assert(cause != RS_INVAL_NONE);
+ Assert(!(possible_causes & RS_INVAL_HORIZON) ||
TransactionIdIsValid(snapshotConflictHorizon));
+ Assert(!(possible_causes & RS_INVAL_WAL_REMOVED) || oldestSegno > 0);
+ Assert(!(possible_causes & RS_INVAL_NONE));
AFAIK the RS_INVAL_NONE is defined as 0, so doing bit-wise operations
on 0 seems bogus.
Do you mean just Assert(possible_causes != NONE);
======
src/include/replication/slot.h
6.
-extern PGDLLIMPORT const char *const SlotInvalidationCauses[];
+typedef struct SlotInvalidationCauseMap
+{
+ int cause;
+ const char *cause_name;
+} SlotInvalidationCauseMap;
+
+extern PGDLLIMPORT const SlotInvalidationCauseMap InvalidationCauses[];
6a.
AFAIK. there is no longer any external access to this lookup table so
why do you need this extern. Similarly, why is this typedef even here
instead of declared in the slot.c module.
~
6b.
Why is the field 'cause' declared as int instead of
ReplicationSlotInvalidationCause?
======
Please the attached top-up patch as a code example of some of my
suggestions above -- in particular the relocating of
RS_INVAL_MAX_CAUSES and the typedef, and the reinstating of the static
insert for the lookup table.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
Attachment | Content-Type | Size |
---|---|---|
PS_topup_for_v740001.txt | text/plain | 1.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Shubham Khanna | 2025-02-11 04:21:30 | Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility. |
Previous Message | James Hunter | 2025-02-11 03:12:50 | Re: Proposal: "query_work_mem" GUC, to distribute working memory to the query's individual operators |