From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Subject: | Re: Introduce XID age and inactive timeout based replication slot invalidation |
Date: | 2024-09-04 09:18:51 |
Message-ID: | CAJpy0uD21cPp9FkSCJi+NO47Y-6wmE+UW6i_K0HJCn=2UdMDmw@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Sep 4, 2024 at 9:17 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Tue, Sep 3, 2024 at 3:01 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> >
1)
This relates to one of my previous comments (pt 3 in [1]), where I
stated that inactive_since should not keep changing once a slot is
invalidated.
Below is one side effect of inactive_since continuing to change:
postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1',
pg_current_wal_lsn());
ERROR: can no longer get changes from replication slot "mysubnew1_1"
DETAIL: The slot became invalid because it was inactive since
2024-09-04 10:03:56.68053+05:30, which is more than 10 seconds ago.
HINT: You might need to increase "replication_slot_inactive_timeout.".
postgres=# select now();
now
---------------------------------
2024-09-04 10:04:00.26564+05:30
The DETAIL gives wrong information: we are not past 10 seconds (only
about 3.6 seconds separate the reported inactive_since from now()).
This is because inactive_since got updated even in the ERROR scenario.
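The gap between the two timestamps can be checked with a few lines of Python. This is only an arithmetic check on the values pasted above; the helper name elapsed_seconds and the format constant are made up for illustration and are not part of the patch.

```python
from datetime import datetime

# Timestamp layout of the psql output above (fractional seconds plus UTC offset).
PG_TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f%z"


def elapsed_seconds(inactive_since: str, now: str) -> float:
    """Seconds between the reported inactive_since and the observed now()."""
    start = datetime.strptime(inactive_since, PG_TS_FORMAT)
    end = datetime.strptime(now, PG_TS_FORMAT)
    return (end - start).total_seconds()


# Values copied from the DETAIL line and from "select now()".
elapsed = elapsed_seconds("2024-09-04 10:03:56.68053+05:30",
                          "2024-09-04 10:04:00.26564+05:30")
timeout_seconds = 10  # the timeout quoted in the DETAIL line

print(f"elapsed={elapsed:.5f}s, exceeds timeout: {elapsed > timeout_seconds}")
```

The elapsed time is well under the 10 seconds the DETAIL line claims, which matches the observation that inactive_since was refreshed by the failed call.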
2)
One more issue with this message: once I set
replication_slot_inactive_timeout to a bigger value, it becomes even
more misleading. This is because the invalidation was done in the past
using the previous value, while the message now shows the new value:
ALTER SYSTEM SET replication_slot_inactive_timeout TO '36h';
--see 129600 secs in DETAIL and the current time.
postgres=# SELECT * FROM pg_replication_slot_advance('mysubnew1_1',
pg_current_wal_lsn());
ERROR: can no longer get changes from replication slot "mysubnew1_1"
DETAIL: The slot became invalid because it was inactive since
2024-09-04 10:06:38.980939+05:30, which is more than 129600 seconds
ago.
postgres=# select now();
now
----------------------------------
2024-09-04 10:07:35.201894+05:30
I feel we should change this message itself.
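As a quick sanity check of the numbers in the second example, the sketch below converts the new setting to seconds and compares it with the time that actually passed between the reported inactive_since and now(). It only redoes the arithmetic on the pasted values; the variable names are illustrative assumptions, not anything from the patch.

```python
from datetime import datetime, timedelta

# Timestamp layout of the psql output above (fractional seconds plus UTC offset).
PG_TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f%z"

# New GUC value from the ALTER SYSTEM command, expressed in seconds.
new_timeout = timedelta(hours=36)
timeout_in_detail = int(new_timeout.total_seconds())

# Values copied from the DETAIL line and from "select now()".
inactive_since = datetime.strptime("2024-09-04 10:06:38.980939+05:30", PG_TS_FORMAT)
now = datetime.strptime("2024-09-04 10:07:35.201894+05:30", PG_TS_FORMAT)
elapsed = (now - inactive_since).total_seconds()

print(f"timeout shown in DETAIL: {timeout_in_detail} s")
print(f"actually elapsed:        {elapsed:.6f} s")
print(f"claim 'more than {timeout_in_detail} seconds ago' holds: "
      f"{elapsed > timeout_in_detail}")
```

The DETAIL line quotes the 36h value (129600 seconds) even though less than a minute separates the two timestamps, so the message reflects neither the timeout that caused the invalidation nor the real inactivity period.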
~~~~~
When invalidation is due to wal_removed, we get a much simpler message:
newdb1=# SELECT * FROM pg_replication_slot_advance('mysubnew1_2',
pg_current_wal_lsn());
ERROR: replication slot "mysubnew1_2" cannot be advanced
DETAIL: This slot has never previously reserved WAL, or it has been
invalidated.
This message does not mention 'max_slot_wal_keep_size'. We should have
a similar message for our case. Thoughts?
thanks
Shveta