Re: Introduce XID age and inactive timeout based replication slot invalidation

From: Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Date: 2024-11-27 10:54:49
Message-ID: CABdArM4QR7LPoJ_ES+_1miQZ9ziwsOUun4FGSDhPdrxsugH_0A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 27, 2024 at 8:39 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi Nisha,
>
> Here are some review comments for the patch v50-0002.
>
> ======
> src/backend/replication/slot.c
>
> InvalidatePossiblyObsoleteSlot:
>
> 1.
> + if (now &&
> + TimestampDifferenceExceeds(s->inactive_since, now,
> + replication_slot_inactive_timeout_sec * 1000))
>
> Previously this was using an additional call to SlotInactiveTimeoutCheckAllowed:
>
> + if (SlotInactiveTimeoutCheckAllowed(s) &&
> + TimestampDifferenceExceeds(s->inactive_since, now,
> + replication_slot_inactive_timeout * 1000))
>
> Is it OK to skip that call? e.g. can the slot fields possibly change
> between assigning the 'now' and acquiring the mutex? If not, then the
> current code is fine. The only reason for asking is because it is
> slightly suspicious that it was not done this "easy" way in the first
> place.
>
Good catch! Although the mutex is acquired right after 'now' is
assigned, there is still a small window in which another process can
modify the slot. So, I have reverted this change in v51. To avoid a
redundant SlotInactiveTimeoutCheckAllowed() call, it is sufficient to
make the check here, once the mutex is held, rather than at the point
where 'now' is assigned.
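
To illustrate the reasoning (this is not code from the patch), here is a
minimal standalone sketch of evaluating the eligibility check while the
lock is held, so that a concurrent change made after 'now' was captured
is still seen. DemoSlot, its fields, demo_timeout_check_allowed() and
demo_slot_timed_out() are hypothetical stand-ins; a pthread mutex
replaces the slot spinlock, and a plain millisecond comparison replaces
TimestampDifferenceExceeds().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for a replication slot. This is NOT
 * the PostgreSQL ReplicationSlot struct.
 */
typedef struct DemoSlot
{
	pthread_mutex_t mutex;			/* stand-in for the slot's lock */
	bool		in_use;				/* hypothetical: slot is currently acquired */
	int64_t		inactive_since_ms;	/* 0 means "not marked inactive" */
} DemoSlot;

/*
 * Hypothetical analogue of SlotInactiveTimeoutCheckAllowed(). The real
 * function's conditions are not reproduced here. Caller must hold the
 * mutex so the fields cannot change during the check.
 */
static bool
demo_timeout_check_allowed(const DemoSlot *s)
{
	return !s->in_use && s->inactive_since_ms > 0;
}

/*
 * Decide whether the slot has exceeded the timeout. 'now_ms' may have
 * been captured before the lock was taken; the eligibility check is
 * done under the lock, so any change made in that window is observed.
 */
static bool
demo_slot_timed_out(DemoSlot *s, int64_t now_ms, int64_t timeout_ms)
{
	bool		timed_out;

	assert(s != NULL);

	pthread_mutex_lock(&s->mutex);
	timed_out = demo_timeout_check_allowed(s) &&
		(now_ms - s->inactive_since_ms) >= timeout_ms;
	pthread_mutex_unlock(&s->mutex);

	return timed_out;
}
```

If the eligibility check were done only where 'now' is assigned, a slot
that became active again before the lock was taken could still be
treated as timed out; doing the check under the lock closes that window
in this sketch.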

Attached v51 patch-set addressing all comments in [1] and [2].

[1] https://www.postgresql.org/message-id/CAHut%2BPtuiQj1hwm%3D73xJ8hWuw-9cXbN4dHJHpM6EXxubDJgmFA%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAHut%2BPvi-g%2B9%2Bhjmjg44OzTN9L3YGQiCXBDAVaTVWvSn5SSwmw%40mail.gmail.com

--
Thanks,
Nisha

Attachment Content-Type Size
v51-0001-Add-error-handling-while-acquiring-a-replication.patch application/octet-stream 7.9 KB
v51-0002-Introduce-inactive_timeout-based-replication-slo.patch application/octet-stream 27.2 KB
