Re: Logical Replication of sequences

From: vignesh C <vignesh21(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2024-08-09 13:12:41
Message-ID: CALDaNm0LJCtGoBCO6DFY-RDjR8vxapW3W1f7=-LSQx=XYjqU=w@mail.gmail.com
Lists: pgsql-hackers

On Thu, 8 Aug 2024 at 12:21, shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Wed, Aug 7, 2024 at 1:45 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> >
> > The remaining comments have been addressed, and the changes are
> > included in the attached v20240807 version patch.
>
> Thanks for addressing the comment. Please find few comments for v20240807 :
>
> patch003:
> 2)
>
> * The page_lsn allows the user to determine if the sequence has been updated
> * since the last synchronization with the subscriber. This is done by
> * comparing the current page_lsn with the value stored in pg_subscription_rel
> * from the last synchronization.
> */
> Datum
> pg_sequence_state(PG_FUNCTION_ARGS)
>
> --This information is still incomplete. Maybe we should mention the
> other attribute name as well which helps to determine this.

I have removed this comment now, because suggesting that users compare
page_lsn via pg_sequence_state seems complex; the same can be achieved
by comparing the sequence values with a single statement instead of a
couple of statements. Peter felt this would be easier, based on
comment 3c at [1].

> 5)
> IIUC, sequencesync_failure_time is changed by multiple processes.
> Seq-sync worker sets it before exiting on failure, while apply worker
> resets it. Also, the apply worker reads it at a few places. Shall it
> be accessed using LogicalRepWorkerLock?

If the sequence sync worker is already running, the apply worker will
not access sequencesync_failure_time. The apply worker accesses
sequencesync_failure_time, in the code below, only when the sequence
sync worker is not running. I feel there is no need to use
LogicalRepWorkerLock in this case.

...
    syncworker = logicalrep_worker_find(MyLogicalRepWorker->subid,
                                        InvalidOid, WORKERTYPE_SEQUENCESYNC,
                                        true);
    if (syncworker)
    {
        /* Now safe to release the LWLock */
        LWLockRelease(LogicalRepWorkerLock);
        break;
    }

    /*
     * Count running sync workers for this subscription, while we have the
     * lock.
     */
    nsyncworkers = logicalrep_sync_worker_count(MyLogicalRepWorker->subid);

    /* Now safe to release the LWLock */
    LWLockRelease(LogicalRepWorkerLock);

    /*
     * If there are free sync worker slot(s), start a new sequence sync
     * worker, and break from the loop.
     */
    if (nsyncworkers < max_sync_workers_per_subscription)
    {
        TimestampTz now = GetCurrentTimestamp();

        if (!MyLogicalRepWorker->sequencesync_failure_time ||
            TimestampDifferenceExceeds(MyLogicalRepWorker->sequencesync_failure_time,
                                       now, wal_retrieve_retry_interval))
        {
            MyLogicalRepWorker->sequencesync_failure_time = 0;

            logicalrep_worker_launch(WORKERTYPE_SEQUENCESYNC,
                                     MyLogicalRepWorker->dbid,
                                     MySubscription->oid,
                                     MySubscription->name,
                                     MyLogicalRepWorker->userid,
                                     InvalidOid,
                                     DSM_HANDLE_INVALID);
            break;
        }
    }
...
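
For readers following along, the retry gate in the excerpt above can be sketched as a standalone function. This is only an illustration, not code from the patch: sketch_time_us and sketch_can_launch are hypothetical names, timestamps are assumed to be microseconds, and the interval is assumed to be in milliseconds with a "greater than or equal" comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a timestamp: microseconds since some epoch. */
typedef int64_t sketch_time_us;

/*
 * Decide whether a new sequence sync worker may be launched.
 *
 * failure_time == 0 means no failure is recorded, so launching is allowed
 * immediately. Otherwise launching is allowed only once at least
 * retry_interval_ms milliseconds have passed since the recorded failure.
 */
static bool
sketch_can_launch(sketch_time_us failure_time, sketch_time_us now,
                  int retry_interval_ms)
{
    if (failure_time == 0)
        return true;

    /* Convert the interval from milliseconds to microseconds. */
    return (now - failure_time) >= (sketch_time_us) retry_interval_ms * 1000;
}
```

With a 5000 ms interval and a failure recorded at t = 1 s, a call at t = 2 s is refused and a call at t = 6 s is allowed.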

> 6)
> process_syncing_sequences_for_apply():
>
> --I feel MyLogicalRepWorker->sequencesync_failure_time should be reset
> to 0 after we are sure that logicalrep_worker_launch() has launched
> the worker without any error. But not sure what could be the clean way
> to do it? If we move it after logicalrep_worker_launch() call, there
> are chances that seq-sync worker has started and failed already and
> has set this failure time which will then be mistakenly reset by apply
> worker. Also moving it inside logicalrep_worker_launch() does not seem
> a good way.

I felt we can keep it the existing way, so that it stays consistent
with how the table sync worker restart is handled in
process_syncing_tables_for_apply.
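
The ordering concern raised in comment 6 can be illustrated with a small single-threaded simulation. This is a sketch under simplifying assumptions, not the patch code: SketchWorker and all sketch_* functions are hypothetical, and the "launch" here models the worst case in which the sequence sync worker starts and fails before the apply worker continues.

```c
#include <stdint.h>

/* Hypothetical stand-in for the shared worker slot. */
typedef struct SketchWorker
{
    int64_t failure_time;   /* 0 means "no failure recorded" */
} SketchWorker;

/* Simulated launch: the sync worker fails at once and records the time. */
static void
sketch_launch_and_fail(SketchWorker *w, int64_t now)
{
    w->failure_time = now;
}

/* Existing order: reset first, then launch. The recorded failure survives. */
static int64_t
sketch_reset_then_launch(SketchWorker *w, int64_t now)
{
    w->failure_time = 0;
    sketch_launch_and_fail(w, now);
    return w->failure_time;
}

/* Alternative order: launch first, then reset. The failure time is lost. */
static int64_t
sketch_launch_then_reset(SketchWorker *w, int64_t now)
{
    sketch_launch_and_fail(w, now);
    w->failure_time = 0;
    return w->failure_time;
}
```

In this model, resetting before the launch keeps the failure time written by the worker, while resetting after the launch overwrites it with 0, so the retry interval would not be honored on the next attempt.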

The rest of the comments are fixed in the attached v20240809 version
patch.

[1] - https://www.postgresql.org/message-id/CAHut%2BPvaq%3D0xsDWdVQ-kdjRa8Az%2BvgiMFTvT2E2nR3N-47TO8A%40mail.gmail.com

Regards,
Vignesh

Attachment Content-Type Size
v20240809-0003-Enhance-sequence-synchronization-during-su.patch text/x-patch 91.4 KB
v20240809-0004-Documentation-for-sequence-synchronization.patch text/x-patch 24.8 KB
v20240809-0001-Introduce-pg_sequence_state-function-for-e.patch text/x-patch 11.4 KB
v20240809-0002-Introduce-ALL-SEQUENCES-support-for-Postgr.patch text/x-patch 90.4 KB
