From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
Subject: | Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber |
Date: | 2024-08-08 10:01:19 |
Message-ID: | CAJpy0uD64yWnGsafYQU=VvuL8kVcen2iMRxf6ReRZcTtqDZ16A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Aug 8, 2024 at 12:03 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Aug 8, 2024 at 10:37 AM Hayato Kuroda (Fujitsu)
> <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> >
> ...
> >
> > An easiest fix is to reset session replication origin before calling the
> > RecordTransactionAbort(). I think this can happen when 1) LogicalRepApplyLoop()
> > raises an ERROR or 2) apply worker exits. Attached patch can fix the issue on HEAD.
> >
>
> Few comments:
> =============
> *
> @@ -4409,6 +4409,14 @@ start_apply(XLogRecPtr origin_startpos)
> }
> PG_CATCH();
> {
> + /*
> + * Reset the origin data to prevent the advancement of origin progress
> + * if the transaction failed to apply.
> + */
> + replorigin_session_origin = InvalidRepOriginId;
> + replorigin_session_origin_lsn = InvalidXLogRecPtr;
> + replorigin_session_origin_timestamp = 0;
>
> Can't we call replorigin_reset() instead here?
>
> *
> + /*
> + * Register a callback to reset the origin state before aborting the
> + * transaction in ShutdownPostgres(). This is to prevent the advancement
> + * of origin progress if the transaction failed to apply.
> + */
> + before_shmem_exit(replorigin_reset, (Datum) 0);
>
> I think we need this despite resetting the origin-related variables in
> PG_CATCH block to handle FATAL error cases, right? If so, can we use
> PG_ENSURE_ERROR_CLEANUP() instead of PG_CATCH()?
+1
Basic tests work fine on this patch. Just thinking out loud,
SetupApplyOrSyncWorker() is called for table-sync worker as well and
IIUC tablesync worker does not deal with 2PC txns. So do we even need
to register replorigin_reset() for tablesync worker as well? If we may
hit such an issue in general, then perhaps we need it in table-sync
worker otherwise not. It needs some investigation. Thoughts?
thanks
Shveta
From | Date | Subject | |
---|---|---|---|
Next Message | Kirill Reshke | 2024-08-08 10:03:08 | Re: Incremental View Maintenance, take 2 |
Previous Message | Kirill Reshke | 2024-08-08 09:37:54 | Re: Incremental View Maintenance, take 2 |