Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: [bug fix] prepared transaction might be lost when max_prepared_transactions is zero on the subscriber
Date: 2024-08-14 04:55:56
Message-ID: CAJpy0uAa3M9sC3GWTUA+YPTnYxEcorFt_7g6rXz05jNcAGdjgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 13, 2024 at 9:48 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Aug 12, 2024 at 3:37 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Fri, Aug 9, 2024 at 2:39 PM Hayato Kuroda (Fujitsu)
> > <kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
> > >
> > >
> > > > + /*
> > > > + * Register a callback to reset the origin state before aborting the
> > > > + * transaction in ShutdownPostgres(). This is to prevent the advancement
> > > > + * of origin progress if the transaction failed to apply.
> > > > + */
> > > > + before_shmem_exit(replorigin_reset, (Datum) 0);
> > > >
> > > > I think we need this despite resetting the origin-related variables in
> > > > PG_CATCH block to handle FATAL error cases, right? If so, can we use
> > > > PG_ENSURE_ERROR_CLEANUP() instead of PG_CATCH()?
> > >
> > > There are two reasons to add a shmem-exit callback. One is to support a FATAL,
> > > another one is to support the case that user does the shutdown request while
> > > applying changes. In this case, I think ShutdownPostgres() can be called so that
> > > the session origin may advance.
> >
> > Agree that we need the 'reset' during shutdown flow as well. Details at [1]
> >
>
> Thanks for the detailed analysis. I agree with your analysis that we
> need to reset the origin information for the shutdown path to avoid it
> being advanced incorrectly. However, the patch doesn't have sufficient
> comments to explain why we need to reset it for both the ERROR and
> Shutdown paths. Can we improve the comments in the patch?
>
> Also, for the ERROR path, can we reset the origin information in
> apply_error_callback()?

Please find v4 attached. Addressed comments in that.

Manual testing done on v4:
1) Error and Fatal case
2) Shutdown after replorigin_session_origin_lsn was set in
apply_handle_prepare() and before EndPrepare was called.
2a) with log_min_messages=debug5. This will result in processing
of shutdown signal by errfinish() before PREPARE is over.
2b) with default log_min_messages. This will result in processing
of shutdown signal by LogicalRepApplyLoop() after ongoing PREPARE is
over.

>
> BTW, this needs to be backpatched till 16 when it has been introduced
> by the parallel apply feature as part of commit 216a7848. So, can we
> test this patch in back-branches as well?
>

Sure, will do next.

thanks
Shveta

Attachment Content-Type Size
v4-0001-Prevent-origin-progress-advancement-if-failed-to-.patch application/octet-stream 6.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2024-08-14 05:36:57 Re: Use pgBufferUsage for block reporting in analyze
Previous Message Ashutosh Bapat 2024-08-14 04:07:32 Re: PG_TEST_EXTRA and meson