Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Oh, Mike" <minsoo(at)amazon(dot)com>
Subject: Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns
Date: 2022-05-23 05:39:41
Message-ID: CAA4eK1JoKV2qmp916gFk=9SX=Qo21+sN4n-yjbB2b0Q1xxOKJw@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 23, 2022 at 10:03 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Sat, 21 May 2022 15:35:58 +0530, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote in
> > I think if we don't have any better ideas then we should go with
> > either this or one of the other proposals in this thread. The other
> > idea that occurred to me is whether we can somehow update the snapshot
> > we have serialized on disk about this information. On each
> > running_xact record when we serialize the snapshot, we also try to
> > purge the committed xacts (via SnapBuildPurgeCommittedTxn). So, during
> > that we can check if there are committed xacts to be purged and if we
> > have previously serialized the snapshot for the prior running xact
> > record, if so, we can update it with the list of xacts that have
> > catalog changes. If this is feasible then I think we need to somehow
> > remember the point where we last serialized the snapshot (maybe by
> > using builder->last_serialized_snapshot). Even, if this is feasible we
> > may not be able to do this in back-branches because of the disk-format
> > change required for this.
> >
> > Thoughts?
>
> I didn't look it closer, but it seems to work. I'm not sure how much
> spurious invalidations at replication start impacts on performance,
> but it is promising if the impact is significant.
>

It seems Sawada-San's patch does this at each commit, not at the start
of replication, and I think that is required because we need this
information each time replication restarts. So, I feel this will be an
ongoing overhead for the spurious cases with the current approach.

> That being said I'm
> a bit negative for doing that in post-beta1 stage.
>

Fair point. We can do it early in PG-16 if the approach is feasible,
and backpatch something along the lines of what Sawada-San or you
proposed.
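
To make the idea quoted above a bit more concrete, here is a rough toy
model of the control flow, written in Python purely for illustration.
It is not PostgreSQL code and not a proposed patch. ToyBuilder,
commit() and process_running_xacts() are hypothetical names, the
on-disk snapshot is just a dict, and last_serialized_lsn only stands
in for builder->last_serialized_snapshot. It also assumes one possible
reading of "the list of xacts that have catalog changes", namely the
catalog-modifying xacts among those being purged.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ToyBuilder:
    """Toy stand-in for the snapshot builder (hypothetical, not SnapBuild)."""
    committed: Set[int] = field(default_factory=set)      # committed xids still tracked
    catalog_xids: Set[int] = field(default_factory=set)   # subset that changed catalogs
    last_serialized_lsn: Optional[int] = None              # where we last serialized
    on_disk: Dict[int, dict] = field(default_factory=dict) # lsn -> "serialized" snapshot


def commit(builder: ToyBuilder, xid: int, modified_catalog: bool) -> None:
    """Record a committed xact and whether it modified the catalog."""
    builder.committed.add(xid)
    if modified_catalog:
        builder.catalog_xids.add(xid)


def process_running_xacts(builder: ToyBuilder, lsn: int,
                          oldest_running_xid: int) -> None:
    """Model one running_xacts record: serialize, then purge."""
    # Remember the snapshot serialized for the prior running_xacts record.
    prev_lsn = builder.last_serialized_lsn

    # Serialize a snapshot for this record.
    builder.on_disk[lsn] = {
        "committed": sorted(builder.committed),
        "catalog_xids": [],
    }
    builder.last_serialized_lsn = lsn

    # Purge committed xacts older than the oldest running xid.
    to_purge = {x for x in builder.committed if x < oldest_running_xid}

    # Assumption: before forgetting them, write the catalog-modifying
    # ones back into the previously serialized snapshot.
    if to_purge and prev_lsn is not None:
        builder.on_disk[prev_lsn]["catalog_xids"] = sorted(
            to_purge & builder.catalog_xids)

    builder.committed -= to_purge
    builder.catalog_xids -= to_purge
```

In this model the rewrite of the older snapshot is where the
disk-format concern shows up: the serialized snapshot gains a
catalog_xids field that it did not have before.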

--
With Regards,
Amit Kapila.
