Re: "unexpected duplicate for tablespace" problem in logical replication

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: wangsh(dot)fnst(at)fujitsu(dot)com, osumi(dot)takamichi(at)fujitsu(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: "unexpected duplicate for tablespace" problem in logical replication
Date: 2022-04-19 06:23:37
Message-ID: CAD21AoBWj1=gab9xn+2q9PX9go4ovoaN9uFb2oAt7HTFe0PhJQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Tue, Apr 12, 2022 at 11:45 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> At Mon, 11 Apr 2022 10:24:26 +0000, "wangsh(dot)fnst(at)fujitsu(dot)com" <wangsh(dot)fnst(at)fujitsu(dot)com> wrote in
> > Please see the attachment
>
> Sorry for being late, but this seems to be in the wrong direction.
>
> In the problem case, as you know, GetNewRelFileNode does *not* check
> oid uniqueness in pg_class. The on-catalog duplicate check is done
> only when the result is to be used as the *table oid* at creation. In
> other cases it chooses a relfilenode that does not duplicate in
> storage file names irrelevantly to the relation oid. As the result
> Non-temporary and temporary tables have separate relfilenode oid
> spaces, either intentional or not..
>
> Thus, I think we don't want GetNewRelFileNode to check for duplicate
> oid on pg_class if pg_class is not passed since that significantly
> narrow the oid space for relfilenode. If we did something like that,
> we might do check the existence of file names of the both non-temp and
> temp in GetNewRelFileNode(), but that also narrows the relfilenode oid
> space.

Agreed.

>
> RelidByRelfilenode is called by autoprewarm, logical replication, and
> pg_filenode_relation. In the autoprewarm and logical replication
> cases, it is called only for non-temprary relations so letting the
> function ignore (oid duplicating) temp tables works.
> pg_relfilienode_relation is somewhat dubious. It is affected by this
> bug. In the attached PoC, to avoid API change (in case for
> back-patching), RelidByRelfilenode ignores duplcates of differernt
> persistence. Also the PoC prioritize on persistent tables to
> temporary ones but I'm not sure it's good decision, but otherwise we
> need to change the signature of pg_filenode_relation.
>
> The attached is an PoC of that. The first one is a reproduction-aid
> code. With it applied, the following steps create duplicate
> relfilenode relations.
>
> What do you think about this direction?

Sounds good to me. If we go in this direction, probably we also need
to change the cache checks in RelidByRelfilenode() so that we
prioritize non-temporary relations. Otherwise, we will end up
returning the OID of temporary relation if it's already cached.

Another idea I came up with is that we have RelidByRelfilenode() check
only non-temporary relations by adding relpersistence !=
RELPERSISTENCE_TEMP to the scan key. If we want to support temporary
relations too, I think that, in v16, we can add relpersistence to
RelidByRelfilenode() as a function argument (and a flag to
pg_filenode_relation() accordingly) so that the user can get the name
of the temporary relation by like pg_filenode_relation(0, 16386,
true). On the other hand, in back branches, if an application is using
pg_filenode_relation() to get the name of temporary relations, it
would break it.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Takahiro Kitayama 2022-04-20 02:56:50 Re: BUG #16866: pg_basebackup Windows Server 20160
Previous Message PG Bug reporting form 2022-04-19 03:36:22 BUG #17466: Is it possible to supplement the default compilation options of CFLAGS in configure file?