Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, "Guo, Adam" <adamguo(at)amazon(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>
Subject: Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Date: 2024-09-06 23:25:39
Message-ID: 20240906232539.47.nmisch@google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 06, 2024 at 06:36:53PM -0400, Tom Lane wrote:
> Noah Misch <noah(at)leadboat(dot)com> writes:
> > Yes, that's one way to make it work. If we do it that way, it would be
> > critical to make the ALTER EXTENSION UPDATE run before anything uses the
> > index. Otherwise, we'll run new v18 "signed char" code on a v17 "unsigned
> > char" data file. Running ALTER EXTENSION UPDATE early enough should be
> > feasible, so that's fine. Some other options:
>
> > - If v18 "pg_dump -b" decides to emit CREATE INDEX ... gin_trgm_ops_unsigned,
> > then make it also emit the statements to create the opclass.
>
> > - Ship pg_trgm--1.6--1.7.sql in back branches. If the upgrade wants to use
> > gin_trgm_ops_unsigned, require the user to ALTER EXTENSION UPDATE first.
> > (In back branches, the C code behind gin_trgm_ops_unsigned could just raise
> > an error if called.)
>
> I feel like all of these are leaning heavily on users to get it right,

What do you have in mind? I see things for the pg_upgrade implementation to
get right, but I don't see things for pg_upgrade users to get right.

The last option is disruptive for users of unsigned char platforms, since it
requires a two-step upgrade if upgrading from v11 or earlier. The other two
don't have that problem.

> and therefore have a significant chance of breaking use-cases that
> were perfectly fine before.
>
> Remind me of why we can't do something like this:
>
> * Add APIs that allow opclasses to read/write some space in the GIN
> metapage. (We could get ambitious and add such APIs for other AMs
> too, but doing it just for GIN is probably a prudent start.) Ensure
> that this space is initially zeroed.
>
> * In gin_trgm_ops, read a byte of this space and interpret it as
> 0 = unset
> 1 = signed chars
> 2 = unsigned chars
> If the value is zero, set the byte on the basis of the native
> char-signedness of the current platform. (Obviously, a
> secondary server couldn't write the byte, and would have to
> wait for the primary to update the value. In the meantime,
> it's no more broken than today.)
>
> * Subsequently, use either signed or unsigned comparisons
> based on that value.
>
> This would automatically do the right thing in all cases that
> work today, and it'd provide the ability for cross-platform
> replication to work in future. You can envision cases where
> transferring a pre-existing index to a platform of the other
> stripe would misbehave, but they're the same cases that fail
> today, and the fix remains the same: reindex.

No reason you couldn't do it that way. Works for me.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-09-06 23:37:09 Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Previous Message Jim Jones 2024-09-06 22:46:06 Re: [BUG?] XMLSERIALIZE( ... INDENT) won't work with blank nodes