Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, "Guo, Adam" <adamguo(at)amazon(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>
Subject: Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Date: 2024-09-06 22:36:53
Message-ID: 2385726.1725662213@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> Yes, that's one way to make it work. If we do it that way, it would be
> critical to make the ALTER EXTENSION UPDATE run before anything uses the
> index. Otherwise, we'll run new v18 "signed char" code on a v17 "unsigned
> char" data file. Running ALTER EXTENSION UPDATE early enough should be
> feasible, so that's fine. Some other options:

> - If v18 "pg_dump -b" decides to emit CREATE INDEX ... gin_trgm_ops_unsigned,
> then make it also emit the statements to create the opclass.

> - Ship pg_trgm--1.6--1.7.sql in back branches. If the upgrade wants to use
> gin_trgm_ops_unsigned, require the user to ALTER EXTENSION UPDATE first.
> (In back branches, the C code behind gin_trgm_ops_unsigned could just raise
> an error if called.)

I feel like all of these are leaning heavily on users to get it right,
and therefore have a significant chance of breaking use-cases that
were perfectly fine before.

Remind me of why we can't do something like this:

* Add APIs that allow opclasses to read/write some space in the GIN
metapage. (We could get ambitious and add such APIs for other AMs
too, but doing it just for GIN is probably a prudent start.) Ensure
that this space is initially zeroed.

* In gin_trgm_ops, read a byte of this space and interpret it as
0 = unset
1 = signed chars
2 = unsigned chars
If the value is zero, set the byte on the basis of the native
char-signedness of the current platform. (Obviously, a
secondary server couldn't write the byte, and would have to
wait for the primary to update the value. In the meantime,
it's no more broken than today.)

* Subsequently, use either signed or unsigned comparisons
based on that value.

This would automatically do the right thing in all cases that
work today, and it'd provide the ability for cross-platform
replication to work in future. You can envision cases where
transferring a pre-existing index to a platform of the other
stripe would misbehave, but they're the same cases that fail
today, and the fix remains the same: reindex.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema-Nio 2024-09-06 22:40:12 Re: PostgreSQL 17 release announcement draft
Previous Message Jelte Fennema-Nio 2024-09-06 22:30:31 Re: Detailed release notes