From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, "Guo, Adam" <adamguo(at)amazon(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com> |
Subject: | Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation |
Date: | 2024-09-06 22:36:53 |
Message-ID: | 2385726.1725662213@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Noah Misch <noah(at)leadboat(dot)com> writes:
> Yes, that's one way to make it work. If we do it that way, it would be
> critical to make the ALTER EXTENSION UPDATE run before anything uses the
> index. Otherwise, we'll run new v18 "signed char" code on a v17 "unsigned
> char" data file. Running ALTER EXTENSION UPDATE early enough should be
> feasible, so that's fine. Some other options:
> - If v18 "pg_dump -b" decides to emit CREATE INDEX ... gin_trgm_ops_unsigned,
> then make it also emit the statements to create the opclass.
> - Ship pg_trgm--1.6--1.7.sql in back branches. If the upgrade wants to use
> gin_trgm_ops_unsigned, require the user to ALTER EXTENSION UPDATE first.
> (In back branches, the C code behind gin_trgm_ops_unsigned could just raise
> an error if called.)
I feel like all of these are leaning heavily on users to get it right,
and therefore have a significant chance of breaking use-cases that
were perfectly fine before.
Remind me of why we can't do something like this:
* Add APIs that allow opclasses to read/write some space in the GIN
metapage. (We could get ambitious and add such APIs for other AMs
too, but doing it just for GIN is probably a prudent start.) Ensure
that this space is initially zeroed.
* In gin_trgm_ops, read a byte of this space and interpret it as
0 = unset
1 = signed chars
2 = unsigned chars
If the value is zero, set the byte on the basis of the native
char-signedness of the current platform. (Obviously, a
secondary server couldn't write the byte, and would have to
wait for the primary to update the value. In the meantime,
it's no more broken than today.)
* Subsequently, use either signed or unsigned comparisons
based on that value.
This would automatically do the right thing in all cases that
work today, and it'd provide the ability for cross-platform
replication to work in future. You can envision cases where
transferring a pre-existing index to a platform of the other
stripe would misbehave, but they're the same cases that fail
today, and the fix remains the same: reindex.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Jelte Fennema-Nio | 2024-09-06 22:40:12 | Re: PostgreSQL 17 release announcement draft |
Previous Message | Jelte Fennema-Nio | 2024-09-06 22:30:31 | Re: Detailed release notes |