Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, "Guo, Adam" <adamguo(at)amazon(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>
Subject: Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Date: 2025-02-18 21:23:20
Message-ID: CAD21AoBgJGOmH7aNLS1FdeOiCh3fqjjmaWkvU-pwwiXdWXCC6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 17, 2025 at 2:57 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
>
> On Fri, Jan 17, 2025 at 05:11:41PM -0800, Masahiko Sawada wrote:
> > Thank you for reviewing the patches. I agree with all the comments you
> > made. I've addressed them and I've attached new version patches that
> > now have some regression tests.

Thank you for reviewing the patches.

> > Subject: [PATCH v3 4/5] pg_upgrade: Add --set-char-signedness to set the
> > default char signedness of new cluster.
> >
> > This change adds a new option --set-char-signedness to pg_upgrade. It
> > enables user to set arbitrary signedness during pg_upgrade. This helps
> > cases where user who knew they copied the v17 source cluster from
> > x86 (signedness=true) to ARM (signedness=falese) can pg_upgrade
>
> s/falese/false/

Fixed.

>
> > --- a/doc/src/sgml/ref/pgupgrade.sgml
> > +++ b/doc/src/sgml/ref/pgupgrade.sgml
> > @@ -276,6 +276,54 @@ PostgreSQL documentation
> > </listitem>
> > </varlistentry>
> >
> > + <varlistentry>
> > + <term><option>--set-char-signedness=</option><replaceable>option</replaceable></term>
>
> If upgrading from v18+, valid use cases for this option are rare. I think
> using the option should raise an error for v18+ source. A user who knows what
> they're doing can use pg_resetwal manually, before the upgrade. This page
> shouldn't refer to the pg_resetwal flag specifically, because that need is so
> rare. What do you think?

Makes sense. I've added that check and updated the regression tests accordingly.

>
> > + <listitem>
> > + <para>
> > + Manually set the default char signedness of new clusters. Possible values
> > + are <literal>signed</literal> and <literal>unsigned</literal>.
> > + </para>
> > + <para>
> > + In the C language, the default signedness of the <type>char</type> type
> > + (when not explicitly specified) varies across platforms. For example,
> > + <type>char</type> defaults to <type>signed char</type> on x86 CPUs but
> > + to <type>unsigned char</type> on ARM CPUs. When data stored using the
> > + <type>char</type> type is persisted to disk, such as in GIN indexes,
> > + this platform-dependent behavior results in incorrect data comparisons
> > + in two scenarios:
> > + </para>
> > + <itemizedlist>
> > + <listitem>
> > + <para>
> > + When data is moved between platforms with different char signedness.
> > + </para>
> > + </listitem>
> > + <listitem>
> > + <para>
> > + When data is replicated using streaming replication across different architectures.
> > + </para>
> > + </listitem>
> > + </itemizedlist>
>
> After your patch series, core PostgreSQL persistent data doesn't depend on
> "char" signedness. Extensions should also start calling
> GetDefaultCharSignedness() to become independent of "char" signedness. Hence,
> let's remove the part starting at "When data stored using the char type" and
> ending here. Future users don't need to understand why pre-v18 had a problem.
> They mainly need to know how to set this flag.

Removed.

> If we wanted to say something about the breakage before v18, it might be, "If
> a pre-v18 cluster has trgm indexes and ran on different platforms at different
> times in the history of those indexes, REINDEX those indexes before or after
> running pg_upgrade." I'd put that in the release notes, not here in the
> pg_upgrade doc page.

Agreed.

>
> > + <para>
> > + Starting from <productname>PostgreSQL</productname> 18, database clusters
> > + maintain their own default char signedness setting, which can be used as
> > + a hint to ensure consistent behavior across platforms with different
>
> Setting this wrong makes your trgm indexes return wrong answers, so it's not a
> "hint". (I think "hint" implies minor consequences for getting it wrong.)

Fixed.

> > + default char signedness. By default, <application>pg_upgrade</application>
> > + preserves the char signedness setting when upgrading from an existing cluster.
> > + However, when upgrading from <productname>PostgreSQL</productname> 17 or
> > + earlier, <application>pg_upgrade</application> adopts the char signedness
> > + of the platform on which it was built.
> > + </para>
> > + <para>
> > + This option allows you to explicitly set the default char signedness for
> > + the new cluster, overriding any inherited values. This is particularly useful
> > + when you plan to migrate the upgraded cluster to a platform with different
> > + char signedness (for example, when moving from an x86-based system to an
> > + ARM-based system).
>
> Let's refine the terminology around "plan to migrate". Here's an informal
> description (not documentation-quality):
>
> 1. If you "plan to migrate" but have not yet done so, don't use this flag.
> The default behavior is right. Upgrade on the old platform without using
> this flag, and then migrate. This is the most-foolproof approach.
>
> 2. If you already migrated the cluster data files to a different platform just
> before running pg_upgrade, use this flag. It's essential not to modify any
> trgm indexes between migrating the data files and running pg_upgrade. Best
> to make pg_upgrade the first action that starts the cluster on the new
> platform, not starting the cluster manually. This is trickier than (1).

Updated this part.

>
> > --- a/src/bin/pg_upgrade/pg_upgrade.c
> > +++ b/src/bin/pg_upgrade/pg_upgrade.c
> > @@ -399,8 +399,14 @@ set_new_cluster_char_signedness(void)
> > {
> > bool new_char_signedness;
> >
> > - /* Inherit the source database's signedness */
> > - new_char_signedness = old_cluster.controldata.default_char_signedness;
> > + /*
> > + * Use the specified char signedness if specifies. Otherwise we inherit
>
> s/if specifies/if specified/

Fixed.

>
> > --- a/contrib/pg_trgm/trgm.h
> > +++ b/contrib/pg_trgm/trgm.h
> > @@ -41,14 +41,12 @@
> > typedef char trgm[3];
> >
> > #define CMPCHAR(a,b) ( ((a)==(b)) ? 0 : ( ((a)<(b)) ? -1 : 1 ) )
> > -#define CMPPCHAR(a,b,i) CMPCHAR( *(((const char*)(a))+i), *(((const char*)(b))+i) )
> > -#define CMPTRGM(a,b) ( CMPPCHAR(a,b,0) ? CMPPCHAR(a,b,0) : ( CMPPCHAR(a,b,1) ? CMPPCHAR(a,b,1) : CMPPCHAR(a,b,2) ) )
>
> I recommend moving CMPCHAR into the C file along with the two related macros
> that you're moving.

Moved.

>
> > --- a/contrib/pg_trgm/trgm_op.c
> > +++ b/contrib/pg_trgm/trgm_op.c
> > @@ -42,6 +42,9 @@ PG_FUNCTION_INFO_V1(strict_word_similarity_commutator_op);
> > PG_FUNCTION_INFO_V1(strict_word_similarity_dist_op);
> > PG_FUNCTION_INFO_V1(strict_word_similarity_dist_commutator_op);
> >
> > +static inline int CMPTRGM_CHOOSE(const void *a, const void *b);
>
> All instances of the "inline" keyword in this patch series are on functions
> called only via function pointers. Hence, those functions are never inlined.
> I'd remove the keyword.

Removed.

>
> > --- a/src/bin/pg_upgrade/pg_upgrade.h
> > +++ b/src/bin/pg_upgrade/pg_upgrade.h
> > @@ -125,6 +125,12 @@ extern char *output_files[];
> > */
> > #define JSONB_FORMAT_CHANGE_CAT_VER 201409291
> >
> > +/*
> > + * The control file was changed to have the default char signedness,
> > + * commit XXXXX.
> > + */
> > +#define DEFAULT_CHAR_SIGNEDNESS_VAT_VER 202501161
>
> s/VAT/CAT/

Fixed.

>
> > +command_like(
> > + [ 'pg_controldata', $old->data_dir ],
> > + qr/Default char data signedness:\s+signed/,
> > + 'default char signedness of old cluster is signed in control file');
> > +command_like(
> > + [ 'pg_controldata', $new->data_dir ],
> > + qr/Default char data signedness:\s+signed/,
> > + 'default char signedness is new cluster signed in control file');
>
> s/is new cluster signed/of new cluster is signed/ for consistency with the
> previous test name.

Fixed.

>
> > +
> > +# Set the old cluster's default char signedness to unsigned for test.
> > +command_ok(
> > + [
> > + 'pg_resetwal', '--char-signedness', 'unsigned',
> > + '-f', $old->data_dir
> > + ],
> > + "set old cluster's default char signeness to unsigned");
>
> s/signeness/signedness/

Fixed.

>
> > Subject: [PATCH v3 2/5] pg_resetwal: Add --char-signedness option to change
> > the default char signedness.
> >
> > With the newly added option --char-signedness, pg_resetwal updates the
> > default char signeness flag in the controlfile.
>
> s/signeness/signedness/

Fixed.

>
> > --- a/doc/src/sgml/ref/pg_resetwal.sgml
> > +++ b/doc/src/sgml/ref/pg_resetwal.sgml
> > @@ -171,6 +171,20 @@ PostgreSQL documentation
> > </para>
> >
> > <variablelist>
> > + <varlistentry>
> > + <term><option>--char-signedness=<replaceable class="parameter">option</replaceable></option></term>
> > + <listitem>
> > + <para>
> > + Manually set the default char signedness. Possible values are
> > + <literal>signed</literal> and <literal>unsigned</literal>.
> > + </para>
> > + <para>
> > + This option can also be used to change the default char signedness
> > + of an existing database cluster.
> > + </para>
> > + </listitem>
> > + </varlistentry>
>
> Each other list entry discusses how to choose a safe value, so this should say
> something like that. For users in doubt, nothing other than pg_upgrade should
> use this flag. Theoretically, if you just ran pg_upgrade with the wrong value
> and had not yet modified any trgm indexes, you could run this to fix the
> problem. That's too rare of a situation and too dangerous to document, so the
> documentation for this flag should discourage using the flag.

Updated the documentation.

>
> > --- a/src/bin/pg_resetwal/pg_resetwal.c
> > +++ b/src/bin/pg_resetwal/pg_resetwal.c
> > @@ -105,6 +106,7 @@ main(int argc, char *argv[])
> > {"multixact-offset", required_argument, NULL, 'O'},
> > {"oldest-transaction-id", required_argument, NULL, 'u'},
> > {"next-transaction-id", required_argument, NULL, 'x'},
> > + {"char-signedness", required_argument, NULL, 2},
> > {"wal-segsize", required_argument, NULL, 1},
>
> Put the ", 2}" entry after the ", 1}" entry. (We alphabetize in usage(), but
> we append to the end in these C arrays.)

Fixed.

>
> > @@ -1189,6 +1213,7 @@ usage(void)
> > printf(_(" -u, --oldest-transaction-id=XID set oldest transaction ID\n"));
> > printf(_(" -x, --next-transaction-id=XID set next transaction ID\n"));
> > printf(_(" --wal-segsize=SIZE size of WAL segments, in megabytes\n"));
> > + printf(_(" --char-signedness=OPTION set char signedness to \"signed\" or \"unsigned\"\n"));
>
> This --help output alphabetizes by short option, with the one non-short option
> at the end. So I'd also alphabetize among the non-short options, specifically
> putting the new line before --wal-segsize.

Fixed.

I've attached the updated patches.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v4-0003-pg_upgrade-Preserve-default-char-signedness-value.patch application/octet-stream 8.6 KB
v4-0004-pg_upgrade-Add-set-char-signedness-to-set-the-def.patch application/octet-stream 9.1 KB
v4-0002-pg_resetwal-Add-char-signedness-option-to-change-.patch application/octet-stream 5.2 KB
v4-0005-Fix-an-issue-with-index-scan-using-pg_trgm-due-to.patch application/octet-stream 4.3 KB
v4-0001-Add-default_char_signedness-field-to-ControlFileD.patch application/octet-stream 8.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2025-02-18 22:02:25 Re: Optimization for lower(), upper(), casefold() functions.
Previous Message Bruce Momjian 2025-02-18 21:15:11 Re: Commitfest app release on Feb 17 with many improvements