Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, "Guo, Adam" <adamguo(at)amazon(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>
Subject: Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Date: 2024-10-03 13:55:47
Message-ID: CAD21AoDmO_R_vDL-UfJXywcHNY6YrgeMPsnAj=8Pt6xrFjfTWg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 2, 2024 at 10:02 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Oct 1, 2024 at 8:57 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > Noah Misch <noah(at)leadboat(dot)com> writes:
> > > On Tue, Oct 01, 2024 at 11:55:48AM -0700, Masahiko Sawada wrote:
> > >> Considering that the population of database cluster signedness will
> > >> converge to signedness=true in the future, we can consider using
> > >> -fsigned-char to prevent similar problems for the future. We need to
> > >> think about possible side-effects as well, though.
> >
> > > It's good to think about -fsigned-char. While I find it tempting, several
> > > things would need to hold for us to benefit from it:
> >
> > > - Every supported compiler has to offer it or an equivalent.
> > > - The non-compiler parts of every supported C implementation need to
> > > cooperate. For example, CHAR_MIN must change in response to the flag. See
> > > the first comment in cash_in().
> > > - Libraries we depend on can't do anything incompatible with it.
> >
> > > Given that, I would lean toward not using -fsigned-char. It's unlikely all
> > > three things will hold. Even if they do, the benefit is not large.
> >
> > I am very, very strongly against deciding that Postgres will only
> > support one setting of char signedness. It's a step on the way to
> > hardware monoculture, and we know where that eventually leads.
> > (In other words, I categorically reject Sawada-san's assertion
> > that signed chars will become universal. I'd reject the opposite
> > assertion as well.)
>
> Thank you for pointing this out. I agree with both of you.
>

I've attached PoC patches for the idea Noah proposed. Newly created
clusters unconditionally have default_char_signedness=true, and the
only source of signedness=false is pg_upgrade. To update the
signedness in the controlfile, pg_resetwal now has a new option
--char-signedness, which is used by pg_upgrade internally. Feedback is
very welcome.
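
To illustrate why the signedness has to be recorded per cluster, here is a
minimal sketch of a trigram comparison whose result depends on whether the
bytes are treated as signed or unsigned. It is not the code from the attached
patches: cmp_byte() and cmp_trgm() are hypothetical names, and the
char_is_signed parameter only stands in for a value that would be read from
the controlfile.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: compare two bytes the way a platform with signed
 * (or unsigned) plain char would.  Bytes >= 0x80 are mapped to negative
 * values by hand so the result does not depend on the compiler in use.
 */
static int
cmp_byte(unsigned char a, unsigned char b, bool char_is_signed)
{
    int ia = a;
    int ib = b;

    if (char_is_signed)
    {
        /* emulate the two's-complement view of a signed char */
        if (ia >= 128)
            ia -= 256;
        if (ib >= 128)
            ib -= 256;
    }

    return (ia > ib) - (ia < ib);
}

/*
 * Hypothetical 3-byte trigram comparison.  char_is_signed stands in for
 * the cluster-wide setting; returns <0, 0 or >0 like memcmp().
 */
static int
cmp_trgm(const unsigned char *a, const unsigned char *b, bool char_is_signed)
{
    for (int i = 0; i < 3; i++)
    {
        int c = cmp_byte(a[i], b[i], char_is_signed);

        if (c != 0)
            return c;
    }

    return 0;
}
```

With the trigrams {0x20, 0x20, 0x61} and {0x20, 0x20, 0xE3}, the first sorts
after the second when char_is_signed is true and before it when it is false.
That disagreement is what makes an index built on one architecture
inconsistent when it is read on the other.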

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment                                                       Content-Type              Size
v2-0005-Fix-an-issue-with-index-scan-using-pg_trgm-due-to.patch  application/octet-stream  4.9 KB
v2-0004-pg_upgrade-Add-set-char-signedness-to-set-the-def.patch  application/octet-stream  5.1 KB
v2-0001-Add-default_char_signedness-field-to-ControlFileD.patch  application/octet-stream  8.7 KB
v2-0003-pg_upgrade-Inherit-default-char-signedness-from-o.patch  application/octet-stream  5.5 KB
v2-0002-pg_resetwal-Add-char-signedness-option-to-change-.patch  application/octet-stream  4.7 KB
