Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, "Guo, Adam" <adamguo(at)amazon(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>
Subject: Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Date: 2024-09-25 22:46:27
Message-ID: CAD21AoCU7W-COMo+dtkx8Md60V1sY8u5NRqMX4opc_sGrY3q0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 18, 2024 at 6:46 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
>
> On Tue, Sep 17, 2024 at 09:43:41PM -0700, Masahiko Sawada wrote:
> > On Mon, Sep 16, 2024 at 9:24 AM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > > On Thu, Sep 12, 2024 at 03:42:48PM -0700, Masahiko Sawada wrote:
> > > > On Tue, Sep 10, 2024 at 3:05 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > > > > On Tue, Sep 10, 2024 at 05:56:47PM -0400, Tom Lane wrote:
> > > > > > Got it. So now I'm wondering if we need all the complexity of storing
> > > > > > stuff in the GIN metapages. Could we simply read the (primary's)
> > > > > > signedness out of pg_control and use that?
> > >
> > > > I've attached a PoC patch for this idea. We write the default char
> > > > signedness to the control file at initdb time. Then when comparing two
> > > > trgms, pg_trgm opclasses use a comparison function based on the char
> > > > signedness of the cluster. I've confirmed that the patch fixes the
> > > > reported case at least.
> > >
> > > I agree that proves the concept.
> >
> > Thanks. I like the simplicity of this approach. If we agree with this
> > approach, I'd like to proceed with it.
>
> Works for me.
>
> > Regardless of what approach we take, I wanted to provide some
> > regression tests for these changes, but I could not come up with a
> > reasonable idea. It would be great if we could do something like
> > 027_stream_regress.pl on cross-architecture replication. But just
> > executing 027_stream_regress.pl on cross-architecture replication
> > could not be sufficient since we would like to confirm query results
> > as well. If we could have a reasonable tool or way, it would also help
> > find other similar bugs related architectural differences.
>
> Perhaps add a regress.c function that changes the control file flag and
> flushes the change to disk?

I've tried this idea but found out that extensions can flush the
controlfile on the disk after flipping the char signedness value, they
cannot update the controlfile data on the shared memory. And, when the
server shuts down, the on-disk controlfile is updated with the
on-memory controlfile data, so the char signedness is reverted.

We can add a function to set the char signedness of on-memory
controlfile data or add a new option to pg_resetwal to change the char
signedness on-disk controlfile data but they seem to be overkill to me
and I'm concerned about misusing these option and function.

I've updated and splitted patches. They don't have regression tests yet.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v1-0002-Fix-an-issue-with-index-scan-using-pg_trgm-due-to.patch application/octet-stream 4.9 KB
v1-0001-Add-default_char_signedness-to-ControlFileData.patch application/octet-stream 7.2 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2024-09-26 00:04:18 Re: Add contrib/pg_logicalsnapinspect
Previous Message Florents Tselai 2024-09-25 22:28:16 Re: Docs pg_restore: Shouldn't there be a note about -n ?