Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, "Guo, Adam" <adamguo(at)amazon(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>
Subject: Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
Date: 2024-09-10 18:31:46
Message-ID: CAD21AoDGs0F8hu9sk0JHMk32FSJ1rzy9HUfivYhd4O0r=6QGHg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 9, 2024 at 11:25 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> writes:
> > On Mon, Sep 9, 2024 at 4:42 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Do you have an idea for how we'd get
> >> this to happen during pg_upgrade, exactly?
>
> > What I was thinking is that we have "pg_dump --binary-upgrade" emit a
> > function call, say "SELECT binary_upgrade_update_gin_meta_page()" for
> > each GIN index, and the meta pages are updated when restoring the
> > schemas.
>
> Hmm, but ...
>
> 1. IIRC we don't move the relation files into the new cluster until
> after we've run the schema dump/restore step. I think this'd have to
> be driven in some other way than from the pg_dump output. I guess we
> could have pg_upgrade start up the new postmaster and call a function
> in each DB, which would have to scan for GIN indexes by itself.

You're right.

>
> 2. How will this interact with --link mode? I don't see how it
> doesn't involve scribbling on files that are shared with the old
> cluster, leading to possible problems if the pg_upgrade fails later
> and the user tries to go back to using the old cluster. It's not so
> much the metapage update that is worrisome --- we're assuming that
> that will modify storage that's unused in old versions. But the
> change would be unrecorded in the old cluster's WAL, which sounds
> risky.
>
> Maybe we could get away with forcing --copy mode for affected
> indexes, but that feels a bit messy. We'd not want to do it
> for unaffected indexes, so the copying code would need to know
> a great deal about this problem.

Good point. I agree that it would not be a very good idea to force --copy mode.

An alternative would be to store the char signedness in the control
file and have the gin_trgm_ops opclass read it when the bytes in the
meta page show 'unset'. The char signedness in the control file would
not be used for the compatibility check for physical replication; it
would serve only as a hint. That could also be a bit messy, though.
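
To make the fallback concrete, here is a minimal sketch in C. It is not
PostgreSQL code: the enum, resolve_char_is_signed(), and compare_trgm()
are hypothetical names invented for illustration, and the real meta page
and control file layouts are not modeled. It only shows the idea of
preferring a per-index value and falling back to a cluster-wide hint,
plus why the choice matters for the ordering of trigrams that contain
bytes above 0x7F.

```c
#include <stdbool.h>

/* Hypothetical tri-state assumed to be recorded in each GIN meta page. */
typedef enum
{
    CHAR_SIGNEDNESS_UNSET = 0,  /* index created before the field existed */
    CHAR_SIGNEDNESS_SIGNED,
    CHAR_SIGNEDNESS_UNSIGNED
} CharSignedness;

/*
 * Hypothetical helper: use the per-index value when it is set, otherwise
 * fall back to the hint assumed to be stored in the control file.
 */
static bool
resolve_char_is_signed(CharSignedness meta_page_value,
                       bool control_file_char_signed)
{
    if (meta_page_value == CHAR_SIGNEDNESS_SIGNED)
        return true;
    if (meta_page_value == CHAR_SIGNEDNESS_UNSIGNED)
        return false;
    return control_file_char_signed;
}

/*
 * Compare two 3-byte trigrams the way a platform with the given char
 * signedness would. Returns -1, 0, or 1.
 */
static int
compare_trgm(const unsigned char *a, const unsigned char *b,
             bool char_is_signed)
{
    for (int i = 0; i < 3; i++)
    {
        int ca = a[i];
        int cb = b[i];

        /* Emulate signed char: bytes above 0x7F become negative. */
        if (char_is_signed && ca > 127)
            ca -= 256;
        if (char_is_signed && cb > 127)
            cb -= 256;

        if (ca != cb)
            return (ca < cb) ? -1 : 1;
    }
    return 0;
}
```

With an ASCII trigram and one starting with byte 0xC3, compare_trgm()
gives opposite orderings for the signed and unsigned cases, which is the
kind of mismatch the stored signedness would be meant to avoid.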

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
