Re: Collation versioning

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Douglas Doole <dougdoole(at)gmail(dot)com>, Christoph Berg <myon(at)debian(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collation versioning
Date: 2019-11-28 13:08:44
Message-ID: CAOBaU_ZC0ynR2s8mt72wyfmDu78z__hYmcGm25e=z6Dgh4Fvag@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 28, 2019 at 5:50 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Wed, Nov 27, 2019 at 04:09:57PM -0500, Robert Haas wrote:
> > Yeah, I like #3 too. If we're going to the trouble to build all of
> > this mechanism, it seems worth it to build the additional machinery to
> > be precise about the difference between "looks like a problem" and "we
> > don't know".
>
> Indeed, #3 sounds like a sensible way of doing things. The two others
> may cause random problems which are harder to actually detect and act
> on as we should avoid as much as possible a forced system-wide REINDEX
> after an upgrade to a post-13 PG.

Thanks everyone for the feedback! Since there seems to be agreement
on #3, here's a proposal.

What we could do is store an empty string if the compatibility is
unknown, and detect it in index_check_collation_version() to report a
slightly different message. I'm assuming that not knowing the
compatibility would be system-wide rather than per collation, so we
could use an SQL command like this:

ALTER INDEX idx_name DEPENDS ON COLLATION UNKNOWN VERSION
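
A minimal sketch of the detection idea above, assuming the recorded
version is available as a plain C string. The enum, the function names
and the message texts are hypothetical and only illustrate the
three-way distinction; this is not the actual
index_check_collation_version() code.

```c
#include <string.h>

/* Hypothetical result of checking a recorded collation version. */
typedef enum
{
    COLLVERSION_MATCH,      /* recorded version equals the current one */
    COLLVERSION_MISMATCH,   /* versions differ: "looks like a problem" */
    COLLVERSION_UNKNOWN     /* empty string recorded: "we don't know" */
} CollVersionStatus;

/*
 * Compare the version recorded for an index dependency with the version
 * currently reported by the collation library.  An empty recorded string
 * is treated as "compatibility unknown", as proposed above.
 */
CollVersionStatus
classify_collation_version(const char *recorded, const char *current)
{
    if (recorded[0] == '\0')
        return COLLVERSION_UNKNOWN;
    if (strcmp(recorded, current) != 0)
        return COLLVERSION_MISMATCH;
    return COLLVERSION_MATCH;
}

/* Illustrative message texts only; the real wording is undecided. */
const char *
collation_version_message(CollVersionStatus status)
{
    switch (status)
    {
        case COLLVERSION_UNKNOWN:
            return "index depends on a collation of unknown version";
        case COLLVERSION_MISMATCH:
            return "index depends on a collation with a different version";
        case COLLVERSION_MATCH:
            break;
    }
    return "";              /* nothing to report */
}
```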

If adding (un)reserved keywords is not an issue, we could instead use
something along the lines of ALTER INDEX idx_name DEPENDS ON ALL
COLLATIONS and/or ALL VERSIONS UNKNOWN, or switch to:

ALTER INDEX idx_name ALTER [ COLLATION coll_name | ALL COLLATIONS ]
DEPENDS ON [ UNKNOWN VERSION | VERSION 'version_string' ]

Obviously, specific versions would require a specific collation, and
at least UNKNOWN VERSION would only be allowed in binary upgrade mode,
and not documented. I also have some other ideas for additional DDL
to let users deal with catalog updates after a compatible collation
library upgrade, but let's deal with that later.

Then for pg_upgrade, we can add a --collations-are-binary-compatible
switch or similar:

If upgrading from pre-v13:
- without the switch, we'd generate UNKNOWN VERSION for all
  indexes during pg_dump in binary upgrade mode
- with the switch, do nothing, as all indexes would already be
  pointing to the currently installed version

If upgrading from post-v13, the switch shouldn't be useful, as versions
will be restored, and if there was a collation library upgrade it
should be handled manually, the same as if such an upgrade is done
without pg_upgrade-ing the cluster. I'd personally disallow the switch
in that case to keep users from shooting themselves in the foot.
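
The pg_upgrade rules above can be summarized as a small decision
function. This is only a sketch: the enum and function names are
hypothetical, and it assumes that "post-v13" means an old cluster of
major version 13 or later, i.e. one that already records collation
versions.

```c
#include <stdbool.h>

/* Hypothetical actions for collation versions during pg_upgrade. */
typedef enum
{
    COLLACTION_MARK_UNKNOWN,    /* emit UNKNOWN VERSION for all indexes */
    COLLACTION_KEEP_CURRENT,    /* do nothing: indexes point to current version */
    COLLACTION_RESTORE,         /* restore the versions recorded in the old cluster */
    COLLACTION_REJECT_SWITCH    /* refuse the switch on a cluster that has versions */
} CollationUpgradeAction;

/*
 * Pick the action from the old cluster's major version and whether the
 * proposed --collations-are-binary-compatible switch was given.
 */
CollationUpgradeAction
choose_collation_action(int old_major_version, bool binary_compatible)
{
    if (old_major_version < 13)
        return binary_compatible ? COLLACTION_KEEP_CURRENT
                                 : COLLACTION_MARK_UNKNOWN;

    /* Assumption: v13 and later already carry recorded versions. */
    return binary_compatible ? COLLACTION_REJECT_SWITCH
                             : COLLACTION_RESTORE;
}
```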

Does this sound sensible?
