Re: Giving the shared catalogues a defined encoding

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Giving the shared catalogues a defined encoding
Date: 2024-12-06 18:51:48
Message-ID: 2841149.1733511108@sss.pgh.pa.us
Lists: pgsql-hackers

Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> Problem #1: You can have two databases with different encodings, and
> they both pretend that pg_database, pg_authid, pg_db_role_setting etc
> are in the local database encoding. That doesn't work too well:
> non-ASCII text can be reinterpreted in the wrong encoding.

> There's no problem if you only use one encoding everywhere (probably
> UTF8). There's also no problem if you use multiple database
> encodings, but put only ASCII in the shared catalogues (because ASCII
> is a subset of every supported server encoding). This patch is about
> formalising and enforcing those two working arrangements, hopefully
> invisibly to most users. There's still an escape hatch mode if you
> need it, e.g. for a non-conforming pg_upgrade'd system.

Over in the discussion of bug #18735, I've come to the realization
that these problems apply equally to the filesystem path names that
the server deals with: not only the data directory path, but the
path to the installation files [1]. Can we apply the same sort of
restrictions to those? I'm envisioning that initdb would check
either encoding-validity or all-ASCII-ness of those path names
depending on which mode it's setting the server up in.
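
To make the two checks concrete, here is a minimal sketch in C. It
assumes UTF-8 is the defined encoding in the non-ASCII mode, and every
function name below is hypothetical: none of this is existing initdb
or backend code, and a real implementation would presumably use the
server's own encoding verification routines rather than a hand-rolled
validator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if every byte of the path is 7-bit ASCII. */
static bool
path_is_all_ascii(const char *path)
{
    assert(path != NULL);
    for (const unsigned char *p = (const unsigned char *) path; *p; p++)
    {
        if (*p > 0x7F)
            return false;
    }
    return true;
}

/*
 * Hypothetical helper: true if the path is well-formed UTF-8.
 * Rejects overlong forms, surrogates, and code points above U+10FFFF.
 */
static bool
path_is_valid_utf8(const char *path)
{
    const unsigned char *p = (const unsigned char *) path;

    assert(path != NULL);
    while (*p)
    {
        unsigned char c = *p++;
        int           ncont;                 /* continuation bytes expected */
        unsigned char lo = 0x80, hi = 0xBF;  /* range for first continuation */

        if (c <= 0x7F)
            ncont = 0;
        else if (c >= 0xC2 && c <= 0xDF)
            ncont = 1;
        else if (c >= 0xE0 && c <= 0xEF)
        {
            ncont = 2;
            if (c == 0xE0) lo = 0xA0;        /* no overlong 3-byte forms */
            if (c == 0xED) hi = 0x9F;        /* no UTF-16 surrogates */
        }
        else if (c >= 0xF0 && c <= 0xF4)
        {
            ncont = 3;
            if (c == 0xF0) lo = 0x90;        /* no overlong 4-byte forms */
            if (c == 0xF4) hi = 0x8F;        /* nothing above U+10FFFF */
        }
        else
            return false;                    /* stray continuation or bad lead */

        for (int i = 0; i < ncont; i++, p++)
        {
            /* A NUL terminator fails this test too, since lo >= 0x80. */
            if (*p < lo || *p > hi)
                return false;
            lo = 0x80;
            hi = 0xBF;
        }
    }
    return true;
}

/* Hypothetical entry point: pick the check according to the cluster mode. */
static bool
path_ok_for_mode(const char *path, bool ascii_only_mode)
{
    return ascii_only_mode ? path_is_all_ascii(path)
                           : path_is_valid_utf8(path);
}
```

In this sketch the same test would be applied to both the data
directory path and the installation path, and a path containing a
Latin-1 byte such as 0xE9 would be rejected in either mode.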

> The patch invents a new setting CLUSTER CATALOG ENCODING, which can be
> inspected with SHOW and changed with ALTER SYSTEM.

Changing the catalog encoding would also have to re-verify the
suitability of the paths. Of course this isn't 100% bulletproof
since someone could rename those directories later. But I think
that's in "if you break it you get to keep both pieces" territory.

regards, tom lane

[1] https://www.postgresql.org/message-id/2840430.1733510664%40sss.pgh.pa.us
