Re: BUG #18735: Specific multibyte character in psql file path command parameter for Windows

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: koichi(dot)dbms(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18735: Specific multibyte character in psql file path command parameter for Windows
Date: 2024-12-06 20:14:37
Message-ID: 2850994.1733516077@sss.pgh.pa.us
Lists: pgsql-bugs

I wrote:
> Anyway, what I'm now thinking is that we can have two variants
> of canonicalize_path:
> extern void canonicalize_path(char *path);
> extern void canonicalize_path_enc(char *path, int encoding);
> The first one assumes a server-safe encoding, the second doesn't,
> and at least to start with only psql would bother with the second.

I thought that part would be trivial, but there's a small annoying
problem. The obvious way to write the encoding-aware version of
the de-backslashing loop is to use pg_encoding_mblen_bounded().
However, that function is in src/common/wchar.c while path.c
is in src/port/ --- and I believe we have a rule that libpgport
can't depend on libpgcommon. (The dependency runs the other way:
libpgcommon is allowed to use libpgport.)

Now it's pretty dubious that path.c is in src/port/ at all, because
it does not meet the expectation that that directory is for
functions that replace missing system-library functionality.
(We've trodden pretty hard on that expectation over the years,
but whatever.) So one reasonable fix could be to move path.c
to src/common, but I'm concerned that that would be unsafe to
back-patch. Also, we'd really want to move the externs for
path.c out of port.h, which would cause additional code churn
for callers.

Another way, given that we only really need this to work for
SJIS, is to hard-wire the logic into path.c --- it's not like
pg_sjis_mblen is either long or likely to change. That's
ugly but would be a lot less invasive and safer to back-patch.
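
To make the second option concrete, here is a minimal sketch of what a hard-wired SJIS-aware de-backslashing loop might look like. It is not the committed patch. The names sjis_mblen_sketch and debackslash_sjis_sketch are hypothetical; the length rule mirrors what pg_sjis_mblen does (0xA1..0xDF is a one-byte kana, any other high-bit byte starts a two-byte character). The point is that a trail byte equal to 0x5C must not be mistaken for a directory separator.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: byte length of the SJIS character starting at s,
 * using the same rule as pg_sjis_mblen.
 */
static int
sjis_mblen_sketch(const unsigned char *s)
{
	if (*s >= 0xa1 && *s <= 0xdf)
		return 1;				/* single-byte half-width kana */
	if (*s & 0x80)
		return 2;				/* lead byte of a two-byte character */
	return 1;					/* ASCII */
}

/*
 * Hypothetical sketch: turn '\' separators into '/' in place, stepping
 * over two-byte characters so that a 0x5C trail byte is left alone.
 */
static void
debackslash_sjis_sketch(char *path)
{
	char	   *p;
	int			len;

	for (p = path; *p; p += len)
	{
		len = sjis_mblen_sketch((const unsigned char *) p);
		if (len == 2 && p[1] == '\0')
			break;				/* truncated character: don't run past NUL */
		if (len == 1 && *p == '\\')
			*p = '/';
	}
}

/* Small self-check: 0x95 0x5C is one two-byte character, not a separator. */
static void
sketch_self_check(void)
{
	char		path[] = "C:\\tmp\\\x95\\.sql";

	debackslash_sjis_sketch(path);
	assert(strcmp(path, "C:/tmp/\x95\\.sql") == 0);
}
```

In this sketch only the two real separators are rewritten; the 0x5C that follows 0x95 survives because the loop advances two bytes at a time over that character.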

I'm leaning a bit toward the second way, mainly because of the
extern-relocation annoyance.

regards, tom lane
