From: | Koichi Suzuki <koichi(dot)dbms(at)gmail(dot)com> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #18735: Specific multibyte character in psql file path command parameter for Windows |
Date: | 2024-12-06 05:26:34 |
Message-ID: | CABEZHFurokPoY+Vs0zhSf7J5ahZV1p7naOPiH5_3C-ubPdDWgA@mail.gmail.com |
Lists: | pgsql-bugs |
On Fri, Dec 6, 2024 at 14:21, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
> > I don't believe Shift-JIS uses '/' as part of multibyte characters,
>
> Correct.
>
> > so it should be sufficient to consider '\'.
>
> Agreed.
>
> > BTW, according to wikipedia[1], backslash is not even part of the
> > Shift-JIS character set:
> >
> > The single-byte characters 0x00 to 0x7F match the ASCII encoding,
> > except for a yen sign (U+00A5) at 0x5C and an overline (U+203E) at
> > 0x7E in place of the ASCII character set's backslash and tilde
> > respectively (these deviations from ASCII align with JIS X
> > 0201). The single-byte characters from 0xA1 to 0xDF map to the
> > half-width katakana characters found in JIS X 0201.
> >
> > For double-byte characters, the first byte is always in the range
> > 0x81 to 0x9F or the range 0xE0 to 0xEF (these ranges are
> > unassigned in JIS X 0201). If the first byte is odd, the second
> > byte must be in the range 0x40 to 0x9E (but cannot be 0x7F); if
> > the first byte is even, the second byte must be in the range 0x9F to
> > 0xFC.
> >
> > This might mean that it'd be okay to just skip the backslash-to-slash
> > conversion loops altogether if we think the encoding is Shift-JIS.
>
> I suggest not doing so, because the majority of Shift-JIS users treat 0x5C
> as a backslash. They understand that 0x5C means a backslash in
> Shift-JIS files when the files are for programming (source code),
> technical documentation, and so on.
>
A better way would be to treat a 'backslash' byte value in the second
(trailing) byte of an SJIS-encoded character as is, rather than as an
escape character. I'm not sure whether we can fix src/fe_utils/psqlscan.l
and/or src/bin/psql/psqlscanslash.l for this. It needs some more investigation.
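
To make the idea concrete, here is a rough Python sketch of a
backslash-to-slash conversion that leaves the second byte of a double-byte
SJIS character alone. This is not PostgreSQL code: the function names are
made up for illustration, and the lead-byte ranges are simply the ones from
the Wikipedia text quoted above (Windows code page 932 may use additional
lead bytes, which this sketch ignores). A lexer would presumably need the
same kind of lead-byte check before treating 0x5C as an escape character.

```python
def is_sjis_lead_byte(b: int) -> bool:
    # Hypothetical helper: lead-byte ranges as quoted above
    # (0x81-0x9F or 0xE0-0xEF). Assumed sufficient for this sketch.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF


def backslashes_to_slashes_sjis(path: bytes) -> bytes:
    # Hypothetical helper: convert 0x5C to '/' only when 0x5C is a
    # single-byte character, never when it is a trailing byte.
    out = bytearray()
    i = 0
    while i < len(path):
        b = path[i]
        if is_sjis_lead_byte(b) and i + 1 < len(path):
            # Double-byte character: copy both bytes untouched.
            out += path[i:i + 2]
            i += 2
        else:
            # Single-byte character (or a truncated lead byte at the end).
            out.append(0x2F if b == 0x5C else b)
            i += 1
    return bytes(out)


# 0x95 0x5C is the character "表" in Shift-JIS; its second byte is 0x5C.
path = b"C:\\tmp\\\x95\x5c.sql"

# A byte-wise replacement also rewrites the trailing byte and corrupts "表".
naive = path.replace(b"\\", b"/")

# The lead-byte-aware version converts only the real path separators.
aware = backslashes_to_slashes_sjis(path)
```

With the input above, `naive` ends in the bytes 0x95 0x2F, which no longer
encode "表", while `aware` keeps 0x95 0x5C intact and converts only the two
separators.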
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS K.K.
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
All the best, and thanks to everyone for the kind input.
---
Koichi Suzuki
https://www.linkedin.com/in/koichidbms