| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
|---|---|
| To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
| Cc: | Ralf Schuchardt <rasc(at)gmx(dot)de>, Marco Lechner <mlechner(at)bfs(dot)de>, pgsql-general(at)lists(dot)postgresql(dot)org |
| Subject: | Re: support for DIN SPEC 91379 encoding |
| Date: | 2022-03-27 18:06:25 |
| Message-ID: | 386726.1648404385@sss.pgh.pa.us |
| Lists: | pgsql-general |
Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> On 2022-Mar-27, Ralf Schuchardt wrote:
>> linked here https://www.xoev.de/downloads-2316#StringLatin it is said,
>> that the spec is a strict subset of unicode (E.1.6), and it is also
>> mentioned in E.1.4, that in UTF-8 all unicode characters can be
>> encoded. Therefore UTF-8 can be used to encode all DIN SPEC 91379
>> characters.
> So the remaining question is whether DIN SPEC 91379 requires an
> implementation to support character U+0000. If it does, then PostgreSQL
> is not conformant, because that character is the only one in Unicode
> that we don't support. If U+0000 is not required, then PostgreSQL is
> okay.
Hmm ... UTF-8 as defined in RFC 3629/STD 63 [1] does not allow "all unicode
characters to be encoded". It disallows the surrogate code points
(U+D800--U+DFFF) and anything above U+10FFFF. We follow that spec, so
depending on what DIN 91379 *actually* says, we might have additional
reasons not to be in compliance. Unfortunately, I don't read German.
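
To make the RFC 3629 restriction concrete, here is a minimal Python sketch. It is not PostgreSQL code: the helper names `is_encodable_per_rfc3629` and `strict_utf8_accepts` are made up for illustration, and it assumes Python's built-in strict `utf-8` codec as a stand-in for an RFC 3629-conformant decoder.

```python
def is_encodable_per_rfc3629(code_point: int) -> bool:
    """Return True if code_point may be encoded in UTF-8 under RFC 3629.

    Hypothetical helper: it only checks the two exclusions discussed
    above (surrogates and values beyond U+10FFFF).
    """
    if code_point < 0 or code_point > 0x10FFFF:
        return False  # outside the range RFC 3629 permits
    if 0xD800 <= code_point <= 0xDFFF:
        return False  # surrogate code points are reserved for UTF-16
    return True


def strict_utf8_accepts(raw: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # ED A0 80 is what a naive encoder would emit for U+D800.
    print(strict_utf8_accepts(b"\xed\xa0\x80"))
    # F4 90 80 80 is what a naive encoder would emit for U+110000.
    print(strict_utf8_accepts(b"\xf4\x90\x80\x80"))
    # A single zero byte is U+0000.
    print(strict_utf8_accepts(b"\x00"))
```

Note that U+0000 passes both checks: it is valid UTF-8 under RFC 3629, so its rejection is a separate, PostgreSQL-specific limitation rather than something the RFC imposes.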
regards, tom lane