Re: Unicode is not UTF-8. was :psqlODBC-Driver Test / text

From:	Bart Samwel <bart(at)samwel(dot)tk>
To:	Marc Herbert <Marc(dot)Herbert(at)continuent(dot)com>
Cc:	pgsql-odbc(at)postgresql(dot)org
Subject:	Re: Unicode is not UTF-8. was :psqlODBC-Driver Test / text
Date:	2006-04-03 09:03:40
Message-ID:	4430E4EC.2010208@samwel.tk
Lists:	pgsql-odbc

Marc Herbert wrote:
> Bart Samwel <bart(at)samwel(dot)tk> writes:
> wchar_t is not defined as 16-bits, but as "wide enough to hold any
> character of the platform". For instance if the platform uses UCS-4,
> then wchar_t is 32 bits wide.
>
> (UTF-16 wchar_t violates this)

Ahhh, this explains a lot. The same assumption used to hold for char,
until UTF-8 came along. char could not simply be widened, because too
much code assumed that a char was one byte. Platforms then adopted a
UCS-2 wchar_t and later moved to UTF-16 for the same reason: wchar_t
could not simply be widened, because too much code assumed that wchar_t
was two bytes. Same pattern. Time to introduce wwchar_t_t. :-)
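
To make the UTF-16 point concrete, here is a minimal sketch in C of how a
single code point is encoded as UTF-16. The helper name utf16_encode is
hypothetical, made up for this illustration. It shows that any code point
above U+FFFF needs two 16-bit units (a surrogate pair), so a 16-bit
wchar_t is no longer "wide enough to hold any character".

```c
#include <stdint.h>

/* Hypothetical helper (not from any library): encode one Unicode code
 * point as UTF-16. Returns the number of 16-bit units written to out
 * (1 or 2), or 0 if cp is not a valid Unicode scalar value. */
static int utf16_encode(uint32_t cp, uint16_t out[2])
{
    /* Reject values beyond U+10FFFF and the surrogate range itself. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    /* Code points in the Basic Multilingual Plane fit in one unit. */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }

    /* Everything else is split into a high and a low surrogate,
     * each carrying 10 bits of the value (cp - 0x10000). */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (cp >> 10));
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
    return 2;
}
```

With this sketch, utf16_encode(0x41, units) writes one unit (0x0041),
while utf16_encode(0x1F600, units) writes two (0xD83D, 0xDE00).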

> I don't clearly see how you want to use a 8-bit NULL to terminate a
> (wider) wchar_t array... ?

This was a backreference to a situation mentioned earlier in the
discussion, where wchar_t buffers couldn't be "tunneled through" a layer
that used char*, as the wider wchar_t characters may contain NUL bytes.
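
As a rough illustration of that tunneling problem, the C sketch below
compares what a char*-based layer would see with what the wide buffer
actually holds. The two helper names are hypothetical. It assumes
sizeof(wchar_t) > 1 and ASCII-range characters, so that every wide
character contains at least one zero byte; the exact count that strlen
reports depends on the platform's byte order.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: number of bytes a char*-based layer would see
 * if the wide buffer were handed to it as a NUL-terminated C string.
 * strlen stops at the first zero byte, which for ASCII-range text
 * lies inside the very first wchar_t. */
static size_t bytes_seen_as_cstring(const wchar_t *ws)
{
    return strlen((const char *)ws);
}

/* Hypothetical helper: real payload size in bytes, excluding the
 * wide terminator. */
static size_t bytes_in_wide_string(const wchar_t *ws)
{
    return wcslen(ws) * sizeof(wchar_t);
}
```

For L"AB", bytes_seen_as_cstring reports at most 1 byte (1 on a
little-endian platform, 0 on a big-endian one), while
bytes_in_wide_string reports 2 * sizeof(wchar_t). Passing such a buffer
through a char* layer therefore needs an explicit length rather than
NUL termination.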

Cheers,
Bart

In response to

Re: Unicode is not UTF-8. was :psqlODBC-Driver Test / text at 2006-04-03 08:55:30 from Marc Herbert