Re: Bug in UTF8-Validation Code?

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	pgsql-hackers(at)postgresql(dot)org
Cc:	Michael Paesold <mpaesold(at)gmx(dot)at>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at>, Mario Weilguni EXTERN <mweilguni(at)sime(dot)com>
Subject:	Re: Bug in UTF8-Validation Code?
Date:	2007-03-14 09:05:31
Message-ID:	200703141005.33119.peter_e@gmx.net
Lists:	pgsql-hackers

On Wednesday, 14 March 2007 08:01, Michael Paesold wrote:
> Is there anything in the SQL spec that asks for such a behaviour? I guess
> not.

I think that the octal escapes are a holdover from the single-byte days, when
they were simply a way to enter characters that are difficult to find on a
keyboard. In today's multi-encoding world, it would make more sense if there
were an escape sequence for a *codepoint* which is then converted to the
actual encoding (if possible and valid) in the server. The meaning of
codepoint is, however, character set dependent as well.
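
To make the idea concrete, here is a minimal Python sketch of "codepoint in, encoded bytes out". It is not PostgreSQL code: the function name encode_codepoint is hypothetical, and it assumes the codepoint is a Unicode codepoint, which sidesteps the character-set-dependence caveat above. Python's own codecs stand in for the server-side conversion.

```python
def encode_codepoint(codepoint: int, encoding: str) -> bytes:
    """Convert one Unicode codepoint to bytes in the target encoding.

    Hypothetical helper for illustration only. Raises ValueError when the
    codepoint is out of range or has no representation in the encoding.
    """
    try:
        # chr() rejects values outside 0..0x10FFFF; encode() rejects
        # characters the target encoding cannot represent.
        return chr(codepoint).encode(encoding)
    except (ValueError, OverflowError) as exc:
        raise ValueError(
            f"codepoint U+{codepoint:04X} is not valid in {encoding}"
        ) from exc


# The same codepoint yields different bytes depending on the encoding.
print(encode_codepoint(0xE4, "utf-8"))        # a-umlaut as two bytes
print(encode_codepoint(0xE4, "latin-1"))      # a-umlaut as one byte
print(encode_codepoint(0x20AC, "iso8859-15")) # euro sign exists here

try:
    encode_codepoint(0x20AC, "latin-1")       # no euro sign in Latin-1
except ValueError as err:
    print("rejected:", err)
```

The point of the sketch is that the user names the character once, and the "if possible and valid" check happens at conversion time instead of being left to whoever typed the raw bytes.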

The SQL standard supports escape sequences for Unicode codepoints, which I
think would be a very useful feature (try entering a UTF-8 character
bytewise ...), but it's a bit weird to implement and it's not clear how to
handle character sets other than Unicode.
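
As a rough illustration of what "entering a UTF-8 character bytewise" involves, the following Python sketch prints the octal escapes needed for a character and shows that an incomplete escape sequence yields bytes that are not valid UTF-8. The helper names (octal_escapes, bytes_from_octal_escapes, is_valid_utf8) are made up for this example, and the parser only handles strings consisting entirely of three-digit octal escapes; it is not how the server parses literals.

```python
import re


def octal_escapes(char: str) -> str:
    """Return the backslash-octal escapes for a character's UTF-8 bytes."""
    return "".join(f"\\{byte:03o}" for byte in char.encode("utf-8"))


def bytes_from_octal_escapes(text: str) -> bytes:
    """Turn a string made only of \\ooo escapes back into raw bytes."""
    if not re.fullmatch(r"(\\[0-3][0-7]{2})*", text):
        raise ValueError("expected only three-digit octal escapes")
    return bytes(int(octal, 8) for octal in re.findall(r"\\([0-7]{3})", text))


def is_valid_utf8(data: bytes) -> bool:
    """Check whether the byte string decodes as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# One character, but several escapes to type by hand.
print(octal_escapes("\u00e4"))   # a-umlaut needs two escapes
print(octal_escapes("\u20ac"))   # euro sign needs three escapes

# Dropping one escape leaves a truncated, invalid byte sequence.
print(is_valid_utf8(bytes_from_octal_escapes("\\303\\244")))  # complete
print(is_valid_utf8(bytes_from_octal_escapes("\\303")))       # truncated
```

With a codepoint escape the user would write a single number per character, and there would be no way to produce a half-finished multibyte sequence.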

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

In response to

Re: Bug in UTF8-Validation Code? at 2007-03-14 07:01:53 from Michael Paesold
