Re: [rfc] unicode escapes for extended strings

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Marko Kreen <markokr(at)gmail(dot)com>
Cc:	Postgres Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [rfc] unicode escapes for extended strings
Date:	2009-09-21 20:36:52
Message-ID:	1253565412.20098.5.camel@vanquo.pezone.net
Lists:	pgsql-hackers

On Wed, 2009-09-09 at 18:26 +0300, Marko Kreen wrote:
> Unicode escapes for extended strings.
>
> On 4/16/09, Marko Kreen <markokr(at)gmail(dot)com> wrote:
> > Reasons:
> >
> > - More people are familiar with \u escaping, as it's standard
> > in Java/C#/Python and probably other languages.
> > - U& strings will not work when stdstr=off.
> >
> > Syntax:
> >
> > \uXXXX - 16-bit value
> > \UXXXXXXXX - 32-bit value
> >
> > Additionally, both \u and \U can be used to specify UTF-16 surrogate
> > pairs to encode characters with value > 0xFFFF. This is the exact
> > behaviour used by Java/C#/Python (except that Java does not have \U).
>
> v3 of the patch:
>
> - convert to new reentrant lexer API
> - add lexer targets to avoid fallback to default
> - completely disallow \U\u without proper number of hex values
> - fix logic bug in surrogate pair handling

This looks good to me. I'm implementing the surrogate pair handling for
the U& syntax for consistency. Then I'll apply this.
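
For readers following along, the escape rules quoted above can be sketched roughly as follows. This is only an illustration in Python, not the patch's C lexer code: the function name decode_unicode_escapes and the error messages are made up for the example, and it assumes that any lone or misordered surrogate should be rejected. It also ignores every other backslash escape that extended strings support.

```python
import re

# \u takes exactly 4 hex digits, \U exactly 8 (per the quoted syntax).
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_unicode_escapes(text: str) -> str:
    """Decode \\uXXXX and \\UXXXXXXXX escapes, combining surrogate pairs.

    Hypothetical helper for illustration; other backslash escapes are
    passed through untouched.
    """
    out = []
    pending_high = None  # high surrogate waiting for its low half
    pos = 0
    while pos < len(text):
        is_escape = (
            text[pos] == "\\" and pos + 1 < len(text) and text[pos + 1] in "uU"
        )
        if not is_escape:
            if pending_high is not None:
                raise ValueError("high surrogate not followed by low surrogate")
            out.append(text[pos])
            pos += 1
            continue

        match = _ESCAPE.match(text, pos)
        if match is None:
            # Disallow \u or \U without the proper number of hex digits.
            raise ValueError(f"invalid Unicode escape at offset {pos}")
        code = int(match.group(1) or match.group(2), 16)
        pos = match.end()

        if 0xD800 <= code <= 0xDBFF:
            # First half of a surrogate pair: remember it and keep reading.
            if pending_high is not None:
                raise ValueError("two high surrogates in a row")
            pending_high = code
            continue
        if 0xDC00 <= code <= 0xDFFF:
            # Second half: only valid directly after a high surrogate.
            if pending_high is None:
                raise ValueError("low surrogate without high surrogate")
            code = 0x10000 + ((pending_high - 0xD800) << 10) + (code - 0xDC00)
            pending_high = None
        elif pending_high is not None:
            raise ValueError("high surrogate not followed by low surrogate")

        if code > 0x10FFFF:
            raise ValueError("code point out of Unicode range")
        out.append(chr(code))

    if pending_high is not None:
        raise ValueError("string ends after a high surrogate")
    return "".join(out)
```

With this sketch, decode_unicode_escapes(r"\uD83D\uDE00") and decode_unicode_escapes(r"\U0001F600") yield the same single character, while an escape such as r"\u41" is rejected for having too few hex digits.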

In response to

Re: [rfc] unicode escapes for extended strings at 2009-09-09 15:26:59 from Marko Kreen
