Re: [GENERAL] Re: [GENERAL] Different encoding for string values and identifier strings? Or (select 'tést' as tést) returns different values for string and identifier...

From:	"Francisco Figueiredo Jr(dot)" <francisco(at)npgsql(dot)org>
To:	Andreas Kretschmer <akretschmer(at)spamfence(dot)net>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: [GENERAL] Re: [GENERAL] Different encoding for string values and identifier strings? Or (select 'tést' as tést) returns different values for string and identifier...
Date:	2011-03-16 02:37:06
Message-ID:	AANLkTimCMgg=2oTjYw37Rc=WPHZv7MLYsCGg3Zhobo2D@mail.gmail.com
Lists:	pgsql-general

Now, I'm using my dev machine.

With the tests I'm doing, I can see the following:

If I use:

select 'seléct' as "seléct";

the column name is returned correctly, as expected.

If I do:

select 'seléct' as seléct;

this is the sequence of bytes I receive from PostgreSQL for the column name:

byte1 - 115 UTF-8 for s
byte2 - 101 UTF-8 for e
byte3 - 108 UTF-8 for l
byte4 - 227
byte5 - 169
byte6 - 99 UTF-8 for c
byte7 - 116 UTF-8 for t

The problem lies in byte4.
According to [1], the first byte of a UTF-8 sequence defines how many
bytes make up the character. 227 is 1110 0011 in binary, so the UTF-8
decoder expects a 3-byte sequence when there are actually only 2
bytes! :( This seems to be the root of the problem for me.
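
To make the lead-byte rule from [1] concrete, here is a minimal Python sketch. The function name utf8_sequence_length is hypothetical and only for illustration; it is not part of PostgreSQL or Npgsql. It just reads the high bits of the first byte to see how long a decoder will expect the sequence to be.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return how many bytes a UTF-8 decoder expects for this lead byte."""
    if (lead_byte & 0b1000_0000) == 0b0000_0000:
        return 1  # 0xxxxxxx: plain ASCII
    if (lead_byte & 0b1110_0000) == 0b1100_0000:
        return 2  # 110xxxxx: one continuation byte follows
    if (lead_byte & 0b1111_0000) == 0b1110_0000:
        return 3  # 1110xxxx: two continuation bytes follow
    if (lead_byte & 0b1111_1000) == 0b1111_0000:
        return 4  # 11110xxx: three continuation bytes follow
    raise ValueError(f"{lead_byte} is not a valid UTF-8 lead byte")


# The two candidates for byte4 discussed in this message.
for b in (195, 227):
    print(b, format(b, "08b"), utf8_sequence_length(b))
```

With 227 the decoder waits for two continuation bytes, but only one (169) arrives before the ASCII "c".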

For the selected value, the correct bytes are returned:

byte1 - 115 UTF-8 for s
byte2 - 101 UTF-8 for e
byte3 - 108 UTF-8 for l
byte4 - 195
byte5 - 169
byte6 - 99 UTF-8 for c
byte7 - 116 UTF-8 for t

Here 195 is 1100 0011, which announces a two-byte sequence, so the
decoder can decode it to U+00E9, which is the char "é".
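
The two byte sequences can be compared directly. This is only a sketch using Python's built-in UTF-8 codec, with the byte values copied from the lists in this message; the variable names are made up for illustration.

```python
# Bytes received for the column name (unquoted identifier) and for the value.
identifier_bytes = bytes([115, 101, 108, 227, 169, 99, 116])
value_bytes = bytes([115, 101, 108, 195, 169, 99, 116])

# The value decodes cleanly as UTF-8.
decoded_value = value_bytes.decode("utf-8")

# The identifier does not: 227 announces a 3-byte sequence, but the byte
# after 169 is an ASCII "c", not a continuation byte.
try:
    identifier_bytes.decode("utf-8")
    identifier_is_valid = True
except UnicodeDecodeError:
    identifier_is_valid = False

# Positions where the two sequences differ, and the differing bits there.
diff_positions = [
    i for i, (a, b) in enumerate(zip(identifier_bytes, value_bytes)) if a != b
]
diff_bits = identifier_bytes[3] ^ value_bytes[3]

print(decoded_value, identifier_is_valid, diff_positions, hex(diff_bits))
```

Only byte4 differs between the two sequences, and the two values differ in a single bit (0x20).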

Do you think this could be related to my machine? I'm using OS X 10.6.6
and I compiled PostgreSQL 9.0.1 from source.

Thanks in advance.

[1] - http://en.wikipedia.org/wiki/UTF-8

On Tue, Mar 15, 2011 at 15:52, Francisco Figueiredo Jr.
<francisco(at)npgsql(dot)org> wrote:
> Hmmmmmmmm,
>
> What would change the encoding of the identifiers?
>
> Because on my dev machine which unfortunately isn't with me right now
> I can't get the identifier returned correctly :(
>
> I remember that it returns:
>
> test=*# select 'tést' as tést;
> tst
> ------
> tést
>
> Is there any config I can change at runtime in order to have it
> returned correctly?
>
> Thanks in advance.
>
>
> On Tue, Mar 15, 2011 at 15:45, Andreas Kretschmer
> <akretschmer(at)spamfence(dot)net> wrote:
>> Francisco Figueiredo Jr. <francisco(at)npgsql(dot)org> wrote:
>>
>>>
>>> What happens if you remove the double quotes in the column name identifier?
>>
>> the same:
>>
>> test=*# select 'tést' as tést;
>> tést
>> ------
>> tést
>> (1 Zeile)
>>
>>
>>
>> Andreas
>> --
>> Really, I'm not out to destroy Microsoft. That will just be a completely
>> unintentional side effect. (Linus Torvalds)
>> "If I was god, I would recompile penguin with --enable-fly." (unknown)
>> Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>
>
> --
> Regards,
>
> Francisco Figueiredo Jr.
> Npgsql Lead Developer
> http://www.npgsql.org
> http://fxjr.blogspot.com
> http://twitter.com/franciscojunior
>

--
Regards,

Francisco Figueiredo Jr.
Npgsql Lead Developer
http://www.npgsql.org
http://fxjr.blogspot.com
http://twitter.com/franciscojunior

In response to

Re: [GENERAL] Re: [GENERAL] Different encoding for string values and identifier strings? Or (select 'tést' as tést) returns different values for string and identifier... at 2011-03-15 18:52:56 from Francisco Figueiredo Jr.

Responses

Re: [GENERAL] Re: [GENERAL] Different encoding for string values and identifier strings? Or (select 'tést' as tést) returns different values for string and identifier... at 2011-03-17 22:26:38 from Francisco Figueiredo Jr.
