From: | hernan gonzalez <hgonzalez(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, hernan gonzalez <hgonzalez(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: psql weird behaviour with charset encodings |
Date: | 2010-05-08 02:30:55 |
Message-ID: | v2s48692c2d1005071930l161b60f0vaf9e8ca1023a9e2@mail.gmail.com |
Lists: | pgsql-general pgsql-hackers |
Sorry about an error in my previous example (I mixed up width and precision).
But the conclusion is the same - printf works on bytes:
#include <stdio.h>

int main(void) {
    char s[] = "ni\xc3\xb1o";   /* 5 bytes, 4 UTF-8 chars */
    printf("|%*s|\n", 6, s);    /* width 6: this should pad a blank */
    printf("|%.*s|\n", 4, s);   /* precision 4: this should eat a char */
    return 0;
}
[root@myserv tmp]# ./a.out | od -t cx1
0000000   |       n   i 303 261   o   |  \n   |   n   i 303 261   |  \n
         7c  20  6e  69  c3  b1  6f  7c  0a  7c  6e  69  c3  b1  7c  0a
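
Since the width is counted in bytes, a caller that wants padding by characters has to compensate. A minimal sketch of one way to do that, assuming the input is valid UTF-8 and that one code point equals one column (not true for wide or combining characters); utf8_chars and format_padded are hypothetical helper names, not anything from glibc or psql:

```c
#include <stdio.h>
#include <string.h>

/* Count UTF-8 code points by skipping continuation bytes (10xxxxxx).
   Assumes s is valid UTF-8; this is an illustration, not a validator. */
static size_t utf8_chars(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Hypothetical helper: right-justify s in a field of `cols` characters.
   printf's width counts bytes, so widen it by the number of
   continuation bytes in s. */
static int format_padded(char *buf, size_t buflen, int cols, const char *s)
{
    int extra = (int) (strlen(s) - utf8_chars(s));
    return snprintf(buf, buflen, "|%*s|", cols + extra, s);
}
```

With s = "ni\xc3\xb1o" and cols = 6, the width passed to printf becomes 7, so two blanks are padded instead of one.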
Hernán
On Fri, May 7, 2010 at 10:48 PM, <hgonzalez(at)gmail(dot)com> wrote:
>> However, it appears that glibc's printf
>> code interprets the parameter as the number of *characters* to print,
>> and to determine what's a character it assumes the string is in the
>> environment LC_CTYPE's encoding.
>
> Well, I find that hard to believe myself :-)
> This would be nasty... Are you sure?
>
> I couldn't reproduce that.
> I made a quick test, passing a UTF-8 encoded string
> (5 bytes corresponding to 4 Unicode chars: "niño"),
> and my glibc (same Fedora 12) seems to count bytes,
> as it should.
>
> #include<stdio.h>
> main () {
> char s[] = "ni\xc3\xb1o";
> printf("|%.*s|\n",5,s);
> }
>
> This, compiled with gcc 4.4.3 and run with my root locale (utf8),
> did not pad a blank, i.e. it worked as expected.
>
> Hernán