From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | alvherre(at)commandprompt(dot)com |
Cc: | tgl(at)sss(dot)pgh(dot)pa(dot)us, badalex(at)gmail(dot)com, cb(at)df7cb(dot)de, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: pl/perl and utf-8 in sql_ascii databases |
Date: | 2012-07-12 04:12:24 |
Message-ID: | 20120712.131224.120940995.horiguchi.kyotaro@lab.ntt.co.jp |
Lists: | pgsql-hackers |
Very sorry for the garbled subject. I have re-sent the message with the correct subject.
# Our mail server insisted that the message was spam. Sigh..
====
Hmm... Sorry for the immature patch.
> ... and this story hasn't ended yet, because one of the new tests is
> failing. See here:
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=magpie&dt=2012-07-11%2010%3A00%3A04
>
> The interesting part of the diff is:
...
> SELECT encode(perl_utf_inout(E'ab\xe5\xb1\xb1cd')::bytea, 'escape')
> ! ERROR: character with byte sequence 0xe5 0xb7 0x9d in encoding "UTF8" has no equivalent in encoding "LATIN1"
> ! CONTEXT: PL/Perl function "perl_utf_inout"
>
>
> I am not sure what can we do here other than remove this function and
> query from the test.
I ran the regression tests only in environments capable of
handling the character U+5ddd (a Japanese character meaning
"river")...
Both the byte sequences that can be decoded and the byte
sequence produced by encoding a given Unicode character vary
among encodings.
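
To illustrate the point, here is a minimal sketch in Python (not PL/Perl, and not part of the regression suite) that encodes U+5ddd into a few encodings. The helper name try_encode is hypothetical, and Python's codec names (utf-8, euc_jp, latin-1) are used as stand-ins for the corresponding server encodings.

```python
# Hypothetical helper for illustration only; not part of PostgreSQL.
def try_encode(ch: str, encoding: str):
    """Return the bytes of ch in the given encoding, or None if it has no equivalent."""
    try:
        return ch.encode(encoding)
    except UnicodeEncodeError:
        return None


RIVER = "\u5ddd"  # the character reported in the failing test's error message

# The same character yields different bytes, or none at all, per encoding.
for enc in ("utf-8", "euc_jp", "latin-1"):
    encoded = try_encode(RIVER, enc)
    print(enc, encoded.hex() if encoded is not None else "no equivalent")
```

In UTF-8 the character becomes the bytes e5 b7 9d, matching the sequence in the error message above, while latin-1 has no equivalent at all, which is the situation the buildfarm member ran into.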
The problem this thread is actually about could be covered
without the additional test. What needs to be confirmed is
whether encoding/decoding is done as expected when the language
handler is called. I suppose that testing the two cases
(sql_ascii and utf8) plus one additional case that goes through
pg_do_encoding_conversion(), say latin1, would be enough to
confirm that encoding/decoding is done properly, since the
concrete conversion scheme is not significant in this case.
So I recommend that we add a test for latin1 and omit the test
for encodings other than sql_ascii, utf8 and latin1. This might
be achieved by creating empty plperl_lc.sql and plperl_lc.out
files for those encodings.
What do you think about that?
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
== My e-mail address has been changed since Apr. 1, 2012.