From: | Alex Hunsaker <badalex(at)gmail(dot)com> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [COMMITTERS] pgsql: Force strings passed to and from plperl to be in UTF8 encoding. |
Date: | 2011-02-12 09:18:36 |
Message-ID: | AANLkTinzzzJCJE=Ac_kZOOv4Hirogd9M2Wjjcecx_Si1@mail.gmail.com |
Lists: | pgsql-committers pgsql-hackers |
On Sun, Feb 6, 2011 at 15:31, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> Force strings passed to and from plperl to be in UTF8 encoding.
>
> String are converted to UTF8 on the way into perl and to the
> database encoding on the way back. This avoids a number of
> observed anomalies, and ensures Perl a consistent view of the
> world.
So I noticed a problem while playing with this in my discussion with
David Wheeler. pg_do_encoding_conversion() does nothing when the src
encoding == the dest encoding. That means on a UTF-8 database we fail
to make sure our strings are valid UTF-8.
An easy way to see this is to embed a null in the middle of a string:
=> create or replace function zerob() returns text as $$
     return "abcd\0efg";
   $$ language plperl;
=> SELECT zerob();
abcd
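
To illustrate why the result stops at "abcd": anything that treats the bytes as a C string ends at the first NUL, while a length-aware check over the whole buffer can reject it. Below is a minimal sketch of such a check. sketch_utf8_ok() is a hypothetical, simplified stand-in written for this example, not the server's pg_verify_mbstr(); it checks sequence structure and embedded NULs only and does not reject overlong forms or surrogates.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostgreSQL): returns true only if the
 * first len bytes of s are structurally well-formed UTF-8 and contain no
 * embedded NUL byte.
 */
static bool
sketch_utf8_ok(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = s[i];
        size_t extra;           /* number of continuation bytes expected */

        if (c == 0x00)
            return false;       /* embedded NUL inside the buffer */
        else if (c < 0x80)
            extra = 0;          /* plain ASCII */
        else if ((c & 0xE0) == 0xC0)
            extra = 1;
        else if ((c & 0xF0) == 0xE0)
            extra = 2;
        else if ((c & 0xF8) == 0xF0)
            extra = 3;
        else
            return false;       /* stray continuation or invalid lead byte */

        if (extra > len - i - 1)
            return false;       /* sequence runs past the end of the buffer */

        for (size_t k = 1; k <= extra; k++)
        {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
        }
        i += extra + 1;
    }
    return true;
}
```

For the 8 bytes of "abcd\0efg", strlen() sees only 4 of them, and the sketch returns false when given the full length of 8.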
Also, it seems bogus to do any encoding conversion when we are
SQL_ASCII, and it's really trivial to fix.
With the attached:
- when we are on a utf8 database make sure to verify our output string
in sv2cstr (we assume database strings coming in are already valid)
- Do no string conversion when we are SQL_ASCII in or out
- add plperl_helpers.h as a dep to plperl.o in our makefile
- remove some redundant calls to pg_verify_mbstr()
- as utf_e2u only has one caller, don't pstrdup(); instead have the
caller check (saves some cycles and memory)
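
The per-encoding behaviour in the list above can be sketched roughly as follows. This is only an outline of the decision, not the code in the attached patch; the sk_* names are hypothetical and stand in for the real encoding identifiers and helper functions.

```c
/* Hypothetical stand-ins for the database encoding (not PostgreSQL's enum). */
typedef enum { SK_SQL_ASCII, SK_UTF8, SK_OTHER } sk_encoding;

/* What the sketch decides to do with a string crossing the boundary. */
typedef enum { SK_PASS_THROUGH, SK_VERIFY_ONLY, SK_CONVERT } sk_action;

/* Strings leaving Perl for the database (the sv2cstr direction). */
static sk_action
sk_outbound_action(sk_encoding db_encoding)
{
    switch (db_encoding)
    {
        case SK_SQL_ASCII:
            return SK_PASS_THROUGH; /* no conversion in or out */
        case SK_UTF8:
            return SK_VERIFY_ONLY;  /* conversion is a no-op, so verify explicitly */
        default:
            return SK_CONVERT;      /* convert from UTF-8 to the database encoding */
    }
}

/* Strings entering Perl from the database. */
static sk_action
sk_inbound_action(sk_encoding db_encoding)
{
    switch (db_encoding)
    {
        case SK_SQL_ASCII:
            return SK_PASS_THROUGH; /* no conversion in or out */
        case SK_UTF8:
            return SK_PASS_THROUGH; /* database strings are assumed already valid */
        default:
            return SK_CONVERT;      /* convert from the database encoding to UTF-8 */
    }
}
```

The asymmetry on a UTF-8 database is deliberate in this sketch: only output from Perl is verified, matching the assumption that strings coming from the database are already valid.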
Attachment | Content-Type | Size |
---|---|---|
plperl_utf8_mbverify.patch | text/x-patch | 4.2 KB |