From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: COPY ENCODING revisited |
Date: | 2011-02-17 18:57:29 |
Message-ID: | AANLkTik7AYu7Zz8yQ4vk5LArqdi1gR4rd=QLOU3Tt5q0@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Feb 16, 2011 at 10:45 PM, Itagaki Takahiro
<itagaki(dot)takahiro(at)gmail(dot)com> wrote:
> COPY ENCODING patch was returned with feedback,
> https://commitfest.postgresql.org/action/patch_view?id=501
> but we still need it for file_fdw. Using client_encoding at runtime
> is reasonable for one-time COPY command, but logically nonsense for
> persistent file_fdw tables.
>
> Based on the latest patch,
> http://archives.postgresql.org/pgsql-hackers/2011-01/msg02903.php
> I added pg_any_to_server() and pg_server_to_any() functions instead of
> exposing FmgrInfo in pg_wchar.h. They are the same as pg_client_to_server()
> and pg_server_to_client(), but accept any encoding. They use cached
> conversion procs only if the specified encoding matches the client encoding.
>
> According to Harada's research,
> http://archives.postgresql.org/pgsql-hackers/2011-01/msg02397.php
> non-cached conversions are slower than cached ones. This version provides
> the same performance as before when the file and client encodings are the
> same, but would be a bit slower in other cases. We could improve the
> performance in future versions, for example, by caching each used conversion
> proc in pg_do_encoding_conversion().
>
> file_fdw will support an ENCODING option. Also, if not specified it might
> have to store the client_encoding at CREATE FOREIGN TABLE. Even if we use
> a different client_encoding at SELECT, the encoding at definition is used.
>
> ENCODING 'quoted name' issue is also fixed; it always requires quoted names.
> I think we only accept non-quoted text as identifier names. Unquoted text
> should be treated as "double quoted", but encoding names are not identifiers.

I am not qualified to fully review this patch because I'm not all that
familiar with the encoding stuff, but it looks reasonably sensible on
a quick read-through. I am supportive of making a change in this area
even at this late date, because it seems to me that if we're not going
to change this then we're pretty much giving up on having a usable
file_fdw in 9.1. And since postgresql_fdw isn't in very good shape
either, that would mean we may as well give up on SQL/MED. We might
have to do that anyway, but I don't think we should do it just because
of this issue, if there's a reasonable fix.

I don't think the fact that the performance bites is a reason not to
do this. As you say, that can always be improved in the future.
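
For readers following the quoted description, the dispatch in
pg_any_to_server() can be pictured roughly as below. This is only a toy
model in Python, not the patch: the class name EncodingConverter, the
lookups counter, and the use of Python codecs in place of PostgreSQL
conversion procs are all assumptions made for illustration. It shows the
one behavior described above: the cached conversion is used only when the
requested encoding matches the client encoding, and any other encoding
pays for a fresh lookup on every call.

```python
from typing import Callable

Converter = Callable[[bytes], bytes]


class EncodingConverter:
    """Toy model of the cached vs. non-cached dispatch (hypothetical names)."""

    def __init__(self, server_encoding: str, client_encoding: str) -> None:
        self.server_encoding = server_encoding
        self.client_encoding = client_encoding
        self.lookups = 0  # counts simulated conversion-proc lookups
        # Looked up once and kept, like the client-to-server conversion proc.
        self._cached_to_server = self._lookup(client_encoding, server_encoding)

    def _lookup(self, src: str, dst: str) -> Converter:
        """Stand-in for finding a conversion proc; assumed to be the slow part."""
        self.lookups += 1

        def convert(data: bytes) -> bytes:
            return data.decode(src).encode(dst)

        return convert

    def any_to_server(self, data: bytes, encoding: str) -> bytes:
        """Convert data from an arbitrary encoding to the server encoding."""
        if encoding == self.server_encoding:
            # Simplification: nothing to convert, so return the input as-is.
            return data
        if encoding == self.client_encoding:
            # Fast path: reuse the cached conversion.
            return self._cached_to_server(data)
        # Slow path: a new lookup on every call, since nothing is cached.
        return self._lookup(encoding, self.server_encoding)(data)
```

With a converter built as EncodingConverter("utf-8", "latin-1"), repeated
calls for "latin-1" input leave the lookups counter unchanged, while each
call for "shift_jis" input increments it. Caching per-encoding procs, as
suggested in the quoted text, would remove that repeated cost.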
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company