From: | Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> |
---|---|
To: | Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Add ENCODING option to COPY |
Date: | 2011-01-25 15:24:26 |
Message-ID: | AANLkTi=eAtrf06WLCRTyM=KZsL41R=UoVT4QDECc7G+V@mail.gmail.com |
Lists: | pgsql-hackers |
2011/1/25 Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>:
> On Sat, Jan 15, 2011 at 02:25, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> wrote:
>> The patch overrides client_encoding by the added ENCODING option, and
>> restores it as soon as copy is done.
>
> We cannot do that because error messages should be encoded in the original
> encoding even during COPY commands with encoding option. Error messages
> could contain non-ASCII characters if lc_messages is set.
Agreed.
>> I see some complaints ask to use
>> pg_do_encoding_conversion() instead of
>> pg_client_to_server/server_to_client(), but the former will surely add
>> slight overhead per reading line
>
> If we want to reduce the overhead, we should cache the conversion procedure
> in CopyState. How about adding something like "FmgrInfo file_to_server_covv"
> into it?
I looked into the code and found that we cannot pass an FmgrInfo * to
any function declared in pg_wchar.h, since that header file is shared
with libpq, too.
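To make the caching idea concrete, here is a rough analogy in Python rather than PostgreSQL code. CopyStateSketch and its fields are hypothetical names made up for illustration; the point is only that the converters are resolved once when the state is set up, so the per-line path does no lookup. It is not meant as a performance measurement.

```python
import codecs


class CopyStateSketch:
    """Hypothetical stand-in for CopyState; holds converters resolved once."""

    def __init__(self, file_encoding, server_encoding):
        file_codec = codecs.lookup(file_encoding)
        server_codec = codecs.lookup(server_encoding)
        # Skip conversion entirely when both names resolve to the same codec.
        self.need_conversion = file_codec.name != server_codec.name
        # Cache the callables so the per-line path performs no lookup.
        self.decode = file_codec.decode
        self.encode = server_codec.encode

    def file_to_server(self, raw_line):
        """Convert one raw input line from the file encoding to the server encoding."""
        if not self.need_conversion:
            return raw_line
        text, _ = self.decode(raw_line)
        converted, _ = self.encode(text)
        return converted


state = CopyStateSketch("euc_jp", "utf-8")
lines = ["日本語\t1\t2\n".encode("euc_jp"), b"abc\t3\t4\n"]
converted = [state.file_to_server(line) for line in lines]
```

The client encoding is never touched here, so error messages would keep their original encoding, which is the property asked for above.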
For the record, I also tried pg_do_encoding_conversion() instead of
pg_client_to_server/server_to_client(), and a simple benchmark shows
it is too slow: roughly 24% slower on average than the client_encoding
approach.
COPY FROM with 3,000,000 lines of 3 columns each (~22MB TSV):
* utf8 -> utf8 (no conversion)
  13428.233 ms
  13322.832 ms
  15661.093 ms
* euc_jp -> utf8 (client_encoding)
  17527.470 ms
  16457.452 ms
  16522.337 ms
* euc_jp -> utf8 (pg_do_encoding_conversion)
  20550.983 ms
  21425.313 ms
  20774.323 ms
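For anyone who wants to compare the runs, a small Python sketch like the following averages the three timings per case and reports each one relative to the client_encoding approach. The numbers are copied from the results above; the dictionary labels and variable names are only for illustration.

```python
from statistics import mean

# Timings in milliseconds, copied from the three runs of each case.
timings_ms = {
    "utf8 -> utf8 (no conversion)": [13428.233, 13322.832, 15661.093],
    "euc_jp -> utf8 (client_encoding)": [17527.470, 16457.452, 16522.337],
    "euc_jp -> utf8 (pg_do_encoding_conversion)": [20550.983, 21425.313, 20774.323],
}

averages = {label: mean(runs) for label, runs in timings_ms.items()}
baseline = averages["euc_jp -> utf8 (client_encoding)"]

# Relative difference of each case against the client_encoding baseline.
relative = {label: avg / baseline - 1 for label, avg in averages.items()}

for label, avg in averages.items():
    print(f"{label}: {avg:.1f} ms ({relative[label]:+.1%} vs client_encoding)")
```

With only three runs per case and one noticeably slower run in the no-conversion case, the averages should be read as a rough indication and not as a precise measurement.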
I'll look through the code further to see whether we have better alternatives.
Regards,
--
Hitoshi Harada