From: | Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com> |
---|---|
To: | Kasia Tuszynska <ktuszynska(at)esri(dot)com> |
Cc: | "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: ERROR: character 0xe3809c of encoding "UTF8" has no equivalent in EUC_JP |
Date: | 2011-03-23 01:58:48 |
Message-ID: | AANLkTikBXZpZhjZccc=jq61-u4FaZE6c_7RLvhJc1B24@mail.gmail.com |
Lists: | pgsql-bugs |
On Wed, Mar 23, 2011 at 08:05, Kasia Tuszynska <ktuszynska(at)esri(dot)com> wrote:
> I was wondering if this was considered a bug, and if so what were the plans
> to fix it: http://archives.postgresql.org/pgsql-bugs/2005-08/msg00211.php
The wave dash issue is not postgres-specific; some other converters just
replace the character with '?'. Postgres throws an error instead.
I guess the default conversions cannot support ambiguous character
mappings, but you can define more relaxed conversion procedures for
your own purposes.
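As a rough illustration of what a "relaxed" mapping could mean in practice, here is a minimal sketch that rewrites the input file before loading instead of changing the server-side conversion. It is not part of PostgreSQL. The function names are hypothetical, and the choice of FULLWIDTH TILDE (U+FF5E) as the substitute is an assumption that the target EUC_JP mapping accepts that character.

```python
# Hypothetical pre-processing step; none of these names exist in PostgreSQL.
WAVE_DASH = "\u301c"        # encoded in UTF-8 as e3 80 9c, the bytes in the error
FULLWIDTH_TILDE = "\uff5e"  # assumption: the target EUC_JP mapping accepts this


def relax_wave_dash(text: str) -> str:
    """Replace every WAVE DASH with FULLWIDTH TILDE; other text is unchanged."""
    return text.replace(WAVE_DASH, FULLWIDTH_TILDE)


def relax_file(src_path: str, dst_path: str) -> None:
    """Copy a UTF-8 file line by line, applying relax_wave_dash to each line."""
    with open(src_path, encoding="utf-8", newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(relax_wave_dash(line))
```

The rewritten file could then be loaded with a plain COPY. Note that the substitution is lossy: the original wave dashes cannot be told apart from genuine fullwidth tildes afterwards.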
BTW, we cannot use non-default conversion procedures from SQL commands,
right? If that were allowed, we could use some "relaxed" conversions
for the initial loading, like this:
=# SET character_conversion TO utf8_to_eucjp_relaxed;
=# COPY tbl FROM '/file_with_wave_dashes.utf8.tsv';
=# RESET character_conversion;
Another idea is to allow creating new encoding names and defining
the above conversion procs as their default:
=# CREATE ENCODING eucjp_relaxed;
=# CREATE DEFAULT CONVERSION xxx FOR utf8 TO eucjp_relaxed
FROM utf8_to_eucjp_relaxed;
I think an overhaul of conversion support is a TODO item.
--
Itagaki Takahiro