Re: Hex characters in COPY input

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Melvin Call <melvincall979(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Hex characters in COPY input
Date: 2015-02-27 16:07:43
Message-ID: 54F0964F.4030003@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 02/27/2015 07:55 AM, Melvin Call wrote:
> On Fri, Feb 27, 2015 at 9:03 AM, Adrian Klaver
> <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>> On 02/27/2015 06:39 AM, Melvin Call wrote:
>>>
>>> On 2/26/15, Vick Khera <vivek(at)khera(dot)org> wrote:
>>>>
>>>> On Thu, Feb 26, 2015 at 9:50 PM, Melvin Call <melvincall979(at)gmail(dot)com>
>>>> wrote:
>>>>
>>>>> Montreal where the e is an accented e. The output ends up in the text
>>>>> file
>>>>> as
>>>>> Montr\xe9al, where the xe9 is a single character. When I try to copy
>>>>> that
>>>>> into
>>>>> my PostgreSQL table, I get an error "ERROR: invalid byte sequence for
>>>>> encoding
>>>>>
>>>>
>>>> Character code E9 is not UTF8. Don't tell Postgres you're importing UTF8
>>>> if
>>>> you're not.
>>>>
>>> Thank you Vic, adding the ENCODING 'latin1' option to the COPY command
>>> worked
>>> perfectly.
>>>
>>> If you don't mind a follow up to your reply, I have tried to understand
>>> the
>>> different character sets and collations, but I guess I still have a lot to
>>> learn. Your suggestion did not even come close to crossing my mind because
>>> the
>>> MySQL table and database are encoded in UTF8. I assume the conversion to
>>> latin1
>>> happened because I was putting the MySQL query output into a locally
>>> stored
>>> text file? Regardless, can you point me to some reading that would have
>>> clued
>>> me in that e9 is not a UTF8 character? Or is the clue the fact that it was
>>> not
>>> preceeded with 0x00?
>>
>>
>> For UTF8 characters see here:
>>
>> http://www.utf8-chartable.de/
>
> Thank you for the link. Bookmarked.
>
>>
>>
>> For the MySQL part, you are going to detail how you got the data out?
>
> This is in preparation of moving away from MySQL. I inherited this MySQL
> database and it is in horrible shape, no referential integrity, no constraints
> other than arbitrarily chosen VARCHAR lengths, no indexes, and inconsistent
> entity and attribute naming. I have to pull the fields out through a query that
> is being redirected to a local file, as opposed to a dump, because I'm having
> to filter out a lot of useless rows (TestCity in a production system!?). I just
> realized I could have put the usable data into similar tables and then used
> mysqldump with the encoding specified, but no need now. I have my extraction
> and import done, so this should be the last I need to touch the MySQL system.

Gotcha, I recently did something similar.

>
>>
>>>
>>> Regards,
>>> Melvin
>>>
>>>
>>
>>
>> --
>> Adrian Klaver
>> adrian(dot)klaver(at)aklaver(dot)com
>
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Greg Sabino Mullane 2015-02-27 16:51:10 Re: Performance on DISABLE TRIGGER
Previous Message MattF 2015-02-27 15:59:01 Re: Triggers on foreign Postgres 9.3 tables in Postgres 9.4