Re: Chars problem restoring to ps 8.4 (utf8) a dumped db from ps 8.1 (latin9)

From: Martín Marqués <martin(dot)marques(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: Bianchi Quota Leonardo <leonardo(dot)bianchiquota(at)insiel(dot)it>, "'pgsql-general(at)postgresql(dot)org'" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Chars problem restoring to ps 8.4 (utf8) a dumped db from ps 8.1 (latin9)
Date: 2015-08-13 11:45:35
Message-ID: 55CC835F.20300@2ndquadrant.com
Lists: pgsql-general

On 12/08/15 at 11:12, Tom Lane wrote:
> Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> writes:
>> On 08/12/2015 06:46 AM, Bianchi Quota Leonardo wrote:
>> Hi, I'm trying to move a db from postgres 8.1 encoded LATIN9 from a
>> debian 4.0 box to postgres 8.4 encoded UTF8 on a rh6.6 (the whole job
>> is to dismiss the old server, migrate and upgrade bugzilla application)
>
> FYI, 8.4 is no longer community supported. The oldest supported version
> is 9.0 and its support will end in September. See here for more details:
>
> http://www.postgresql.org/support/versioning/

I think you should move even further ahead than 9.0 or 9.1, so as to
avoid having to plan another upgrade any time soon.

>>> I "SOLVED" it this way, but I don't know what I did or what consequences it may have in the future, so I need to know whether it's ok...
>>>
>>> Starting on BOX1
>>> $pg_dump --no-privileges --no-owner -h localhost -U bugs -f DB.sql (dump in latin9)
>>>
>>> $vi DB.sql and replaced the first string with the second.
>>> > SET client_encoding = 'LATIN9';
>>> < SET client_encoding = 'utf8';

client_encoding tells the PostgreSQL server which encoding the data it
receives from the client is in.

If the data was correctly inserted in the old database with LATIN9, and
you didn't change the client_encoding, then pg_dump will output data
with LATIN9 encoding.
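
As a rough illustration of what that setting means at the byte level,
here is a minimal Python sketch. It assumes Python's codecs as a stand-in
for the server's own conversion routines, and the sample string is made
up. With client_encoding left at LATIN9, a UTF8 server converts the
incoming bytes before storing them:

```python
# Sketch only: Python's codecs stand in for PostgreSQL's conversion routines.
# The sample text is hypothetical, not taken from the real dump.
dump_text = "Citroën costs 10 €"

# What a dump contains when it is written in LATIN9 (ISO 8859-15):
# one byte per character, including the euro sign.
latin9_bytes = dump_text.encode("iso8859_15")

# With SET client_encoding = 'LATIN9' left untouched, a UTF8 database
# is told the input is LATIN9 and converts it to UTF-8 for storage.
stored_bytes = latin9_bytes.decode("iso8859_15").encode("utf-8")

print(latin9_bytes)
print(stored_bytes)
print(stored_bytes.decode("utf-8") == dump_text)
```

The text survives the round trip because the bytes were labelled with
the encoding they are really in.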

> It does not seem likely to me that this would work at all. You're taking
> a dump file that is full of LATIN9 data and simply asserting that it's
> UTF8 data. That doesn't make it so. If it seemed to work, maybe that's
> because your editor changed the encoding? Not to be relied on, for sure.

Well, IIRC a LATIN9-encoded char which is interpreted as UTF8 will get
inserted with no error on a UTF8 server (although the final data will be
bogus).
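
To make the mislabelling concrete, here is a small Python sketch. It
uses Python's strict UTF-8 decoder as a stand-in, which is an assumption:
PostgreSQL's own validation may behave differently depending on the
version. The helper name read_as_utf8 is made up for the example. Some
LATIN9 byte sequences happen to be valid UTF-8 and come out as a
different character, while others are not valid UTF-8 at all:

```python
def read_as_utf8(latin9_bytes: bytes) -> str:
    """Interpret LATIN9 bytes as if they were UTF-8, as the edited dump claims."""
    try:
        return latin9_bytes.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Python's strict decoder refuses byte sequences that are not UTF-8.
        return f"rejected: {exc.reason}"

# "é" in LATIN9 is the single byte 0xE9, which is not valid UTF-8 on its own.
lone = "é".encode("iso8859_15")

# "Ã©" in LATIN9 is the byte pair 0xC3 0xA9, which happens to be the
# UTF-8 encoding of "é": it is accepted, but as a different character.
pair = "Ã©".encode("iso8859_15")

print(read_as_utf8(lone))
print(read_as_utf8(pair))
```

Either way the stored text no longer matches what was in the old
database, so relabelling the dump is not a substitute for converting it.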

IMO Leonardo is confused about the meaning of client_encoding, and should
maybe take a look here before continuing:

http://www.postgresql.org/docs/9.4/static/multibyte.html

And while reading that, they can switch to 9.4. ;)

Regards,

--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
