Re: Locale/encoding problem/question

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	henka(at)cityweb(dot)co(dot)za
Cc:	"Martijn van Oosterhout" <kleptog(at)svana(dot)org>, pgsql-general(at)postgresql(dot)org
Subject:	Re: Locale/encoding problem/question
Date:	2006-08-04 12:59:51
Message-ID:	5345.1154696391@sss.pgh.pa.us
Lists:	pgsql-general

henka(at)cityweb(dot)co(dot)za writes:
>> It should be in the dump file, almost the first line. Locale is of no
>> interest to pg_dump, you'll have to decide how you want it.

> Yes: UTF-8 and the other is LATIN1

Note that this represents what the original server *thought* the
encoding was. But it's not at all impossible that the server thought
the data was LATIN1 when it was really UTF8. (The other way around is
less plausible because the server would have been able to detect
encoding errors.) If you were using clients that treated the data
as UTF8 without paying attention to what the server thought, you'd
not have realized you were mislabeling the data.

But, if you tried to load data marked as LATIN1 into a server using
UTF8, it'd have applied a LATIN1 to UTF8 conversion, and then
everything's hosed.
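
To make the failure concrete, here is a minimal Python sketch of what a LATIN1 to UTF8 conversion does to bytes that were already UTF8. The helper name `latin1_to_utf8` and the sample string are illustrative assumptions; this only mimics the effect and is not the server's actual conversion code.

```python
# Illustration only: mimic a LATIN1-to-UTF8 conversion applied to bytes
# that are in fact already UTF8 (the mislabeled-data case).

def latin1_to_utf8(raw: bytes) -> bytes:
    """Treat every input byte as one LATIN1 character and re-encode as UTF8."""
    return raw.decode("latin-1").encode("utf-8")

original = "caf\u00e9"                  # "café"
utf8_bytes = original.encode("utf-8")   # the é becomes the two bytes C3 A9

# Each of those two bytes is read as a separate LATIN1 character and is
# expanded to two bytes of its own, so the one character becomes four bytes.
double_encoded = latin1_to_utf8(utf8_bytes)

print(utf8_bytes.hex(" "))              # bytes before the conversion
print(double_encoded.hex(" "))          # bytes after the conversion
print(double_encoded.decode("utf-8"))   # what a UTF8 client now displays
```

The result is still valid UTF8, so nothing raises an error at load time; the damage only shows up as garbled accented characters when the data is read back.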

I'd suggest actually inspecting the data in the dump file: it's not that
hard to tell UTF8 from LATIN1 if you look at the byte sequences.
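
One rough way to do that inspection programmatically, sketched in Python. The function name `guess_encoding` and its three labels are assumptions made for this example. The check relies on the fact that non-ASCII LATIN1 text is very unlikely to form valid UTF8 multibyte sequences by accident, so a strict UTF8 decode is a reasonable test.

```python
def guess_encoding(raw: bytes) -> str:
    """Classify a chunk of bytes as 'ascii', 'utf8', or 'not-utf8'.

    'not-utf8' means the bytes contain high-bit characters that do not
    form valid UTF8 sequences, which is what LATIN1 text with accented
    characters normally looks like.
    """
    if all(b < 0x80 for b in raw):
        return "ascii"       # pure ASCII is identical in both encodings
    try:
        raw.decode("utf-8")  # strict decode: raises on any invalid sequence
    except UnicodeDecodeError:
        return "not-utf8"
    return "utf8"


print(guess_encoding("caf\u00e9".encode("utf-8")))    # é stored as C3 A9
print(guess_encoding("caf\u00e9".encode("latin-1")))  # é stored as the single byte E9
```

To apply it to a dump, read the file in binary mode and pass its contents (or each line) to `guess_encoding`. Lines that come back as "ascii" tell you nothing either way; only the lines with high-bit bytes are informative.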

Or you could just take the file marked LATIN1, edit it to change the
client_encoding setting to say the data is UTF8, and see if you can
load it. If it's not UTF8, 8.1.4 will almost certainly detect a ton of
encoding errors.
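
If you would rather script that edit than do it by hand, a sketch along these lines may help. It assumes the dump contains a line of the form `SET client_encoding = 'LATIN1';` near the top; check your own file for the exact wording. The function name `relabel_client_encoding` is made up for this example. It works on bytes so the data itself is never decoded or re-encoded.

```python
import re

# Assumed form of the line in the dump: SET client_encoding = 'LATIN1';
_CLIENT_ENCODING = re.compile(
    rb"^(SET client_encoding = ')[^']*(';)", re.MULTILINE
)


def relabel_client_encoding(dump: bytes, new_encoding: str = "UTF8") -> bytes:
    """Return the dump with its first client_encoding setting changed.

    Only the label is rewritten; every other byte is passed through
    untouched. If no matching line is found, the input is returned as is.
    """
    replacement = rb"\g<1>" + new_encoding.encode("ascii") + rb"\g<2>"
    return _CLIENT_ENCODING.sub(replacement, dump, count=1)


sample = b"--\n-- PostgreSQL database dump\n--\n\nSET client_encoding = 'LATIN1';\n"
print(relabel_client_encoding(sample).decode("ascii"))
```

Write the result to a new file rather than overwriting the original dump, so the unmodified copy is still there if the load fails.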

regards, tom lane
