From: | Derrick Rice <derrick(dot)rice(at)gmail(dot)com> |
---|---|
To: | BRUSSER Michael <Michael(dot)BRUSSER(at)3ds(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: invalid byte sequence for encoding "UTF8" |
Date: | 2011-06-02 21:16:52 |
Message-ID: | BANLkTin8OFMfAmd704OpU7wXnx81qwz4vA@mail.gmail.com |
Lists: | pgsql-general |
That specific character sequence is a result of Unicode implementations
prior to 6.0 mixing with later implementations. See here:
http://en.wikipedia.org/wiki/Specials_%28Unicode_block%29#Replacement_character
You could replace that sequence with the correct U+FFFD replacement
character (bytes 0xEFBFBD in UTF-8), for example with `sed` if you are
using a plaintext dump format.
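A minimal sketch of that replacement, written in Python rather than sed and
assuming a plaintext dump. The function names and file names are hypothetical,
and it only handles the one sequence reported in the error message:

```python
# Sketch only: names and paths here are hypothetical, not from the thread.
BAD_SEQUENCE = b"\xed\xbe\xbf"              # bytes reported in the psql error
REPLACEMENT = "\ufffd".encode("utf-8")      # U+FFFD, i.e. b"\xef\xbf\xbd"


def clean_dump_bytes(data: bytes, bad: bytes = BAD_SEQUENCE) -> bytes:
    """Replace every occurrence of the invalid sequence with U+FFFD."""
    return data.replace(bad, REPLACEMENT)


def clean_dump_file(src_path: str, dst_path: str) -> None:
    """Copy a plaintext dump line by line, patching the bad sequence.

    Binary mode is used so that no decoding is attempted on the damaged
    input. The bad sequence contains no newline byte, so it can never be
    split across two lines.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            dst.write(clean_dump_bytes(line))
```

Something like clean_dump_file("old_dump.sql", "clean_dump.sql") would then
produce a file to source with psql. Any other invalid sequences in the dump
would still need their own handling.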
On Thu, Jun 2, 2011 at 4:17 PM, BRUSSER Michael <Michael(dot)BRUSSER(at)3ds(dot)com> wrote:
> We are upgrading an old database (7.3.10 to 8.4.4). This involves
> running pg_dump on the old db and loading the datafile into the new db.
> If it matters, we do not use pg_restore; the dump file is just sourced
> with psql, and this is where I ran into a problem:
>
> psql: .../postgresql_archive.src/... ERROR: invalid byte sequence for
> encoding "UTF8": 0xedbebf
> HINT: This error can also happen if the byte sequence does not match the
> encoding expected by the server, which is controlled by "client_encoding".
>
> The server and client encoding are both Unicode. I think we may have some
> copy/pasted MS-Word markup and possibly other odd things in the old
> database. All this junk is found in the ‘text’ fields.
>
> I found a number of related postings, but did not see a good solution.
> Some folks suggested cleaning the datafile prior to loading, while
> someone else did essentially the same thing on the database before
> dumping it.
>
> I am looking for advice, hopefully the “best technique” if there is one;
> any suggestion is appreciated.
>
> Thanks,
> Michael.