| From: | sunpeng <bluevaley(at)gmail(dot)com> | 
|---|---|
| To: | Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> | 
| Cc: | Kevin Grittner <kgrittn(at)ymail(dot)com>, PostgreSQL general <pgsql-general(at)postgresql(dot)org> | 
| Subject: | Re: Migration error " invalid byte sequence for encoding "UTF8": 0xff " from mysql 5.5 to postgresql 9.1 | 
| Date: | 2014-07-04 09:12:46 | 
| Message-ID: | CAOYKhLpgtsNmTO=PRfW47VKL4joJJE-oFDGvHau9LeNpCFDsVg@mail.gmail.com | 
| Lists: | pgsql-general | 
Thank you, friend. I used --hex-blob to dump the MySQL data:
mysqldump -v -nt --complete-insert=TRUE --compatible=postgresql
--default-character-set=utf8 --skip-add-locks --compact --no-create-info
--skip-quote-names --hex-blob -uroot -p test videorecresult >dbdata.sql
Then I replaced each blob value "0x...." with "E'\\x....'" so that the data
loads into PostgreSQL.
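
The replacement step can be scripted. The following is only a minimal sketch: the function name convert_hex_blobs is hypothetical, and it assumes every 0x... token in the dump is a --hex-blob value rather than text inside a quoted string, which a real dump may not guarantee.

```python
import re

# Matches MySQL hex literals such as 0xFFD8FFE0 as written by --hex-blob.
# Assumption: no quoted string in the dump contains a token of this shape.
HEX_LITERAL = re.compile(r"\b0x([0-9A-Fa-f]+)\b")


def convert_hex_blobs(sql: str) -> str:
    """Rewrite 0x... literals as PostgreSQL E'\\\\x...' bytea hex literals."""
    # The Python literal "E'\\\\x" produces the five characters E ' \ \ x.
    return HEX_LITERAL.sub(lambda m: "E'\\\\x" + m.group(1) + "'", sql)


if __name__ == "__main__":
    line = "INSERT INTO videorecresult (id, Picture) VALUES (1,0xFFD8FFE0);"
    print(convert_hex_blobs(line))
```

Applied to every line of dbdata.sql, this gives a file that can be fed to psql as before.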
On Fri, Jul 4, 2014 at 3:27 PM, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
wrote:
> sunpeng wrote:
> >>> load data to postgresql in cmd (encoding is GBK) in WIN8:
> >>>
> >>> psql -h localhost  -d test -U postgres <  dbdata.sql
> >>>
> >>> I got the error:
> >>> ERROR:  invalid byte sequence for encoding "UTF8": 0xff
>
> >> If the encoding is GBK then you will get errors (or incorrect
> >> characters) if it is read as UTF8.  Try setting the environment
> >> variable PGCLIENTENCODING.
> >>
> >> http://www.postgresql.org/docs/9.1/static/app-psql.html
>
> > I've changed cmd (in Win8) to UTF-8 encoding through chcp 65001, but the error
> still occurs.
> > And I use the following command to dump the MySQL data:
> > mysql> select Picture from personpicture where id =
> 'F2931306D1EE44ca82394CD3BC2404D4'  into outfile
> > "d:\\1.txt" ;
> > I got the ansi file, and use Ultraedit to see first 16 bytes:
> > FF D8 FF E0 5C 30 10 4A 46 49 46 5C 30 01 01 5C
> >
> > It's different from what MySQL Workbench shows:
> > FF D8 FF E0 00 10 4a 46 49 46 00 01 01 00 00 01
>
> Changing the terminal code page won't do anything, it's probably the data
> that are in a different encoding.
>
> I don't know enough about MySQL to know which encoding it uses when
> dumping data,
> but the man page of "mysqldump" tells me:
>
>   --set-charset
>   Add SET NAMES default_character_set to the output. This option is
> enabled by default.
>
> So is there a SET NAMES command in the dump? If yes, what is the argument?
>
> You will have to tell PostgreSQL the encoding of the data.
> As Kevin pointed out, you can do that by setting the environment variable
> PGCLIENTENCODING to the correct value.  Then PostgreSQL will convert the
> data automatically.
>
> Yours,
> Laurenz Albe
>
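
To illustrate why the error names 0xff: the bytes quoted above from MySQL Workbench begin with FF D8 FF E0, which is binary image data, and 0xFF can never occur in valid UTF-8, so no client encoding setting makes it loadable as text. A small sketch, with variable names chosen only for illustration:

```python
# The first 16 bytes of the Picture column as shown by MySQL Workbench.
jpeg_header = bytes.fromhex("FF D8 FF E0 00 10 4a 46 49 46 00 01 01 00 00 01")

try:
    jpeg_header.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as exc:
    decoded_ok = False
    # exc.start is the offset of the first byte the decoder rejected.
    bad_byte = jpeg_header[exc.start]

# Hex-encoding the bytes gives plain ASCII text that is safe in any encoding.
pg_literal = "E'\\\\x" + jpeg_header.hex() + "'"

if __name__ == "__main__":
    print(decoded_ok, hex(bad_byte))
    print(pg_literal)
```

This is why dumping the column as hex and loading it as a bytea literal avoids the problem.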