From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Hu, Patricia" <Patricia(dot)Hu(at)finra(dot)org> |
Cc: | "pgsql general (pgsql-general(at)postgresql(dot)org)" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: loading file with en dash character into postgres 9.6.1 database |
Date: | 2017-07-11 20:15:34 |
Message-ID: | 7517.1499804134@sss.pgh.pa.us |
Lists: | pgsql-general |
"Hu, Patricia" <Patricia(dot)Hu(at)finra(dot)org> writes:
> The server and client encoding are both set to UTF8, and according to this http://www.fileformat.info/info/unicode/char/2013/index.htm en dash is a valid UTF8 character, but when running a script with insert statement with en dash character in it, I got the error below.
> psql:activity_type.lst:379: ERROR: invalid byte sequence for encoding "UTF8": 0x96
Well, a lone 0x96 byte certainly isn't valid UTF8 (an en dash in UTF8 is the three bytes 0xE2 0x80 0x93), so your script file isn't in UTF8.
> If I set client_encoding to WIN1252, the same file will be run ok
0x96 does seem to be an en-dash in WIN1252, so this is probably the
appropriate fix. Testing here says that PG will correctly convert
0x96 in WIN1252 to an en-dash (U+2013) in UTF8, so I think you are
getting the right thing inserted.
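The byte-level claims here can be checked outside Postgres. What follows is only a sketch, assuming Python's built-in "cp1252" codec agrees with Postgres's WIN1252 conversion for this byte; the helper name describe_byte is hypothetical and exists just for illustration.

```python
def describe_byte(raw: bytes) -> dict:
    """Report how a byte string behaves under UTF-8 and Windows-1252."""
    result = {}

    # A script file that really is UTF-8 must decode cleanly as UTF-8.
    try:
        raw.decode("utf-8")
        result["valid_utf8"] = True
    except UnicodeDecodeError:
        result["valid_utf8"] = False

    # Interpret the same bytes as Windows-1252 instead.
    # (Python's codec rejects the few bytes cp1252 leaves undefined;
    # 0x96 is not one of them.)
    text = raw.decode("cp1252")
    result["cp1252_codepoints"] = [f"U+{ord(ch):04X}" for ch in text]

    # The bytes a UTF8 database would store for that text.
    result["utf8_bytes"] = text.encode("utf-8").hex()
    return result


info = describe_byte(b"\x96")
print(info)
```

If the assumption holds, the byte is rejected as UTF-8, maps to U+2013 under Windows-1252, and becomes a three-byte sequence once re-encoded as UTF-8.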
> but afterwards the en dash character showed up as "û", instead of the en dash character "-"
This indicates that your terminal program does *not* think its encoding
is WIN1252. Having loaded that script file, you need to revert
client_encoding to whatever your terminal program is using, or non-ASCII
characters are going to be displayed wrong.
A bit of poking around suggests that your terminal may be operating
with code page 437 or similar, as 0x96 is "û" in that encoding ---
according to Wikipedia, at least:
https://en.wikipedia.org/wiki/Code_page_437
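The display symptom can be reproduced in the same spirit. This is a rough sketch, assuming Python's "cp1252" and "cp437" codecs stand in for what the server sends under client_encoding WIN1252 and for what the terminal draws; it does not talk to Postgres or to a real terminal.

```python
EN_DASH = "\u2013"

# With client_encoding set to WIN1252, the en dash reaches the client
# as a single byte.
wire_bytes = EN_DASH.encode("cp1252")

# A terminal that believes it is using code page 437 draws that byte
# as a different character.
shown = wire_bytes.decode("cp437")

print(wire_bytes.hex(), shown)
```

The stored data is unaffected in this scenario; only the interpretation of the byte on the way to the screen differs.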
I don't think Postgres supports that as a client_encoding setting,
so one way or another you're going to need to switch the terminal
program's character set setting.
regards, tom lane