From: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
---|---|
To: | Keith Fiske <keith(dot)fiske(at)crunchydata(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: client_encoding issue with SQL_ASCII on 8.3 to 10 upgrade |
Date: | 2018-04-16 15:55:06 |
Message-ID: | d12e4b06-756e-5a61-f6d0-97d9cfeca991@aklaver.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 04/16/2018 08:16 AM, Keith Fiske wrote:
> Running into an issue with helping a client upgrade from 8.3 to 10 (yes,
> I know, please keep the out of support comments to a minimum, thanks :).
>
> The old database was in SQL_ASCII and it needs to stay that way for now
> unfortunately. The dump and restore itself works fine, but we're now
> running into issues with some data returning encoding errors unless we
> specifically set the client_encoding value to SQL_ASCII.
>
> Looking at the 8.3 database, it has the client_encoding value set to
> UTF8 and queries seem to work fine. Is this just a bug in the old 8.3
> not enforcing encoding properly?e
AFAIK, SQL_ASCII basically means no encoding:
https://www.postgresql.org/docs/10/static/multibyte.html
"The SQL_ASCII setting behaves considerably differently from the other
settings. When the server character set is SQL_ASCII, the server
interprets byte values 0-127 according to the ASCII standard, while byte
values 128-255 are taken as uninterpreted characters. No encoding
conversion will be done when the setting is SQL_ASCII. Thus, this
setting is not so much a declaration that a specific encoding is in use,
as a declaration of ignorance about the encoding. In most cases, if you
are working with any non-ASCII data, it is unwise to use the SQL_ASCII
setting because PostgreSQL will be unable to help you by converting or
validating non-ASCII characters."
What client are you working with?
If psql then its behavior has changed between 8.3 and 10:
https://www.postgresql.org/docs/10/static/release-9-1.html#id-1.11.6.121.3
"
Have psql set the client encoding from the operating system locale by
default (Heikki Linnakangas)
This only happens if the PGCLIENTENCODING environment variable is not set.
"
https://www.postgresql.org/docs/10/static/app-psql.html
"If both standard input and standard output are a terminal, then psql
sets the client encoding to “auto”, which will detect the appropriate
client encoding from the locale settings (LC_CTYPE environment variable
on Unix systems). If this doesn't work out as expected, the client
encoding can be overridden using the environment variable PGCLIENTENCODING."
>
> The other thing I noticed on the 10 instance was that, while the LOCALE
> was set to SQL_ASCII, the COLLATE and CTYPE values for the restored
> databases were en_US.UTF-8. Could this be having an affect? Is there any
> way to see what these values were on the old 8.3 database? The
> pg_database catalog does not have these values stored back then.
>
> --
> Keith Fiske
> Senior Database Engineer
> Crunchy Data - http://crunchydata.com
--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2018-04-16 16:09:09 | Re: client_encoding issue with SQL_ASCII on 8.3 to 10 upgrade |
Previous Message | Adrian Klaver | 2018-04-16 15:37:42 | Re: client_encoding issue with SQL_ASCII on 8.3 to 10 upgrade |