From: | Thomas Tignor <tptignor(at)yahoo(dot)com> |
---|---|
To: | Vijaykumar Jain <vjain(at)opentable(dot)com>, Brad Nicholson <bradn(at)ca(dot)ibm(dot)com> |
Cc: | "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: [External] postgres 9.5 DB corruption: invalid byte sequence for encoding "UTF8" |
Date: | 2019-03-26 00:25:49 |
Message-ID: | 458346306.10942574.1553559949857@mail.yahoo.com |
Lists: | pgsql-general |
Hi Brad,
Thanks for writing. As I mentioned to Vijay, the "source" is a JVM using the postgres v42.0.0 JDBC driver. I do not believe we have any explicit encoding set, and so I expect the client encoding is SQL_ASCII. The DB is most definitely UTF8. Our log shows no issue with the input data we've discovered (at the time that it's logged). If the data is somehow corrupted before inserting, won't the server encoding kick in and generate an error? We can certainly test that.
Tom :-)
On Monday, March 25, 2019, 3:56:04 PM EDT, Brad Nicholson <bradn(at)ca(dot)ibm(dot)com> wrote:
Vijaykumar Jain <vjain(at)opentable(dot)com> wrote on 03/25/2019 03:07:19 PM:
> but why do you think this is db corruption and not just bad input?
> https://github.com/postgres/postgres/blob/master/src/pl/plperl/expected/plperl_lc_1.out
This looked interesting to me in the settings below:
> client_encoding | SQL_ASCII | client
Unless you have set this explicitly, it will use the default encoding for the database. If it hasn't been explicitly set, then the source database (assuming that that output was from the source) is SQL_ASCII.
Double check the database encoding for the source database and target database. I'm wondering if you have SQL_ASCII for the source, and UTF8 for the target. If that is the case, the source can accept byte sequences that are not valid UTF8, and they will then fail to replicate to the target. That's not a Postgres problem, but an encoding mismatch.
Brad
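
To make the mismatch Brad describes concrete, here is a minimal Python sketch of the byte-level check involved. It is only an illustration, not Postgres's own validation code: the helper name `find_invalid_utf8` and the sample strings are made up for this example. A SQL_ASCII database stores bytes 128-255 without interpreting them, while a UTF8 database rejects any byte sequence that is not well-formed UTF-8, which is where the "invalid byte sequence for encoding "UTF8"" error would come from.

```python
def find_invalid_utf8(raw: bytes):
    """Return (offset, byte value) of the first invalid UTF-8 sequence, or None.

    Hypothetical helper for illustration; it relies only on Python's
    built-in UTF-8 decoder, not on anything from Postgres.
    """
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first offending byte
        return exc.start, raw[exc.start]
    return None


# Sample payloads (made up): the same word encoded two different ways.
samples = {
    "plain ASCII": b"hello",
    "UTF-8 encoded": "caf\u00e9".encode("utf-8"),
    "Latin-1 encoded": "caf\u00e9".encode("latin-1"),
}

for label, raw in samples.items():
    # A SQL_ASCII database would store all three unchanged;
    # a UTF8 database would reject any payload reported as invalid here.
    print(label, raw, find_invalid_utf8(raw))
```

Running a check like this over the raw bytes of a suspect value (for example, as captured in the application log) would show whether the data was already not valid UTF-8 before it reached the UTF8 database.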