Re: Fwd: Re: Re: Postgresql 9.4.4 - ERROR: invalid byte sequence for encoding "UTF8": 0x92

From: Prasanth Reddy <dbadmin(at)nqadmin(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org, John R Pierce <pierce(at)hogranch(dot)com>
Subject: Re: Fwd: Re: Re: Postgresql 9.4.4 - ERROR: invalid byte sequence for encoding "UTF8": 0x92
Date: 2015-08-09 18:44:51
Message-ID: 55C79FA3.1020103@nqadmin.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

I am able to restore the same dump in 9.3 and it works fine. So seems like 9.4 is much more stringent than all prior versions.

Thanks,
Prasanth

On 08/08/2015 05:34 PM, Prasanth Reddy wrote:
> 1. If postgresql always does this I am not sure how it is working in 9.1.
>
> 2. I am able to view the content in pgadmin, why would the server not complain when viewing from pgadmin. Not sure what the client encoding would be when using pgadmin.
>
> 3. Thought SQL_ASCII means the text is stored as ASCII format, may be I am way off base on this. Is there a way to check the notes column for invalid UTF8 characters and remove them?
>
> 4. Based on the error there is invalid data in database so the database should have allowed this invalid data to come in some how right?
>
> Any suggestions as to how to get this working in 9.4? Right now the database is in 9.1 and working fine so may be there is a way to fix these issues before doing a dump for 9.4?
>
> Thanks,
> Prasanth
>
> On 8/8/2015 2:40 PM, Prasanth Reddy wrote:
> > 1. Is the content vetting new in 9.4?
>
> no, PostgreSQL has always been strict about content types.
>
> > 2. Is there an option to ask the driver not to vet the content? Some times when the users copy notes from a word document there will be some characters invalid in UTF8. Most of these should only be in
> > notes columns.
>
> the driver isn't doing this, its happening in the postgres
> database. you should fix invalid input at the source application
> before sending to the database.
>
> > 3. Is there a way to check the content and fix it/remove invalid characters?
>
> define 'invalid character', then fix it. this SQL_ASCII field that
> contains 0x92, what character code is that supposed to represent ?
>
> > 4. Is there an option to force the driver to not update database with invalid characters?
>
> the database (NOT the driver) does that already. if you try and insert
> invalid data, you get an error and have to rollback the transaction.
>
>
> --
> john r pierce, recycling bits in santa cruz
>
> On 08/08/2015 04:40 PM, Prasanth Reddy wrote:
>>
>>
>>
>> -------- Forwarded Message --------
>> Subject: Re: Re: Postgresql 9.4.4 - ERROR: invalid byte sequence for encoding "UTF8": 0x92
>> Date: Sat, 08 Aug 2015 16:39:17 -0500
>> From: Prasanth Reddy <dbadmin(at)nqadmin(dot)com>
>> Reply-To: dbadmin(at)nqadmin(dot)com
>> To: Bosco Rama <postgres(at)boscorama(dot)com>
>>
>>
>>
>> 1. Is the content vetting new in 9.4?
>> 2. Is there an option to ask the driver not to vet the content? Some times when the users copy notes from a word document there will be some characters invalid in UTF8. Most of these should only be in
>> notes columns.
>> 3. Is there a way to check the content and fix it/remove invalid characters?
>> 4. Is there an option to force the driver to not update database with invalid characters?
>>
>> Appreciate your help.
>>
>> Thanks,
>> Prasanth
>>
>> ------------------Original Message--------------------------
>>
>> On 08/08/15 10:49, Prasanth Reddy wrote:
>> >
>> [snip]
>> > 12:42:02.505 (1) <=BE ParameterStatus(client_encoding = UTF8)
>> [snip]
>> > 12:42:02.506 (1) <=BE ParameterStatus(server_encoding = SQL_ASCII)
>>
>> There's your problem. The client is expecting UTF8 while the server
>> is storing un-vetted SQL_ASCII. You will need to change one of the
>> components' choice of encoding and/or enforce content conformance in the
>> data. Another choice, in cases of known and expected non-compliance, is
>> to use bytea casting to bypass the automated content vetting. But your
>> choice in this case is going to be driven by the ease with which you can
>> alter your application vs the server-side changes that would be needed.
>>
>> HTH,
>> Bosco.
>>
>>
>>
>

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Gabriel E. Sánchez Martínez 2015-08-10 02:08:03 Bad value for type int
Previous Message Mark Rotteveel 2015-08-09 09:49:45 Re: Re: Fwd: Re: Re: Postgresql 9.4.4 - ERROR: invalid byte sequence for encoding "UTF8": 0x92