Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From: richard coleman <rcoleman(dot)ascentgl(at)gmail(dot)com>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: Nanina Tron <nanina(dot)tron(at)icloud(dot)com>, "pgadmin-support lists(dot)postgresql(dot)org" <pgadmin-support(at)lists(dot)postgresql(dot)org>
Subject: Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3
Date: 2019-01-07 19:16:57
Message-ID: CAGA3vBuueYmg5CfHXWBYETpuNS-fTw+UT2+VyBzDR7SPbxa8cA@mail.gmail.com
Lists: pgadmin-support

Dave,

Thanks for taking the time to respond, but I don't see anywhere that the
docs recommend against using SQL_ASCII. Here's the documentation listing
the supported encodings:
https://www.postgresql.org/docs/current/multibyte.html .

The *only* caveats listed for SQL_ASCII are:

> In most cases, if you are working with any non-ASCII data, it is unwise to
> use the SQL_ASCII setting because PostgreSQL will be unable to help you
> by converting or validating non-ASCII characters.

In other words, a reminder that PostgreSQL can't help with any conversions
you might want to do.

Then there's this:

> PostgreSQL will allow superusers to create databases with SQL_ASCII encoding
> even when LC_CTYPE is not C or POSIX. As noted above, SQL_ASCII does not
> enforce that the data stored in the database has any particular encoding,
> and so this choice poses risks of locale-dependent misbehavior. Using this
> combination of settings is deprecated and may someday be forbidden
> altogether.

A note that you can currently choose incompatible settings, but probably
won't be able to in the future.

And finally there's this bit of advice:

> If the client character set is defined as SQL_ASCII, encoding conversion
> is disabled, regardless of the server's character set. Just as for the
> server, *use of SQL_ASCII is unwise unless you are working with
> all-ASCII data* [emphasis mine].

Which is just a reiteration of the first caveat, that if you are using
SQL_ASCII the database won't perform any conversions on your behalf.

That is hardly a recommendation against using that supported encoding
scheme. The fact that the psql command prompt, among others, works with it
without issue is an indication that the problem lies in pgAdmin4 (and, I
would guess, in Python's reliance on UTF8) rather than in the database
itself. pgAdmin4 needs to check for and more gracefully handle *valid*
PostgreSQL data that might happen not to be UTF8 compliant.
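
To illustrate what I mean by "gracefully", here is a minimal Python sketch.
It is not pgAdmin4's actual code; the function name decode_tolerantly is
made up for the example, and it assumes the client receives raw bytes from
a SQL_ASCII database. It tries strict UTF8 first and, on failure, keeps the
offending bytes visible as \xNN escapes instead of raising an error:

```python
def decode_tolerantly(raw: bytes) -> str:
    """Decode bytes as UTF-8, escaping any bytes that are not valid UTF-8.

    Hypothetical helper for illustration only.
    """
    try:
        # Fast path: the data really is valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Fallback: keep going, rendering undecodable bytes as \xNN escapes
        # so the user can still see (and locate) them.
        return raw.decode("utf-8", errors="backslashreplace")


if __name__ == "__main__":
    print(decode_tolerantly(b"caf\xc3\xa9"))  # valid UTF-8
    print(decode_tolerantly(b"caf\xe9"))      # Latin-1 byte, not valid UTF-8
```

Whether escaping, replacing, or guessing a legacy encoding is the right
fallback is a design choice; escaping at least never loses information.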

Until then, I will have to periodically scan for and clean up *bad* UTF8
data to keep pgAdmin4 (and other JDBC-dependent code) happy. The legacy
enterprise .NET applications that depend on the database rule out
converting it to UTF8 (or anything else for that matter).
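
For anyone in the same position, the scan can be sketched roughly like
this in Python. This is only an illustration under assumptions: the name
find_invalid_utf8 is hypothetical, and it presumes you have already
fetched a column value as raw bytes (how you do that depends on your
driver). It reports the offset and content of each byte sequence that is
not valid UTF8:

```python
from typing import List, Tuple


def find_invalid_utf8(raw: bytes) -> List[Tuple[int, bytes]]:
    """Return (offset, offending_bytes) for each span that is not valid UTF-8.

    Hypothetical helper for illustration only.
    """
    problems: List[Tuple[int, bytes]] = []
    pos = 0
    while pos < len(raw):
        try:
            # If the remainder decodes cleanly, there is nothing more to report.
            raw[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as exc:
            # exc.start / exc.end are relative to the slice we tried to decode.
            start = pos + exc.start
            end = pos + exc.end
            problems.append((start, raw[start:end]))
            # Resume scanning just past the offending bytes.
            pos = end
    return problems


if __name__ == "__main__":
    for offset, bad in find_invalid_utf8(b"caf\xe9 ok"):
        print(offset, bad)
```

Re-slicing on every error is wasteful for large values with many bad
bytes, but it keeps the sketch short.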

Just my $0.02,

rik.

On Mon, Jan 7, 2019 at 1:27 PM Dave Page <dpage(at)pgadmin(dot)org> wrote:

> Hi
>
> On Mon, Jan 7, 2019 at 11:30 PM richard coleman
> <rcoleman(dot)ascentgl(at)gmail(dot)com> wrote:
> >
> > Dave,
> >
> > I can't speak to Nania's specific issue, but I believe it's a pgAdmin4
> specific problem, at least in so far as SQL_ASCII is concerned. I say this
> because I can usually work with the data just fine from the psql prompt,
> but not through pgAdmin4 (or other postgreSQL GUI's like dBeaver that rely
> on the JDBC connection). .Net/Windows ODBC drivers and psql command
> prompt, no problem (as was pgAdmin3 assuming you don't do too much with it
> beyond select/update/insert). pgAdmin4, SELECT, export, etc. BOOM! At
> least until you cleaned up the offending bytes.
> >
> > Just my $0.02.
>
> I'm afraid the fundamental problem is that you're using PostgreSQL in
> a way that the docs specifically recommend against doing, and you're
> seeing the reason why.
>
> pgAdmin 3 and 4 are completely different. In the import/export utility
> that Nania reported the issue in, pgAdmin doesn't look at the data *at
> all*. It simply executes \copy in psql, which does all the work. All
> pgAdmin does is provide connection info and options to psql, based on
> the selections made in the import/export dialogue, and executes it.
>
> In other areas of pgAdmin, like the query tool, it is possible to see
> similar issues with the same underlying cause, though we've spent a
> significant amount of time trying to work around all the possible edge
> cases.
>
> pgAdmin 3 implemented import/export itself, using underlying libraries
> that were far less strict about encoding rules than Python is. That
> may have been more convenient for this particular issue, but it's a
> lot worse in many others.
>
> As a general thought (and do bear in mind, we've spent significant
> time and resources on these issues in the past), I'd far rather spend
> time on new features and actual bugs, than further time on workarounds
> for things the PostgreSQL docs specifically advise against doing.
>
> --
> Dave Page
> Blog: http://pgsnake.blogspot.com
> Twitter: @pgsnake
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
