Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From: richard coleman <rcoleman(dot)ascentgl(at)gmail(dot)com>
To: Nanina Tron <nanina(dot)tron(at)icloud(dot)com>
Cc: "pgadmin-support lists(dot)postgresql(dot)org" <pgadmin-support(at)lists(dot)postgresql(dot)org>
Subject: Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3
Date: 2019-01-07 15:40:53
Message-ID: CAGA3vBuicw5fm5cv+0Qv-_Q9g+mn1J+sy0=JSLS5xOYb0CfYug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgadmin-support

Nania,

Welcome to the wonderful world of pgAdmin4. I have been bitten often by
this particular shortcoming in pgAdmin4. :( My issue seems to stems from
the fact that I use ASCII tables as a back end for a .Net windows
application and perfectly valid windows (Word/Excel) characters cause
pgAdmin4 no end of issues.

My solution (with the help of some fine people on the postgres IRC channel)
is to run a couple of functions on my tables/fields to locate and clean the
offending characters out. Of course, if you *need* those characters, then
this won't actually help. Here they are in the advent that they might
prove helpful/adaptable to your situation.

Finds what pgAdmin4 considers *bad* UTF8:
CREATE OR REPLACE FUNCTION live.is_utf8(
text)
RETURNS boolean
LANGUAGE 'sql'

COST 100
VOLATILE
AS $BODY$
select encode(convert_to($1,'SQL_ASCII'),'hex')
~ $r$(?x)
^(?:(?:[0-7][0-9a-f])
|(?:(?:c[2-9a-f]|d[0-9a-f])
|e0[ab][0-9a-f]
|ed[89][0-9a-f]
|(?:(?:e[1-9abcef])
|f0[9ab][0-9a-f]
|f[1-3][89ab][0-9a-f]
|f48[0-9a-f]
)[89ab][0-9a-f]
)[89ab][0-9a-f]
)*$
$r$;
$BODY$;

ALTER FUNCTION live.is_utf8(text)
OWNER TO postgres;

Fixes what *pgAdmin4* considers to be *bad* UTF8:

> CREATE OR REPLACE FUNCTION live.badutf8(
> text)
> RETURNS text
> LANGUAGE 'sql'
> COST 100
> VOLATILE
> AS $BODY$
> select regexp_replace(encode(convert_to($1,'SQL_ASCII'),'hex'),
> $r$(?x)
> (?:(?:[0-7][0-9a-f])
> |(?:(?:c[2-9a-f]|d[0-9a-f])
> |e0[ab][0-9a-f]
> |ed[89][0-9a-f]
> |(?:(?:e[1-9abcef])
> |f0[9ab][0-9a-f]
> |f[1-3][89ab][0-9a-f]
> |f48[0-9a-f]
> )[89ab][0-9a-f]
> )[89ab][0-9a-f]
> )*(..)?
> $r$, '-\1-', 'g')
> $BODY$;
> ALTER FUNCTION live.badutf8(text)
> OWNER TO postgres;

Fixes *bad* UTF8

On Mon, Jan 7, 2019 at 8:40 AM Nanina Tron <nanina(dot)tron(at)icloud(dot)com> wrote:

> Hi,
>
> I am pretty new to PostgreSQL so I might just miss something basic here.
>
> My problem is that, I cannot import or export some of the tables in my db
> with pgAdmin4, as it raises the “ERROR: unvalid byte-sequenz for coding
> »UTF8«: 0xdf 0x67“”. The table was originally created with Excel and
> imported via pgAdmin3. The strange thing is that it can still be imported
> and exported with pgAdmin3 but not with pgAdmin4. The db was created with
> encoding UTF-8, the .csv files where created with encoding UTF-8 and also
> the import/export dialog is set to UTF-8. Queries are also no problem on
> these tables so it seems to me that this could be a client problem.
>
> I am running PostgreSQL 11.1 on a server (I don’t know the OS, maintained
> with pgAdmin4). Locally I am working on a Windows 7 Professional (Service
> Pack 1) 64 Bit-System and pgAdmin4 3.6 & pgAdmin3.
>
> I did not find any hint of the same problem on my Google or archive
> search, so I would be very grateful for any idea what I am doing wrong here.
>
> Best,
>
> Nanina
>

In response to

Responses

Browse pgadmin-support by date

  From Date Subject
Next Message Dave Page 2019-01-07 17:49:03 Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3
Previous Message Nanina Tron 2019-01-07 10:34:45 Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3