Re: [SPAM]-D] How to find broken UTF-8 characters ?

From: Andreas <maps(dot)on(at)gmx(dot)net>
To: silly sad <sad(at)bankir(dot)ru>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: [SPAM]-D] How to find broken UTF-8 characters ?
Date: 2010-04-26 12:41:36
Message-ID: 4BD58A00.7030905@gmx.net
Lists: pgsql-sql

On 26.04.2010 12:12, silly sad wrote:
> On 04/26/10 04:12, Andreas wrote:
>
> looks like a complete offtopic
Not anymore. The bad characters are in the DB now.

I need some command that filters for data that cannot be converted
(Unicode --> local charset).
How can I find those Unicode characters that have already sneaked in?

Actually, there shouldn't be anything in the tables that NEEDS to be
encoded in Unicode.

something like
SELECT * FROM tab_1 WHERE field_x <> ConvertToLocal(field_x)
might be a good start.
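
A minimal sketch of that filter idea, done client-side in Python rather
than in SQL. It assumes the "local charset" is Latin-1 (the thread does
not say which one it is) and that the rows have already been fetched as
dictionaries. The names LOCAL_CHARSET, unconvertible_chars and
filter_rows are made up for illustration.

```python
# Assumption: the local charset is Latin-1; swap in cp1252 or whatever applies.
LOCAL_CHARSET = "latin-1"


def unconvertible_chars(value, charset=LOCAL_CHARSET):
    """Return the characters of value that cannot be encoded in charset."""
    bad = []
    for ch in value:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


def filter_rows(rows, column, charset=LOCAL_CHARSET):
    """Yield only the rows whose column holds an unconvertible character."""
    for row in rows:
        if unconvertible_chars(row[column], charset):
            yield row
```

Feeding the result of a plain "SELECT * FROM tab_1" through
filter_rows(rows, "field_x") would then list the suspicious rows, and
unconvertible_chars shows which characters are the offenders.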

>> How can I get rid of them?
> iconv -c
AFAIK iconv translates at the file level, and I would think that it
would mess up an already messed-up Excel workbook even further.
I'd be glad to handle CSV, too.
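
For the CSV case, something roughly comparable to "iconv -c" can be
sketched in Python. This is only an illustration under assumptions: the
target charset is again taken to be Latin-1, the CSV text is already
decoded into a string, and clean_csv_text is a hypothetical helper name.

```python
import csv
import io


def clean_csv_text(text, charset="latin-1"):
    """Drop characters that charset cannot represent, cell by cell."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        # errors="ignore" silently discards unencodable characters,
        # which is roughly what iconv -c does.
        cleaned = [
            cell.encode(charset, errors="ignore").decode(charset)
            for cell in row
        ]
        writer.writerow(cleaned)
    return out.getvalue()
```

Like iconv -c, this throws the offending characters away without
telling anyone, so it cleans up the consequences, not the cause.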

> BUT
> u should not have those characters at all
> if one is occured it most probably an error

Sure, but those files reach me through a chain of people who consider it
OK to convert data across numerous file formats, cut, edit, save as X,
send by mail ... and then they land on me and I am the one who has to
clean up.

> AND
> u should get rid of this error itself -- not of its consequences.
Like quitting the job and growing flowers instead?
I'll consider this. ;)
