Re: Need magic to clean strings from unconvertible UTF8

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Andreas <maps(dot)on(at)gmx(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Need magic to clean strings from unconvertible UTF8
Date: 2010-11-08 11:51:10
Message-ID: m2aalk6nkh.fsf@2ndQuadrant.fr
Lists: pgsql-general

Andreas <maps(dot)on(at)gmx(dot)net> writes:
> I can find the problematic rows.
> How could I delete every char in a string that can't be converted to
> WIN1252?

http://tapoueh.org/articles/blog/_Getting_out_of_SQL_ASCII,_part_1.html
http://tapoueh.org/articles/blog/_Getting_out_of_SQL_ASCII,_part_2.html

That's using a hand-crafted translate expression; you could also use
the recode library, which does a pretty good job. Maybe the easiest way
here would be a plpythonu procedure using librecode?

http://packages.debian.org/sid/python-bibtex

Well, or the same in plperl… or, even easier, process the source files
before importing them?
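
For the pre-processing route, here is a minimal sketch in plain Python. It
does not use librecode; it relies on Python's built-in "cp1252" codec, which
I am assuming is a close enough stand-in for PostgreSQL's WIN1252. The
function name strip_unconvertible is made up for illustration. The idea is
to encode with the "ignore" error handler, which silently drops whatever
cannot be represented, then decode back:

```python
def strip_unconvertible(text, encoding="cp1252"):
    """Return text without the characters that `encoding` cannot represent.

    Assumption: Python's cp1252 codec matches the target WIN1252 closely
    enough for this cleanup; check the edge cases against your own data.
    """
    # errors="ignore" drops unencodable characters instead of raising
    # UnicodeEncodeError; decoding the result gives back a clean str.
    return text.encode(encoding, errors="ignore").decode(encoding)


if __name__ == "__main__":
    # e-acute and the euro sign exist in cp1252; the two CJK characters
    # in the middle do not and are removed.
    sample = "caf\u00e9 \u4e2d\u6587 \u20ac"
    print(repr(strip_unconvertible(sample)))
```

The same body should work more or less unchanged inside a plpythonu
function, or in a small script run over the source files before the import.
Note that this deletes characters rather than transliterating them, which is
what the original question asked for; recode or a translate expression is
the better fit if you want replacements instead.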

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
