From: | Harald Fuchs <hari(dot)fuchs(at)gmail(dot)com> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: regexp_replace and UTF8 |
Date: | 2009-01-30 15:47:04 |
Message-ID: | pufxj0g5jb.fsf@srv.protecting.net |
Lists: | pgsql-sql |
In article <87ljstm4eq(dot)fsf(at)oxford(dot)xeocode(dot)com>,
Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> "Bart Degryse" <Bart(dot)Degryse(at)indicator(dot)be> writes:
>> Hi,
>> I have a text field with data like this: 'de pati&#235;nt niet'
>> Can anyone help me fix this or point me to a better approach.
>> By the way, changing the way data is put into the field is
>> unfortunately not an option.
> You could use a plperl function to use one of the many html parsing perl
> modules?
Yes, either plperl or some external HTML tool.
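As a rough sketch of the external-tool route, here is a small Python script. Python is only my choice for illustration; nothing in this thread prescribes it. It assumes the field holds HTML character references such as &#235;, which is what the "&, # and ;" steps quoted below suggest. html.unescape comes from the Python standard library; decode_entities and decode_numeric_only are hypothetical helper names.

```python
import html
import re


def decode_entities(text: str) -> str:
    """Replace every HTML character reference (&#235;, &euml;, &amp;, ...)
    with the character it stands for."""
    return html.unescape(text)


def decode_numeric_only(text: str) -> str:
    """Replace only decimal references of the form &#NNN; and leave
    named references such as &amp; untouched.

    Note: chr() raises ValueError for numbers above 0x10FFFF, so data
    with bogus references would need extra checking.
    """
    return re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), text)


if __name__ == "__main__":
    print(decode_entities("de pati&#235;nt niet"))
```

If the result is written back to the database, the client encoding has to match whatever the script emits, so this only makes sense with a real database encoding such as UTF8.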
>> Basically what I need to do (I think) is
>> - get rid of the &, # and ;
>> - convert the number to hex
>> - make a UTF8 from that (thus: \xEB)
>> - convert that to SQL_ASCII
You know that SQL_ASCII is a misnomer for "no encoding at all, and I
don't care"? I'd use UTF8 or (if you stay in Western Europe) Latin9.
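To illustrate why the target encoding matters: the number in a reference like &#235; is a Unicode code point, not a byte sequence. The Python sketch below (bytes_for is a made-up name) shows that code point 235 is 0xEB in hex, and that it becomes the single byte \xEB only in a Latin encoding; in UTF8 the same character takes two bytes.

```python
def bytes_for(code_point: int) -> dict:
    """Return the hex form of a code point and its byte representation
    in UTF-8 and in Latin9 (ISO 8859-15)."""
    ch = chr(code_point)
    return {
        "hex": hex(code_point),
        "utf8": ch.encode("utf-8"),
        "latin9": ch.encode("iso8859_15"),
    }


if __name__ == "__main__":
    info = bytes_for(235)
    # 235 is 0xeb; UTF-8 needs two bytes for it, Latin9 needs one.
    print(info["hex"], info["utf8"], info["latin9"])
```

So "\xEB" is the right byte only if the database is Latin1 or Latin9; in a UTF8 database the stored bytes would be \xC3\xAB.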