Re: converting curly apostrophes to standard apostrophes

From: CSN <cool_screen_name90001(at)yahoo(dot)com>
To: Michael Fuhr <mike(at)fuhr(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: converting curly apostrophes to standard apostrophes
Date: 2005-08-16 01:43:50
Message-ID: 20050816014350.56816.qmail@web52906.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

--- Michael Fuhr <mike(at)fuhr(dot)org> wrote:

> On Mon, Aug 15, 2005 at 01:48:00PM -0700, CSN wrote:
> > db=>select ascii('’');
> > ascii
> > -------
> > 226
> >
> > db=>select id from news where body ilike '%’%';
> > (0 rows)
> >
> > db=>select id from news where body ilike '%' ||
> > chr(226) || '%';
> > db'>
> > db'>^C
> > db=>
>
> What's going on with the last query? The prompt
> change suggests
> that psql is confused with quoting, and the ^C looks
> like you hit
> Control-C to get the regular prompt back. Did you
> ever run this
> query? If it produced no rows then you could widen
> the search.

Hmm, I'm on another computer and I just tried that
last query and it worked without psql thinking it
needed another single quote. Appears the chr code is
146 not 226 (turns out chr(226) is â - why that
doesn't cause problems with iso-8859-1/utf-8 xml and
the single/double quotes and dashes do I don't know).
Anyhow, I ended up doing this:

update news set body=replace(body,chr(146),''''); --
left single quote
update news set body=replace(body,chr(145),''''); --
right single quote
update news set body=replace(body,chr(147),'"'); --
left double quote
update news set body=replace(body,chr(148),'"'); --
right double quote
update news set body=replace(body,chr(150),'-'); -- en
dash
update news set body=replace(body,chr(151),'-'); -- em
dash

and that seems to do the trick. Most places I found
online listed different chars for these codes, but
http://www.webopedia.com/quick_ref/asciicode.asp lists
them. Jeez, I'm so confused with encodings, charsets,
etc. now. :(

Thanks,
CSN

> Example:
>
> SELECT id FROM news WHERE body ~ '[\200-\377]';
>
> You could use the "string from pattern" variant of
> substring() to
> extract characters in a specific range. If you have
> PL/Perl then
> it would be trivial to extract all of and only the
> special characters
> along with their ASCII codes:
>
> CREATE FUNCTION special_chars(text) RETURNS text AS
> '
> return join(" ", map {"$_:" . ord($_)} $_[0] =~
> /[\200-\377]/g);
> ' LANGUAGE plperl IMMUTABLE STRICT;
>
> SELECT id, special_chars(body) FROM news WHERE body
> ~ '[\200-\377]';
>
> --
> Michael Fuhr
>

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Christopher Browne 2005-08-16 01:47:00 Re: do separate databases have any impact each other?
Previous Message Andrew Dunstan 2005-08-16 01:33:54 Re: Testing of MVCC