| From: | CSN <cool_screen_name90001(at)yahoo(dot)com> |
|---|---|
| To: | Michael Fuhr <mike(at)fuhr(dot)org> |
| Cc: | pgsql-general(at)postgresql(dot)org |
| Subject: | Re: converting curly apostrophes to standard apostrophes |
| Date: | 2005-08-16 01:43:50 |
| Message-ID: | 20050816014350.56816.qmail@web52906.mail.yahoo.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-general |
--- Michael Fuhr <mike(at)fuhr(dot)org> wrote:
> On Mon, Aug 15, 2005 at 01:48:00PM -0700, CSN wrote:
> > db=>select ascii('');
> > ascii
> > -------
> > 226
> >
> > db=>select id from news where body ilike '%%';
> > (0 rows)
> >
> > db=>select id from news where body ilike '%' ||
> > chr(226) || '%';
> > db'>
> > db'>^C
> > db=>
>
> What's going on with the last query? The prompt
> change suggests
> that psql is confused with quoting, and the ^C looks
> like you hit
> Control-C to get the regular prompt back. Did you
> ever run this
> query? If it produced no rows then you could widen
> the search.
Hmm, I'm on another computer and I just tried that
last query and it worked without psql thinking it
needed another single quote. Appears the chr code is
146 not 226 (turns out chr(226) is â - why that
doesn't cause problems with iso-8859-1/utf-8 xml and
the single/double quotes and dashes do I don't know).
Anyhow, I ended up doing this:
update news set body=replace(body,chr(146),''''); --
left single quote
update news set body=replace(body,chr(145),''''); --
right single quote
update news set body=replace(body,chr(147),'"'); --
left double quote
update news set body=replace(body,chr(148),'"'); --
right double quote
update news set body=replace(body,chr(150),'-'); -- en
dash
update news set body=replace(body,chr(151),'-'); -- em
dash
and that seems to do the trick. Most places I found
online listed different chars for these codes, but
http://www.webopedia.com/quick_ref/asciicode.asp lists
them. Jeez, I'm so confused with encodings, charsets,
etc. now. :(
Thanks,
CSN
> Example:
>
> SELECT id FROM news WHERE body ~ '[\200-\377]';
>
> You could use the "string from pattern" variant of
> substring() to
> extract characters in a specific range. If you have
> PL/Perl then
> it would be trivial to extract all of and only the
> special characters
> along with their ASCII codes:
>
> CREATE FUNCTION special_chars(text) RETURNS text AS
> '
> return join(" ", map {"$_:" . ord($_)} $_[0] =~
> /[\200-\377]/g);
> ' LANGUAGE plperl IMMUTABLE STRICT;
>
> SELECT id, special_chars(body) FROM news WHERE body
> ~ '[\200-\377]';
>
> --
> Michael Fuhr
>
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Christopher Browne | 2005-08-16 01:47:00 | Re: do separate databases have any impact each other? |
| Previous Message | Andrew Dunstan | 2005-08-16 01:33:54 | Re: Testing of MVCC |