Re: regexp_replace usage

From: Michael Fuhr <mike(at)fuhr(dot)org>
To: chester c young <chestercyoung(at)yahoo(dot)com>
Cc: sql pgsql <pgsql-sql(at)postgresql(dot)org>
Subject: Re: regexp_replace usage
Date: 2006-10-02 03:56:09
Message-ID: 20061002035609.GA42444@winnie.fuhr.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Fri, Sep 29, 2006 at 02:31:12PM -0700, chester c young wrote:
> column name in table bue has miscapitalized Mc names, eg, 'John Mcneil'
> instead of 'John McNeil'.
>
> (this should be easy but) how do you construct the update query?
>
> also, regexp_string( 'Mcneil', 'Mc(.*)', initcap('\\1') ) => 'neil'
> _not_ Neil' - is this correct?

I assume you mean regexp_replace() as indicated in the subject.
You're calling initcap() on the literal value '\\1'; the result,
which is the same string, is then passed to regexp_replace(), so
in effect you're calling

regexp_replace('Mcneil', 'Mc(.*)', '\\1')

Offhand I can't think of a way to do what you want with regexp_replace()
but you could use PL/Perl. Something like this should work:

CREATE FUNCTION mcfix(text) RETURNS text AS $$
$_[0] =~ s/\bMc([a-z])/Mc\u$1/g;
return $_[0];
$$ LANGUAGE plperl IMMUTABLE STRICT;

SELECT mcfix('John Mcneil');
mcfix
-------------
John McNeil
(1 row)

You could use

$_[0] =~ s/\b(Ma?c)([a-z])/$1\u$2/g;

to change Mcdonald and Macdonald into McDonald and MacDonald,
respectively. However, since both Macdonald and MacDonald are used,
determining which is correct would be impossible.

--
Michael Fuhr

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message chester c young 2006-10-02 15:15:03 Re: regexp_replace usage
Previous Message Ragnar 2006-09-30 22:29:53 Re: tree-structured query