Re: Regex Replace with 2 conditions

From: Francisco Olarte <folarte(at)peoplecall(dot)com>
To: Denisa Cirstescu <Denisa(dot)Cirstescu(at)tangoe(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Regex Replace with 2 conditions
Date: 2018-02-05 14:26:46
Message-ID: CA+bJJbz2QG8Wg037OBxr0TPg1+-7YkwK5ikpY7tAoETCXmOZKw@mail.gmail.com
Lists: pgsql-general

Denisa:

On Mon, Feb 5, 2018 at 2:34 PM, Denisa Cirstescu
<Denisa(dot)Cirstescu(at)tangoe(dot)com> wrote:
> I need an SQL function that eliminates all ASCII characters from 1-255 that
> are not A-Z, a-z, 0-9, and special characters % and _ so something like:

Are you aware that ASCII is a SEVEN-bit code?

And now, why don't you just write the negated condition, maybe
throwing in a null to avoid it? Do you have codes above 255 which you
do not need replacing?

I.e., something like

SELECT regexp_replace(p_string, E'[^A-Za-z0-9%_]', '', 'g');

This will also zap \0 and all chars >255 if you are using Unicode. If
this is not a problem, that's all there is to it.
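
To see what the negated class does to characters outside 7-bit ASCII,
here is a small sketch. It assumes a UTF8 database (so chr() takes a
code point), and the sample string is made up for illustration, not
taken from the original question:

```sql
-- Assumption: database encoding is UTF8, so chr(233) is U+00E9 (in the
-- 128-255 range) and chr(8364) is U+20AC (above 255).
SELECT regexp_replace(
         'a' || chr(233) || 'b' || chr(8364) || 'c_%',  -- hypothetical sample input
         '[^A-Za-z0-9%_]',                              -- anything NOT in the keep-list
         '',
         'g'                                            -- replace every match, not just the first
       ) AS cleaned;
-- cleaned: abc_%
```

Both non-ASCII characters are removed, because the class only lists
what to keep and everything else matches.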

If you are using Unicode and want to keep those characters, you could
add a null plus a character range from 256 to the largest code point to
the class, but I doubt this is useful. What is the character set of your
source data? (It can NOT be ASCII if you are worried about 128-255, but
is it a single-byte one, or is it Unicode or something wide?)

Also, it may perform a bit faster if you put a + after the character
class, so that a run of more than one char is replaced in one match.
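
A minimal sketch of the + variant, again with a made-up sample string.
The result is the same as without the +; only the number of matches
changes, and I have not measured whether it is actually faster:

```sql
-- Sketch only: the input literal is a hypothetical sample.
-- With +, the run '!! ' is consumed as one match instead of three.
SELECT regexp_replace('ab!! cd_%-42', '[^A-Za-z0-9%_]+', '', 'g') AS cleaned;
-- cleaned: abcd_%42
```

The pattern has no backslashes, so the E'' prefix is not needed here.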

Francisco Olarte.
