From: | "Markhof, Ingolf" <ingolf(dot)markhof(at)de(dot)verizon(dot)com> |
---|---|
To: | Francisco Olarte <folarte(at)peoplecall(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Postgres General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: [E] Re: Regexp_replace bug / does not terminate on long strings |
Date: | 2021-08-23 14:29:23 |
Message-ID: | CALZg0g5VBbzjY3G=TTFM4E8Yd0-vEAQftkQ=8PGkdbwE-TbUHA@mail.gmail.com |
Lists: | pgsql-general |
You are right. I found the same behaviour when using, e.g., the UNIX sed
command.
Ingolf
On Mon, Aug 23, 2021 at 4:24 PM Francisco Olarte <folarte(at)peoplecall(dot)com>
wrote:
> Ingolf:
>
> On Mon, Aug 23, 2021 at 2:39 PM Markhof, Ingolf
> <ingolf(dot)markhof(at)de(dot)verizon(dot)com> wrote:
> > Yes, when I use (\1)? instead of (\1)+, the expression is evaluated
> > quickly, but it doesn't return what I want. Once a word is written, it is
> > not subject to matching again, i.e.
> > select regexp_replace( --> remove double entries
> > 'one,one,one,two,two,three,three',
> > '([^,]+)(,\1)?($|,)',
> > '\1\3',
> > 'g'
> > ) as res;
> >
> ...
> > Honestly, this behaviour seems incorrect to me. Once the system
> > replaces the first two 'one,one,' by a single 'one,', I'd expect it to match
> > this replaced 'one,' with the next 'one,' following, replacing these
> > two by another, single 'one,', again...
>
> I think your expectation is misguided. All the regexp engines I've
> used do it this way: when asked to match "g"lobally, they do
> non-overlapping matches; they do not substitute and recurse with the
> modified string.
>
> Also, your way opens the door to run-away or infinite loops
> ( rr('a','a','aa','g') or rr('a','a','a','g'), not to speak of
> rr('x','','','g') ). Even a misguided rr(str, '_+','_','g'), used
> sometimes to normalize space runs and similar things, can go into a
> loop.
>
> Francisco Olarte.
>
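For illustration, the non-overlapping behaviour discussed above can be
reproduced outside PostgreSQL. The following is only a minimal sketch: it
assumes Python's re.sub as a stand-in for regexp_replace with the 'g' flag,
and the helper replace_until_stable and its max_passes parameter are
hypothetical names introduced here, not anything from the thread. It shows
that a single global pass leaves one duplicate behind, and that re-running
the substitution on its own output needs a cap to stay safe.

```python
import re

# Pattern and replacement from the quoted query (assumed to behave the same
# way under Python's re module as under PostgreSQL for this input).
DEDUP_PATTERN = r'([^,]+)(,\1)?($|,)'
DEDUP_REPL = r'\1\3'


def replace_until_stable(pattern, repl, text, max_passes=10):
    """Re-apply a global substitution until the text stops changing.

    Returns (text, stable). The max_passes cap is a guard against
    substitutions that never settle, such as replacing 'a' with 'aa'.
    """
    for _ in range(max_passes):
        new_text = re.sub(pattern, repl, text)
        if new_text == text:
            return text, True
        text = new_text
    return text, False


source = 'one,one,one,two,two,three,three'

# One global pass: matches do not overlap, so the third 'one' is matched on
# its own and survives.
single_pass = re.sub(DEDUP_PATTERN, DEDUP_REPL, source)
print(single_pass)  # one,one,two,three

# Repeating the pass on its own output removes the leftover duplicate.
stable_text, stable = replace_until_stable(DEDUP_PATTERN, DEDUP_REPL, source)
print(stable_text, stable)  # one,two,three True

# The runaway case: every pass doubles the string, so only the cap stops it.
grown_text, grown_stable = replace_until_stable('a', 'aa', 'a', max_passes=3)
print(len(grown_text), grown_stable)  # 8 False
```

The explicit cap is the design point here: an engine that silently recursed
on its own output would have no such bound.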
======================================================================
Verizon Deutschland GmbH - Sebrathweg 20, 44149 Dortmund, Germany - Amtsgericht Dortmund, HRB 14952 - Geschäftsführer: Detlef Eppig - Vorsitzender des Aufsichtsrats: Francesco de Maio