Re: Why this regexp matches?!

From: David Johnston <polobo(at)yahoo(dot)com>
To: "depesz(at)depesz(dot)com" <depesz(at)depesz(dot)com>
Cc: Szymon Guz <mabewlun(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Why this regexp matches?!
Date: 2012-02-04 16:28:50
Message-ID: 8887FCC5-2FC2-4609-B1FC-11EB81F01B86@yahoo.com
Lists: pgsql-general

On Feb 4, 2012, at 3:58, hubert depesz lubaczewski <depesz(at)depesz(dot)com> wrote:

> On Sat, Feb 04, 2012 at 09:54:34AM +0100, Szymon Guz wrote:
>> On 4 February 2012 09:46, hubert depesz lubaczewski <depesz(at)depesz(dot)com> wrote:
>>
>>> select 'depesz depeszx depesz' ~ E'^(.*)( \\1)+$';
>>>
>>> what's worse:
>>> $ select regexp_replace( 'depesz depeszx depesz', E'^(.*)( \\1)+$', E'\\1' );
>>> regexp_replace
>>> ────────────────
>>> depesz
>>> (1 row)
>>>
>>> I know that Pg regexps are limited, but even grep's regexps match this
>>> correctly:
>>>
>>> =$ printf 'depesz depesz depesz\ndepesz depeszx depesz\n' | grep -E '^(.*)( \1)+$';
>>> depesz depesz depesz
>>>
>>> Best regards,
>>>
>>> depesz
>>>
>>>
>> Hi,
>> some time ago I hit the same problem, however the solution was a little bit
>> tricky. I didn't have time to investigate it, but this works:
>>
>> postgres(at)postgres:5840=# select regexp_replace( 'depesz depeszx depesz', E'^(.*)( \\\\1)+$', E'\\\\1' );
>> regexp_replace
>> -----------------------
>> depesz depeszx depesz
>> (1 row)
>
> not sure if I understand your point.
>
> This regexp was meant to find repeated substrings.
>
> Like this one does in perl:
>
> /^(.*)( \1)+$/
>
> We can see how it works with:
> =$ perl -e 'if ( shift =~ m/^(.*)( \1)+$/ ) { print "is repeat of [$1]\n" } else {print "is not repeated\n"}' 'depesz depesz depesz'
> is repeat of [depesz]
>
> =$ perl -e 'if ( shift =~ m/^(.*)( \1)+$/ ) { print "is repeat of [$1]\n" } else {print "is not repeated\n"}' 'depesz depeszx depesz'
> is not repeated
>
> reason why your regexp matches is also a mystery for me.
>
> Best regards,
>
> depesz
>
>

I don't know the answer (if there is one other than "it's a bug"), but as a workaround you can split the string on whitespace, group the resulting words, and check whether more than one distinct group results. If it does, the string is not a repetition of a single word.
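
A rough sketch of that split-and-group idea, written in Python rather than SQL just to show the logic. The function name repeated_word is made up for illustration, and the sketch assumes the repeated unit is a single word with no embedded whitespace, which is narrower than what the original regexp allows (its first group can itself contain spaces):

```python
from collections import Counter


def repeated_word(text):
    """Return the word if `text` is one word repeated two or more times,
    otherwise return None.

    Hypothetical helper: it only covers single-word repeats, unlike the
    regexp ^(.*)( \\1)+$ whose first group may contain spaces.
    """
    words = text.split()     # split on runs of whitespace
    groups = Counter(words)  # "group by" word, counting occurrences
    # Exactly one distinct group, seen more than once, means a repeat.
    if len(groups) == 1 and len(words) > 1:
        return words[0]
    return None


print(repeated_word("depesz depesz depesz"))   # depesz
print(repeated_word("depesz depeszx depesz"))  # None
```

The same check could presumably be done inside the database by splitting the string into rows and counting distinct values, but I have not tested that here.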

David J.
