Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present)

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: "christian_maechler(at)hotmail(dot)com" <christian_maechler(at)hotmail(dot)com>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present)
Date: 2015-08-04 05:50:06
Message-ID: CAKFQuwaXQvXx9OEDNXAA5eXdbP0CKBnWow+L0aoh1vH2J6LrtQ@mail.gmail.com
Lists: pgsql-bugs

On Monday, August 3, 2015, <christian_maechler(at)hotmail(dot)com> wrote:

> The following bug has been logged on the website:
>
> Bug reference: 13538
> Logged by: Chris Mächler
> Email address: christian_maechler(at)hotmail(dot)com
> PostgreSQL version: 9.3.0
> Operating system: ?
> Description:
>
> Here is an example to verify and reproduce the error (extract a number and
> the things before and after it with 3 groups):
>
>
> '(.*)([+-]?[0-9]*\.[0-9]+)(.*)'
>
> Using regexp_matches this will produce an undesirable result (only one
> digit in group 2), but everything behaves correctly, the third group
> matches until the end.
>
> '(.*?)([+-]?[0-9]*\.[0-9]+)(.*)'
>
> If we change the first group to non-greedy to fix this, then the bug
> appears: the third group becomes non-greedy too (it shouldn't!) and
> therefore it is always empty instead of matching until the end of the line.
> Also the first group is empty (should match from start!), it should find a
> match at start position, whether it is non-greedy or not and not look ahead
> if the non-greedy group can be reduced if starting to match at the next
> index. Both are wrong behaviors.
>
> (the workaround is anchoring, but the behavior of the regex is still wrong)
>
> link: http://sqlfiddle.com/#!15/f0f14/14
>
>

Reading the documentation, this seems to be working as intended.

http://www.postgresql.org/docs/9.3/static/functions-matching.html#POSIX-MATCHING-RULES

On what are you basing your concept of correctness? Specifically, what
language implementation do you consider "right"?

The Tcl regular expression implementation used by PostgreSQL differs in
some respects from those of Java and Perl, the two I am most familiar with.
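
For comparison, here is a minimal sketch of what a Perl-style engine does with
the two patterns from the report. It uses Python's re module as a stand-in for
the Perl/Java family, and it says nothing about PostgreSQL's output: the
documentation linked above describes rules under which greediness is
determined for a whole branch or RE, not only per quantifier. The sample
string and the helper name split_groups are assumptions made for
illustration and are not taken from the report.

```python
import re

# Patterns copied from the bug report.
GREEDY = r'(.*)([+-]?[0-9]*\.[0-9]+)(.*)'
LAZY_FIRST = r'(.*?)([+-]?[0-9]*\.[0-9]+)(.*)'

# Hypothetical sample input, not taken from the report.
SAMPLE = 'abc 12.345 xyz'


def split_groups(pattern, text):
    """Return the three captured groups, or None if there is no match."""
    m = re.match(pattern, text)
    return m.groups() if m else None


# Greedy first group: it swallows the leading digits of the number.
print(split_groups(GREEDY, SAMPLE))      # ('abc 12', '.345', ' xyz')

# Non-greedy first group: only that quantifier is lazy in a Perl-style
# engine, so the third group still runs to the end of the string.
print(split_groups(LAZY_FIRST, SAMPLE))  # ('abc ', '12.345', ' xyz')
```

In this family of engines each quantifier keeps its own greediness, which is
presumably the behavior the report expects.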

David J.
