From: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Daniel Verite" <daniel(at)manitou-mail(dot)org>, "Robert Schreiber" <bobschreiber(at)charter(dot)net>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: PostgreSQL 9.3.5 substring(text from pattern for escape) bug |
Date: | 2019-05-12 03:19:23 |
Message-ID: | 87mujsgz14.fsf@news-spur.riddles.org.uk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>>> What appears to be happening is that the _Q? on the tail_ is being
>>> matched, but _the Q is also being included in the collected data.__
>> You seem to expect that % must be non-greedy and let the final Q?
>> match 1 Q instead of 0, but there doesn't appear to be anything in
>> the doc that supports this interpretation. In fact, it mentions that
>> "%" is comparable to ".*" in POSIX regular expressions, and the
>> latter _is_ greedy.
Tom> Right. You could get the behavior you want using a non-greedy
Tom> quantifier, but you'd have to use the POSIX regexp functions, not
Tom> substring().
I looked up the spec on this point. As far as I can see, we're not
following it, but neither does the spec do what the OP wanted; in fact
the result should have included the _leading_ Q as well as the trailing
one.
The relevant part of SQL2016 seems to be this:
6.32 <string value function>
General Rules
6) If <regular expression substring function> is specified, then:
[...rules that split the pattern into 'R1#"R2#"R3' omitted...]
h) Otherwise, the result S of the <regular expression substring
function> is computed as follows:
i) Let S1 be the shortest initial substring of C such that there is
a substring S23 of C such that the value of the following <search
condition> is True:
'C' = 'S1' || 'S23' AND
'S1' SIMILAR TO 'R1' ESCAPE 'E' AND
'S23' SIMILAR TO '(R2R3)' ESCAPE 'E'
ii) Let S3 be the shortest final substring of S23 such that there is
a substring S2 of S23 such that the value of the following
<search condition> is True:
'S23' = 'S2' || 'S3' AND
'S2' SIMILAR TO 'R2' ESCAPE 'E' AND
'S3' SIMILAR TO 'R3' ESCAPE 'E'
iii) The result of the <regular expression substring function> is S2.
--
Andrew (irc:RhodiumToad)
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2019-05-12 03:50:43 | Re: PostgreSQL 9.3.5 substring(text from pattern for escape) bug |
Previous Message | Tom Lane | 2019-05-11 15:59:21 | Re: PostgreSQL 9.3.5 substring(text from pattern for escape) bug |