Re: Regular expression question with Postgres

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mike Christensen <mike(at)kitchenpc(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Regular expression question with Postgres
Date: 2014-07-24 20:57:07
Message-ID: 18739.1406235427@sss.pgh.pa.us
Lists: pgsql-general

Mike Christensen <mike(at)kitchenpc(dot)com> writes:
> I'm curious why this query returns 0:
> SELECT 'AAA' ~ '^A{,4}$'

> Yet, this query returns 1:

> SELECT 'AAA' ~ '^A{0,4}$'

> Is this a bug with the regular expression engine?

Our regex documentation lists the following variants of bounds syntax:
{m}
{m,}
{m,n}
Nothing about {,n}. I rather imagine that the engine is deciding that
that's just literal text and not a bounds constraint ...

regression=# SELECT 'A{,4}' ~ '^A{,4}$';
 ?column?
----------
 t
(1 row)

... yup, apparently so.

A look at the POSIX standard says that it has the same idea of what
is a valid bounds constraint:

    When an ERE matching a single character or an ERE enclosed in
    parentheses is followed by an interval expression of the format
    "{m}", "{m,}", or "{m,n}", together with that interval expression
    it shall match what repeated consecutive occurrences of the ERE
    would match. The values of m and n are decimal integers in the
    range 0 <= m <= n <= {RE_DUP_MAX}, where m specifies the exact or
    minimum number of occurrences and n specifies the maximum number
    of occurrences. The expression "{m}" matches exactly m occurrences
    of the preceding ERE, "{m,}" matches at least m occurrences, and
    "{m,n}" matches any number of occurrences between m and n,
    inclusive.
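
For illustration, here is a minimal sketch of how the documented {m,n} form
behaves next to the undocumented {,n} spelling. It assumes the intent of
'{,4}' was "zero to four repetitions". The test strings other than 'AAA' and
'A{,4}' are made up for the example, and the results in the comments are
what the behavior described above implies.

```sql
-- Assumption: '{,4}' was meant as "between zero and four repetitions".
-- Writing the lower bound explicitly uses the documented {m,n} form.
SELECT 'AAA' ~ '^A{0,4}$';     -- true: three A's fall within 0..4
SELECT 'AAAAA' ~ '^A{0,4}$';   -- false: five A's exceed the upper bound
SELECT '' ~ '^A{0,4}$';        -- true: zero occurrences are allowed

-- The {,4} spelling is not a bound, so it is matched as the literal
-- characters "{,4}" following a single "A".
SELECT 'A{,4}' ~ '^A{,4}$';    -- true, as in the session above
SELECT 'AAA' ~ '^A{,4}$';      -- false, as originally reported
```

Spelling out the zero keeps the pattern within the three bound forms that
both our documentation and POSIX define.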

regards, tom lane
